✨ JupyterLab Image with GLIBC V2.32 - To Enable to Run Large Language Models Locally #3041

Closed
foufoulides opened this issue Jan 19, 2024 · 6 comments
Labels: data-platform-apps-and-tools (This issue is owned by Data Platform Apps and Tools), enhancement (enhancing an existing feature), feature-request

@foufoulides

Describe the feature request.

I am currently in the Alpha Phase (6 weeks in total, with 4 weeks remaining) of a project with Digital and HMPPS, in which I am using LLMs to help prison staff make prisoner Casenotes more discoverable.
I am currently working with dummy data that I created using the OpenAI API, but we need to test the effectiveness of my work on a small number of real Casenotes, which are much messier than the dummy data. To eliminate any privacy concerns, I need to run the LLMs locally, with the model files stored in an S3 bucket. These models are implemented in C and need GNU C Library (GLIBC) version 2.32 to load. Could I please have a JupyterLab image with GLIBC 2.32 and, if possible, 16GB of RAM, as some of the larger LLMs require that to run?
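
For anyone wanting to confirm which GLIBC version a given JupyterLab image provides, here is a minimal sketch using only the Python standard library. The helper names `parse_version` and `glibc_at_least` are made up for illustration, and `platform.libc_ver()` reports what the Python interpreter is linked against, which is assumed here to match what the model runtime will see.

```python
import platform


def parse_version(text):
    """Turn a version string like '2.32' into a tuple of ints, e.g. (2, 32)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(required="2.32"):
    """Return True if the interpreter reports glibc >= required, else False."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        # Not a glibc system (or the version could not be detected).
        return False
    return parse_version(version) >= parse_version(required)


if __name__ == "__main__":
    print("libc reported by Python:", platform.libc_ver())
    print("glibc >= 2.32:", glibc_at_least("2.32"))
```

Versions are compared as integer tuples rather than strings, so that 2.4 is correctly treated as older than 2.32.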

Describe the context.

I am currently in the Alpha Phase (6 weeks in total, with 4 weeks remaining) of a project with Digital and HMPPS, in which I am using LLMs to help prison staff make prisoner Casenotes more discoverable.

Value / Purpose

To assess the effectiveness of my work on real Casenotes (which are much messier) and the feasibility of the various approaches, which will inform our decisions for the Beta Phase of this project.

User Types

There are 16 staff members working on this project from Prisons Data Science, HMPPS, and Digital.

@foufoulides foufoulides added enhancement enhancing an existing feature feature-request labels Jan 19, 2024
@Ed-Bajo Ed-Bajo moved this to 🧐 To Do in Analytical Platform Jan 30, 2024
@Ed-Bajo Ed-Bajo added data-platform-apps-and-tools This issue is owned by Data Platform Apps and Tools and removed data-platform-apps-and-tools This issue is owned by Data Platform Apps and Tools labels Jan 30, 2024
@jacobwoffenden
Member

Removing from sprint 9 as this cannot be progressed until we explore Analytical Platform tooling images again

@foufoulides
Author

foufoulides commented Feb 1, 2024

To add some more specific information on the Local LLMs I am trying to use on the AP for some small scale testing purposes:
I am using LangChain, which has options to run LLMs locally using C implementations of popular open-source LLMs. From reading the linked documentation, the most applicable choice seemed to be GPT4All (which has nothing to do with OpenAI), which offers several such C implementations. If you scroll down to the Model Explorer section of the GPT4All page, you will see the options available. They are all in .gguf format, as GPT4All has recently switched to .gguf only, though I also looked online for .bin versions of the same models. Both formats gave me the same GNU C Library (GLIBC) error, asking for version 2.32. Some of the larger models require the machine to have 16GB of RAM. Ideally we want to use the smallest model that does our work satisfactorily, but we are not sure which that would be yet, so if possible it would be good to have the option to run the largest models as well.
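
As a rough illustration of the setup described above, the sketch below loads a .gguf file from local disk through LangChain's GPT4All wrapper. It assumes the `langchain-community` and `gpt4all` packages are installed and that the model file has already been copied from S3 to local storage; the function name `load_local_llm` is hypothetical and not taken from this thread.

```python
from pathlib import Path


def load_local_llm(model_file):
    """Load a .gguf model from local disk via LangChain's GPT4All wrapper."""
    path = Path(model_file)
    if not path.is_file():
        # Fail early with a clear message instead of a backend error.
        raise FileNotFoundError(f"No model file at {path}")
    # Imported lazily so the path check works without the packages installed.
    # Assumes langchain-community and gpt4all are available in the environment.
    from langchain_community.llms import GPT4All

    return GPT4All(model=str(path))
```

Once a model is loaded, something like `load_local_llm(path_to_model).invoke("Summarise this note in one sentence: ...")` should return the generated text as a string. The import of the GPT4All backend is the step where a GLIBC version mismatch would be expected to surface.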

@jacobwoffenden jacobwoffenden self-assigned this Feb 5, 2024
@jacobwoffenden jacobwoffenden moved this from 🧐 To Do to 💨 In Progress in Analytical Platform Feb 5, 2024
@jacobwoffenden
Member

Hi @foufoulides!

We've just released a new version of JupyterLab to you and @lucypitches.

You can see it on your Control Panel as v3.6.3, and it now has 2 CPUs and 16GB of RAM.

Can you deploy it and let us know how you get on with testing?

Thanks,

Jacob, @julialawrence, @michaeljcollinsuk

@jacobwoffenden jacobwoffenden moved this from 💨 In Progress to 👀 In Review in Analytical Platform Feb 5, 2024
@foufoulides
Author

Hi Jacob,

Thank you so much for this! I will check it this morning and get back to you.

Best,
Chris

@jacobwoffenden
Member

Feedback from @foufoulides is that the LLM is working 🎉 Closing as complete!

@jacobwoffenden jacobwoffenden moved this from 👀 In Review to 🎉 Done in Analytical Platform Feb 8, 2024
@lucypitches

Hello, I've just deployed the new version. However, when I try to install packages into my venv (python3 -m pip install -r requirements.txt) I get the following error: /home/jovyan/prison-case-note-detectability/case-notes-detect-venv/bin/python3: No module named pip
This had been working fine on my previous version of JupyterLab.
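
A minimal diagnostic sketch for this kind of error, using only the standard library: it checks whether an interpreter can run pip and, if not, offers a way to bootstrap pip with `ensurepip`. The cause of the error is not established in this thread, so whether `ensurepip` resolves it on this image is an assumption; recreating the venv may be needed instead. The helper names `has_pip` and `bootstrap_pip` are illustrative.

```python
import subprocess
import sys


def has_pip(python=sys.executable):
    """Return True if `python -m pip --version` succeeds for this interpreter."""
    try:
        result = subprocess.run(
            [python, "-m", "pip", "--version"],
            capture_output=True,
            text=True,
        )
    except OSError:
        # The interpreter path does not exist or cannot be executed.
        return False
    return result.returncode == 0


def bootstrap_pip(python=sys.executable):
    """Try to install pip into the interpreter's environment via ensurepip."""
    result = subprocess.run(
        [python, "-m", "ensurepip", "--upgrade"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("pip available for", sys.executable, ":", has_pip())
```

Passing the venv's own interpreter path (for example the `bin/python3` inside the venv directory) to `has_pip` checks that environment rather than the one running the script.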


6 participants