✨ JupyterLab Image with GLIBC V2.32 - To Enable to Run Large Language Models Locally #3041

foufoulides · 2024-01-19T15:22:21Z

Describe the feature request.

I am currently in the Alpha Phase (six weeks in total, with four weeks remaining) of a project with Digital and HMPPS, where I am using LLMs to help prison staff make prisoner Casenotes more discoverable.
I am currently working with dummy data that I created using the OpenAI API, but we need to test the effectiveness of my work on a small number of real Casenotes, which are much messier than the dummy data. To eliminate any privacy concerns, I need to run LLMs locally, with the model files stored in an S3 bucket. These models are implemented in C and need GNU C Library (GLIBC) version 2.32 to load. Could I please have a JupyterLab image that has GLIBC 2.32 and, if possible, 16GB of RAM, as some of the larger LLMs require that to run?
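For anyone wanting to confirm which GLIBC version an image actually ships, here is a minimal sketch using only the Python standard library. The 2.32 threshold comes from the error described above; the helper names `parse_version` and `glibc_is_sufficient` are made up for illustration, and `platform.libc_ver()` may return empty strings on systems that do not use glibc.

```python
import platform

# Minimum version quoted in the error message described in this issue.
REQUIRED = (2, 32)


def parse_version(text):
    """Turn a version string such as '2.31' into a tuple of ints, e.g. (2, 31)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_sufficient(found, required=REQUIRED):
    """Return True when the detected version is at least the required one."""
    return parse_version(found) >= required


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("Could not detect glibc on this system")
    elif glibc_is_sufficient(version):
        print(f"glibc {version} meets the {REQUIRED[0]}.{REQUIRED[1]} requirement")
    else:
        print(f"glibc {version} is older than {REQUIRED[0]}.{REQUIRED[1]}")
```

Comparing tuples of integers rather than raw strings avoids treating "2.4" as newer than "2.32".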

Describe the context.

I am currently in the Alpha Phase (six weeks in total, with four weeks remaining) of a project with Digital and HMPPS, where I am using LLMs to help prison staff make prisoner Casenotes more discoverable.

Value / Purpose

To assess the effectiveness of my work on real Casenotes (which are much messier) and the feasibility of the various approaches, which will inform our decisions for the Beta Phase of this project.

User Types

There are 16 staff members working on this project from Prisons Data Science, HMPPS, and Digital.

jacobwoffenden · 2024-02-01T14:56:34Z

Removing from sprint 9 as this cannot be progressed until we explore Analytical Platform tooling images again

foufoulides · 2024-02-01T15:18:13Z

To add some more specific information on the Local LLMs I am trying to use on the AP for some small scale testing purposes:
I am using LangChain, which has options to run LLMs locally using C implementations of popular open-source LLMs. From reading that documentation, the most applicable choice seemed to be GPT4All (which has nothing to do with OpenAI), which offers several such C implementations. If you scroll down to the Model Explorer section of the GPT4All page, you will see the options available. They are all in .gguf format, as GPT4All recently switched to .gguf only, though I also looked online for .bin versions of the same models. Both formats gave me the same GNU C Library (GLIBC) error, which asked for version 2.32. Some of the larger models require the machine to have 16GB of RAM. Ideally we want to use the smallest model that does our work satisfactorily, but we are not sure what that would be yet, so if possible it would be good to have the option to run the largest models as well.
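As a rough way to see whether a session has the memory the larger models ask for, the sketch below reads total physical memory through `os.sysconf`. It assumes a Linux host; the helper names and the 16 GiB default are illustrative, taken from the figure mentioned above. In a container, the figure reported here may be the host's memory rather than the limit applied to the session, so treat it as an upper bound.

```python
import os

GIB = 1024 ** 3  # bytes in one gibibyte


def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / GIB


def total_ram_bytes():
    """Total physical memory as reported by the operating system (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_ram(total_bytes, required_gib=16):
    """Return True when total_bytes is at least required_gib gibibytes."""
    return bytes_to_gib(total_bytes) >= required_gib


if __name__ == "__main__":
    total = total_ram_bytes()
    print(f"Reported RAM: {bytes_to_gib(total):.1f} GiB")
    print("Enough for a 16 GiB model:", has_enough_ram(total))
```

A machine sold as "16GB" often reports slightly less than 16 GiB, so a result just under the threshold does not necessarily mean the model will fail to load.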

jacobwoffenden · 2024-02-05T16:53:47Z

Hi @foufoulides!

We've just released a new version of Jupyter Lab to yourself and @lucypitches

You can see it on your Control Panel as v3.6.3, and it now has 2 CPUs and 16GB of RAM

Can you deploy it and let us know how you get on with testing?

Thanks,

Jacob, @julialawrence, @michaeljcollinsuk

foufoulides · 2024-02-06T09:46:25Z

Hi Jacob,

Thank you so much for this! I will check it this morning and get back to you.

Best,
Chris

jacobwoffenden · 2024-02-08T10:14:09Z

Feedback from @foufoulides is that the LLM is working 🎉 Closing as complete!

lucypitches · 2024-02-21T11:41:41Z

Hello, I've just deployed the new version. However, when I try to install packages into my venv (python3 -m pip install -r requirements.txt), I get the following error: /home/jovyan/prison-case-note-detectability/case-notes-detect-venv/bin/python3: No module named pip
This had been working fine on my previous version of JupyterLab.
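One possible way to diagnose this, sketched under the assumption that the venv's interpreter still works but was left without pip after the image change: check whether pip is importable and, if not, bootstrap it with the standard-library `ensurepip` module. The function names are illustrative, and `ensurepip` itself can be missing on some distributions, in which case recreating the venv with `python3 -m venv` may be the simpler route.

```python
import importlib.util
import subprocess
import sys


def has_module(name):
    """Return True when the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


def ensure_pip():
    """Bootstrap pip into the current interpreter's environment if it is missing.

    Returns True when pip is already present or ensurepip exits successfully.
    """
    if has_module("pip"):
        return True
    result = subprocess.run([sys.executable, "-m", "ensurepip", "--upgrade"])
    return result.returncode == 0
```

Run with the venv's own python3, calling `ensure_pip()` should make `python3 -m pip install -r requirements.txt` usable again if the bootstrap succeeds.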

foufoulides added the enhancement and feature-request labels Jan 19, 2024

moj-data-platform-robot added this to Analytical Platform Jan 19, 2024

Ed-Bajo moved this to 🧐 To Do in Analytical Platform Jan 30, 2024

Ed-Bajo added and removed the data-platform-apps-and-tools label Jan 30, 2024

github-actions bot mentioned this issue Feb 1, 2024

Monthly issue metrics report #3148

Closed

jacobwoffenden mentioned this issue Feb 1, 2024

⬆️ Upgrade JupyterLab to support Python >3.10 #2348

Closed

6 tasks

jacobwoffenden self-assigned this Feb 5, 2024

jacobwoffenden moved this from 🧐 To Do to 💨 In Progress in Analytical Platform Feb 5, 2024

jacobwoffenden assigned michaeljcollinsuk and julialawrence Feb 5, 2024

jacobwoffenden moved this from 💨 In Progress to 👀 In Review in Analytical Platform Feb 5, 2024

jacobwoffenden closed this as completed Feb 8, 2024

jacobwoffenden moved this from 👀 In Review to 🎉 Done in Analytical Platform Feb 8, 2024
