Feature Suggestion | Serverless Runpod #9
Comments
@PromptEngineer48
This worked for me. I did hit an error because the container is broken: I had to open a shell, run a pip install, and restart the container. Once TheBloke fixes the container, using the templates this way will be nice, and you won't have to restart the container when you select the openai plugin =) Later I'll try this with serverless; I'm still not sure whether that works. Note: if you set your public key in your RunPod profile settings, it is automatically passed to each template you launch, via Docker environment variables.
Hi @GeoffMillerAZ, this is amazing. I've been trying to make this work since yesterday, and I followed all of the steps provided in the video, but:
Would it be okay to ask for your guidance on how you made this work? Thank you so much.
Things have changed. This should work...
I see that RunPod has a serverless option. Rather than stopping and starting these instances, is it possible to run these models serverless? It looks like you can modify TheBloke's Dockerfile and configure a network volume so that the model is loaded from the workspace on the network volume.
I am trying to experiment with this, but I have been busy with work, and I don't really know what I'm doing here. I have a lot of questions about whether this is possible or practical. For example, does each request wait for the model to load into VRAM?
Serverless could be a cheap and easy way to have permanent setups for using AutoGen. It could be especially nice to have multiple serverless GPU endpoints for different AI models that specialize in specific tasks, without the cost or risk of leaving an instance running.
Also, can you set a custom API key for your RunPod endpoint, to make sure your endpoints don't get used by someone else?
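For context on the API key question, here is a minimal sketch of what an authenticated call to a serverless endpoint might look like. It does not show how to set a custom key; it assumes the endpoint is called with an account API key sent as a bearer token, and the URL shape and the `runsync` route are assumptions to check against the RunPod docs. `build_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_request(endpoint_id, api_key, prompt):
    """Build (but do not send) a POST request to a serverless endpoint."""
    # Assumed URL shape; verify against the RunPod documentation.
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            # Assumption: the API key is passed as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be `urllib.request.urlopen(build_request(...))`. A request without a valid key should be rejected if the assumption above is right, which is what keeps other people off the endpoint.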