Docker swarm with TorchServe workflow #3206
Comments
Hi @KD1994, this is not something that we have tried. We do have Kubernetes and KServe support. I would start with something simpler: serve a single model through Docker Swarm and check whether you still see these performance issues.
Thanks, @agunapal, for the quick response. That is exactly my plan of action right now: test it out further for everything possible. I just wanted to see if anyone had tried this and run into issues with it. I'll let you know if I still see this issue. Out of curiosity,
Yes, I have. If you are using AWS, set up a cluster using https://github.com/aws-samples/aws-do-eks and then use this to launch TorchServe with a BERT model: aws-solutions-library-samples/guidance-for-machine-learning-inference-on-aws#15
OK, thanks for the info. I will look into this.
@agunapal, thanks for your time. I was able to get this done, so I'll be closing this. It turned out the issue was with the NFS share configuration, not TorchServe or Docker Swarm.
That's awesome. Great to hear.
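Since the root cause turned out to be the NFS share configuration, the following is a minimal sketch of mounting an NFS export into a Swarm service via a local volume driver. The server address, export path, mount destination, and image tag below are placeholders for illustration, not values taken from this thread:

```shell
# Hypothetical example: back the model store with an NFS volume.
# 10.0.0.5 and /exports/model-store are assumed placeholder values.
docker service create \
  --name torchserve \
  --replicas 2 \
  --mount 'type=volume,source=model-store,destination=/home/model-server/model-store,volume-driver=local,volume-opt=type=nfs,volume-opt=device=:/exports/model-store,"volume-opt=o=addr=10.0.0.5,rw,nfsvers=4"' \
  pytorch/torchserve:latest
```

With this pattern each replica mounts the same export, so getting the NFS options (`addr`, `nfsvers`, read/write flags) right matters for consistent behavior across nodes.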
I want to scale the workflows through Docker Swarm. (I hope this is possible; if not, please tell me how it can be achieved. I know it is not yet supported directly through TorchServe, which is why I'm using Docker to scale the workflow.)
I have a few questions about using TorchServe as a Docker service in Swarm mode, since I encountered a few issues.
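For context, scaling a TorchServe container under Swarm is typically done with a stack file. The sketch below is a minimal, assumed configuration (image tag, ports, and replica count are illustrative, not the poster's actual config):

```yaml
# docker-compose.yml — hypothetical Swarm stack for TorchServe.
version: "3.8"
services:
  torchserve:
    image: pytorch/torchserve:latest
    ports:
      - "8080:8080"   # inference API
      - "8081:8081"   # management API
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
```

Deployed with `docker stack deploy -c docker-compose.yml ts`, Swarm's ingress routing mesh then load-balances inference requests across the replicas.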
Problem Statement:
When a request to m5 is being executed and a new request comes in, the current request appears to stop (at least that is how it looks in the logs, though no error is thrown) and the new one starts. Correct me if I'm wrong, but shouldn't the old request keep executing in the background?
My Docker Swarm Config:
My project config:
(Please ignore the timeout values; I set them this way because my inference request takes around 10 minutes, since it processes over 100 images in a batch.)
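For long-running batch inference like this, the relevant TorchServe knob is the response timeout in `config.properties`. A minimal sketch (the addresses are assumed defaults; the ~10-minute figure comes from the description above):

```properties
# Hypothetical config.properties fragment.
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
# Allow long-running batch inference (~10 min) before the worker
# response is considered timed out (value is in seconds).
default_response_timeout=700
```

Any reverse proxy or Swarm-level health check in front of TorchServe needs a matching timeout, otherwise the connection can be cut even though the worker is still running.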
Python Packages: