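I deployed the agent locally and it calls the qwen-max model. Does every new conversation consume its own share of GPU memory? If I open several conversations and GPU memory fills up, do further requests simply have to wait?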
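qwen-max is currently called through the DashScope API, so it should not use any local GPU memory. If you are instead running a small Qwen model served with vLLM, then in the current pipeline opening multiple conversations can indeed take up multiple shares of GPU memory. We had not run into this case before, and we will treat the fix as high priority. Thanks for the feedback.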
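As a rough illustration of the multi-conversation case, here is a minimal sketch of one common way to keep memory use flat: load the local model once per process and let every conversation share it. This is not the project's actual code or the maintainers' planned fix. `LocalModel`, `get_model`, `Conversation`, and the model name `qwen-small` are all hypothetical stand-ins, and no real model is loaded.

```python
from functools import lru_cache


class LocalModel:
    """Hypothetical stand-in for a locally served model that holds GPU memory."""

    instances = 0  # counts how many copies have been "loaded"

    def __init__(self, name):
        self.name = name
        LocalModel.instances += 1

    def generate(self, prompt):
        return f"[{self.name}] reply to: {prompt}"


@lru_cache(maxsize=None)
def get_model(name):
    # Load each model at most once per process; later calls reuse the same object.
    return LocalModel(name)


class Conversation:
    """Holds only per-conversation state (the history), not its own model copy."""

    def __init__(self, model_name):
        self.model = get_model(model_name)
        self.history = []

    def ask(self, prompt):
        reply = self.model.generate(prompt)
        self.history.extend([prompt, reply])
        return reply


# Three conversations, one shared model instance.
conversations = [Conversation("qwen-small") for _ in range(3)]
replies = [c.ask("hello") for c in conversations]
```

With this layout, opening another conversation adds only a history list, while the expensive object is created a single time.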
Was the local agent deployed by following sh scripts/run_assistant_server.sh?
zzhangpurdue