Does calling the qwen-max model still consume GPU memory? #499

Open
liutong0127 opened this issue Jun 20, 2024 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

@liutong0127

Description

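I have a locally deployed agent that calls the qwen-max model. Does every new conversation consume a corresponding amount of GPU memory? And once a few more conversations have filled the GPU memory, do the rest simply have to wait?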

Link

No response

@zzhangpurdue
Collaborator

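qwen-max currently goes through the DashScope API, so it should not occupy any local GPU memory. If you are instead using a small Qwen model launched with vLLM, then the current pipeline does have a situation where opening multiple conversations occupies multiple GPU memory allocations. We had not run into this case before. We will treat fixing it as high priority. Thanks for the feedback.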
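To tell which of the two situations applies, one option is to compare GPU memory usage before and after opening a new conversation. The following is only a rough sketch under stated assumptions: it assumes an NVIDIA GPU with the `nvidia-smi` CLI on the PATH, and the helper names (`parse_memory_used`, `gpu_memory_used_mib`, `memory_delta`) are made up for illustration and are not part of this project.

```python
import subprocess
from typing import List


def parse_memory_used(smi_output: str) -> List[int]:
    """Parse nvidia-smi CSV output (one MiB value per line) into integers."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def gpu_memory_used_mib() -> List[int]:
    """Return used memory in MiB for each visible GPU (assumes nvidia-smi is installed)."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_used(output)


def memory_delta(before: List[int], after: List[int]) -> List[int]:
    """Per-GPU change in used memory (MiB) between two snapshots."""
    return [a - b for b, a in zip(before, after)]
```

Taking one snapshot with `gpu_memory_used_mib()`, opening a new conversation, taking a second snapshot, and passing both to `memory_delta` should show a delta near zero when the model is served remotely through an API, and a clear increase if a locally hosted model is allocating memory per conversation.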

@zzhangpurdue
Collaborator

Did you set up the locally deployed agent by following `sh scripts/run_assistant_server.sh`?

@zzhangpurdue zzhangpurdue self-assigned this Jun 21, 2024
@zzhangpurdue zzhangpurdue added the bug Something isn't working label Jun 21, 2024
2 participants