📚 Question

In the multi-GPU parallel documentation, the example DISTRIBUTE_CONFIG_FILE for multi-node inference is given as follows:
{ "llama13B_2A10_PCIE_1_inference_part0": { "name": "llama13B_2A10_PCIE_1_inference_part0", "ip": "11.161.48.116", "port": 10000 }, "llama13B_2A10_PCIE_1_inference_part1": { "name": "llama13B_2A10_PCIE_1_inference_part1", "ip": "11.161.48.116", "port": 20000 } }
The model is split and deployed across two nodes, yet the ip field is identical in both entries of this example, which is misleading. In actual testing, for a multi-node deployment the ip field should be set to each node's own IP address.
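For illustration only, a corrected version of the example might look like the following, where 192.168.0.1 and 192.168.0.2 are hypothetical addresses standing in for the actual IP of each node:

```json
{
  "llama13B_2A10_PCIE_1_inference_part0": {
    "name": "llama13B_2A10_PCIE_1_inference_part0",
    "ip": "192.168.0.1",
    "port": 10000
  },
  "llama13B_2A10_PCIE_1_inference_part1": {
    "name": "llama13B_2A10_PCIE_1_inference_part1",
    "ip": "192.168.0.2",
    "port": 20000
  }
}
```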
P.S. There also seems to be no documentation describing the environment variables that must be set when launching a multi-node deployment.