
How to use a locally deployed model #5200

Open
9963KK opened this issue Jan 21, 2025 · 5 comments

9963KK commented Jan 21, 2025

I don't quite understand how to configure a local LLM. I don't see any option for selecting a locally deployed model. Is there documentation on how to do this part?


dosubot bot commented Jan 21, 2025

Hello @9963KK! I'm here to help you with any questions or issues you have. I'm a bot that can assist with bug fixes, answer questions, and help you become a contributor. Let me know how I can assist you!

To configure and use a locally deployed LLM, follow these steps:

  1. Install Langchain-Chatchat
    Make sure Python 3.8 is installed in your environment, then install Langchain-Chatchat with:

    pip install langchain-chatchat -U
  2. Set up the Xinference environment

    • Initialize Conda:

      wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
      rm -rf ~/miniconda3/
      bash Miniconda3-latest-Linux-x86_64.sh
      conda config --remove channels https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
      conda config --remove channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
      conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
      conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    • Create and activate the chatchat environment:

      conda create -p ~/miniconda3/envs/chatchat python=3.8
      conda activate ~/miniconda3/envs/chatchat
      pip install langchain-chatchat -U
      pip install xinference_client faiss-gpu "unstructured[pdf]"
    • Create and activate the xinference environment:

      conda create -p ~/miniconda3/envs/xinference python=3.8
      conda activate ~/miniconda3/envs/xinference
      pip install xinference --force
      pip install tiktoken sentence-transformers
  3. Start the Xinference service

    conda activate ~/miniconda3/envs/xinference
    xinference-local
  4. Register and start the models

    • Edit the model registration scripts model_registrations.sh and model_registrations_emb.sh, adjusting the model paths to match your setup, then run them.
    • Start the models:
      bash ./model_registrations.sh
      bash ./model_registrations_emb.sh
      bash ./start_models.sh
      bash ./start_models_emb.sh
  5. Configure and start Langchain-Chatchat


    • Initialize the chatchat configuration:

      conda activate ~/miniconda3/envs/chatchat
      chatchat-config basic --verbose true
      chatchat-config basic --data ~/chatchat-data
    • Set the model platform:

      chatchat-config model --set_model_platforms "[{
          \"platform_name\": \"xinference\",
          \"platform_type\": \"xinference\",
          \"api_base_url\": \"http://127.0.0.1:9997/v1\",
          \"api_key\": \"EMPTY\",
          \"api_concurrencies\": 5,
          \"llm_models\": [
              \"autodl-tmp-glm-4-9b-chat\"
          ],
          \"embed_models\": [
              \"bge-large-zh-v1.5\"
          ],
          \"image_models\": [],
          \"reranking_models\": [],
          \"speech2text_models\": [],
          \"tts_models\": []
      }]"
    • Initialize the knowledge base:

      chatchat-kb -r
    • Start chatchat:

      chatchat -a

With these steps you can configure and use a locally deployed LLM. Be sure to run Langchain-Chatchat and the model inference framework in separate Python virtual environments to avoid dependency conflicts [1][2][3].
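The --set_model_platforms value in step 5 is a JSON string that has to survive shell quoting, which is easy to get wrong by hand. As a sketch (the model names mirror the example above and must match whatever you actually registered with Xinference), the argument can be generated with Python's json module:

```python
import json

# Platform entry mirroring the configuration from step 5. The model names
# ("autodl-tmp-glm-4-9b-chat", "bge-large-zh-v1.5") are examples and should
# match the models registered with your Xinference instance.
platforms = [{
    "platform_name": "xinference",
    "platform_type": "xinference",
    "api_base_url": "http://127.0.0.1:9997/v1",
    "api_key": "EMPTY",
    "api_concurrencies": 5,
    "llm_models": ["autodl-tmp-glm-4-9b-chat"],
    "embed_models": ["bge-large-zh-v1.5"],
    "image_models": [],
    "reranking_models": [],
    "speech2text_models": [],
    "tts_models": [],
}]

payload = json.dumps(platforms)
# Pass the result as a single shell-quoted argument:
print(f"chatchat-config model --set_model_platforms '{payload}'")
```

Because json.dumps produces valid JSON with double quotes only, wrapping the output in single quotes is enough for the shell.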


9963KK commented Jan 23, 2025

Thanks. One more question: do the model registration script files need to be created from scratch? And where are the API call settings configured?


dosubot bot commented Jan 23, 2025

Yes, the registration script file needs to be created yourself. Create a file named model_registrations.sh and add a curl command that registers the model. The API call settings are specified through the --data-raw parameter of that curl command, which contains the model's details and configuration. For example:

$ vim model_registrations.sh
# Add the following content; adjust the model path to your setup
curl 'http://127.0.0.1:9997/v1/model_registrations/LLM' \
  -H 'Accept: */*' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: token=no_auth' \
  --data-raw '{"model":"{\"version\":1,\"model_name\":\"autodl-tmp-glm-4-9b-chat\",\"model_description\":\"autodl-tmp-glm-4-9b-chat\",\"context_length\":2048,\"model_lang\":[\"en\",\"zh\"],\"model_ability\":[\"generate\",\"chat\"],\"model_family\":\"glm4-chat\",\"model_specs\":[{\"model_uri\":\"/root/autodl-tmp/glm-4-9b-chat\",\"model_size_in_billions\":9,\"model_format\":\"pytorch\",\"quantizations\":[\"none\"]}],\"prompt_style\":{\"style_name\":\"CHATGLM3\",\"system_prompt\":\"\",\"roles\":[\"user\",\"assistant\"],\"intra_message_sep\":\"\",\"inter_message_sep\":\"\",\"stop\":[\"<endoftext>\",\"<|user|>\",\"<|observation|>\"],\"stop_token_ids\":[151329,151336,151338]}}","persist":true}'

You can then register the model by running this script [1].
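Note that the --data-raw body is doubly encoded: the model definition is itself a JSON string nested inside the outer JSON object. Hand-escaping that is error-prone, so as a sketch (prompt_style omitted for brevity; the model_uri path is an example from above and must point at your local weights) the payload can be generated in Python:

```python
import json

# Model definition matching the registration payload above; adjust
# model_uri to the actual location of your local model weights.
model_def = {
    "version": 1,
    "model_name": "autodl-tmp-glm-4-9b-chat",
    "model_description": "autodl-tmp-glm-4-9b-chat",
    "context_length": 2048,
    "model_lang": ["en", "zh"],
    "model_ability": ["generate", "chat"],
    "model_family": "glm4-chat",
    "model_specs": [{
        "model_uri": "/root/autodl-tmp/glm-4-9b-chat",
        "model_size_in_billions": 9,
        "model_format": "pytorch",
        "quantizations": ["none"],
    }],
}

# The endpoint expects the model definition as a JSON *string* inside the
# outer JSON object, hence the nested json.dumps.
body = json.dumps({"model": json.dumps(model_def), "persist": True})
print(body)
```

The printed string can then be pasted into the curl command's --data-raw argument.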


9963KK commented Jan 23, 2025

How do I call an online API?


dosubot bot commented Jan 23, 2025

To call an online API for text embedding, you can use the logic in the following snippet:

from typing import List

# BaseResponse, EMBEDDING_MODEL, logger, and the helper functions below
# come from chatchat's server utilities.
def embed_texts(
        texts: List[str],
        embed_model: str = EMBEDDING_MODEL,
        to_query: bool = False,
) -> BaseResponse:
    try:
        if embed_model in list_online_embed_models():  # use the online API
            config = get_model_worker_config(embed_model)
            worker_class = config.get("worker_class")
            embed_model = config.get("embed_model")
            worker = worker_class()
            if worker_class.can_embedding():
                params = ApiEmbeddingsParams(texts=texts, to_query=to_query, embed_model=embed_model)
                resp = worker.do_embeddings(params)
                return BaseResponse(**resp)
        # Fall through when the model is not online or cannot embed
        return BaseResponse(code=500, msg=f"Model {embed_model} does not support embeddings")
    except Exception as e:
        logger.error(e)
        return BaseResponse(code=500, msg=f"Error during text embedding: {e}")

In this function, the selected embedding model is first checked against list_online_embed_models(). If it is an online model, the model's configuration is fetched, the corresponding worker class is instantiated, and its do_embeddings method is called to perform the embedding.
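The dispatch flow can be seen in isolation with stub classes. This is a minimal sketch, not chatchat's real code: FakeWorker, ONLINE_MODELS, and embed_online are hypothetical stand-ins for the worker class, the online-model registry, and the lookup-check-delegate logic above.

```python
from typing import Dict, List

class FakeWorker:
    """Hypothetical stand-in for an online-embedding worker class."""

    @classmethod
    def can_embedding(cls) -> bool:
        return True

    def do_embeddings(self, texts: List[str]) -> dict:
        # A real worker would call the provider's API; dummy vectors here.
        return {"code": 200, "data": [[0.0, 0.0, 0.0] for _ in texts]}

# Registry mapping model names to worker classes (stands in for
# list_online_embed_models / get_model_worker_config).
ONLINE_MODELS: Dict[str, type] = {"fake-embed": FakeWorker}

def embed_online(texts: List[str], model: str) -> dict:
    # Mirror embed_texts: look up the worker, verify it supports
    # embedding, then delegate to its do_embeddings method.
    worker_class = ONLINE_MODELS.get(model)
    if worker_class is None or not worker_class.can_embedding():
        return {"code": 404, "msg": f"{model} does not support online embedding"}
    return worker_class().do_embeddings(texts)
```

Registering a new provider then only means adding a worker class to the registry; the dispatch code stays unchanged.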

In addition, you can use the ApiRequest class to simplify API calls. It provides synchronous and asynchronous request methods such as get and post, and supports retries and streaming. A simple example:

from chatchat.webui_pages.utils import ApiRequest

api = ApiRequest()

response = api.post("/your/api/endpoint", json={"key": "value"})
if response:
    data = response.json()
    print(data)
else:
    print("Request failed")

In this way you can interact with the API more conveniently and handle any errors that occur [1][2].
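The retry behavior mentioned above can be sketched independently of chatchat. This is an illustrative helper, not the real ApiRequest implementation (which lives in chatchat.webui_pages.utils); with_retries and its parameters are hypothetical names:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def with_retries(call: Callable[[], T], retries: int = 3, delay: float = 0.0) -> T:
    """Invoke `call` up to `retries` times, re-raising the last error."""
    last_exc: Optional[Exception] = None
    for _ in range(retries):
        try:
            return call()
        except Exception as exc:  # a production client would narrow this
            last_exc = exc
            time.sleep(delay)
    assert last_exc is not None
    raise last_exc
```

A request that fails transiently (e.g. while the Xinference service is still starting up) then succeeds on a later attempt instead of failing outright.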
