
How to use a locally deployed model #5200

Open
9963KK opened this issue Jan 21, 2025 · 5 comments

9963KK commented Jan 21, 2025

I don't quite understand how to configure a local LLM. I don't see any option for selecting a locally deployed model. Is there documentation on how to do this part?


dosubot bot commented Jan 21, 2025

Hello @9963KK! I'm here to help you with any questions or issues you have. I'm a bot that can assist with bug fixes, answer questions, and help you become a contributor. Let me know how I can assist you!

To configure and use a locally deployed LLM, follow these steps:

  1. Install Langchain-Chatchat
    Make sure Python 3.8 is installed in your environment, then install Langchain-Chatchat with:

    pip install langchain-chatchat -U
  2. Set up the Xinference environment

    • Initialize Conda:

      wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
      rm -rf ~/miniconda3/
      bash Miniconda3-latest-Linux-x86_64.sh
      conda config --remove channels https://mirrors.ustc.edu.cn/anaconda/pkgs/main/
      conda config --remove channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
      conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
      conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    • Create and activate the chatchat environment:

      conda create -p ~/miniconda3/envs/chatchat python=3.8
      conda activate ~/miniconda3/envs/chatchat
      pip install langchain-chatchat -U
      pip install xinference_client faiss-gpu "unstructured[pdf]"
    • Create and activate the xinference environment:

      conda create -p ~/miniconda3/envs/xinference python=3.8
      conda activate ~/miniconda3/envs/xinference
      pip install xinference --force
      pip install tiktoken sentence-transformers
  3. Start the Xinference service

    conda activate ~/miniconda3/envs/xinference
    xinference-local
  4. Register and start the models

    • Edit the model registration scripts model_registrations.sh and model_registrations_emb.sh, adjusting the model paths to match your setup, then run them.
    • Start the models:
      bash ./model_registrations.sh
      bash ./model_registrations_emb.sh
      bash ./start_models.sh
      bash ./start_models_emb.sh
  5. Configure and start Langchain-Chatchat


    • Initialize the chatchat configuration:

      conda activate ~/miniconda3/envs/chatchat
      chatchat-config basic --verbose true
      chatchat-config basic --data ~/chatchat-data
    • Set the model platform:

      chatchat-config model --set_model_platforms "[{
          \"platform_name\": \"xinference\",
          \"platform_type\": \"xinference\",
          \"api_base_url\": \"http://127.0.0.1:9997/v1\",
          \"api_key\": \"EMPTY\",
          \"api_concurrencies\": 5,
          \"llm_models\": [
              \"autodl-tmp-glm-4-9b-chat\"
          ],
          \"embed_models\": [
              \"bge-large-zh-v1.5\"
          ],
          \"image_models\": [],
          \"reranking_models\": [],
          \"speech2text_models\": [],
          \"tts_models\": []
      }]"
    • Initialize the knowledge base:

      chatchat-kb -r
    • Start chatchat:

      chatchat -a

With these steps you can configure and use a locally deployed LLM. Be sure to run Langchain-Chatchat and the model inference framework in separate Python virtual environments to avoid dependency conflicts [1][2][3].
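The --set_model_platforms value in step 5 is a JSON string that has to survive shell quoting, which is easy to get wrong by hand. As a sketch (the model names mirror the example above and must match whatever you actually registered with Xinference), the argument can be generated with Python's json module:

```python
import json

# Platform entry mirroring the configuration from step 5. The model names
# ("autodl-tmp-glm-4-9b-chat", "bge-large-zh-v1.5") are examples and should
# match the models registered with your Xinference instance.
platforms = [{
    "platform_name": "xinference",
    "platform_type": "xinference",
    "api_base_url": "http://127.0.0.1:9997/v1",
    "api_key": "EMPTY",
    "api_concurrencies": 5,
    "llm_models": ["autodl-tmp-glm-4-9b-chat"],
    "embed_models": ["bge-large-zh-v1.5"],
    "image_models": [],
    "reranking_models": [],
    "speech2text_models": [],
    "tts_models": [],
}]

payload = json.dumps(platforms)
# Pass the result as a single shell-quoted argument:
print(f"chatchat-config model --set_model_platforms '{payload}'")
```

Because json.dumps produces valid JSON with double quotes only, wrapping the output in single quotes is enough for the shell.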


9963KK commented Jan 23, 2025

Thanks. One more question: do the model registration script files need to be created from scratch? And where are the API call settings configured?


dosubot bot commented Jan 23, 2025

Yes, the registration script file needs to be created yourself. Create a file named model_registrations.sh and add a curl command that registers the model. The API call settings are specified through the --data-raw parameter of that curl command, which contains the model's details and configuration. For example:

$ vim model_registrations.sh
# Add the following content; adjust the model path to your setup
curl 'http://127.0.0.1:9997/v1/model_registrations/LLM' \
  -H 'Accept: */*' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: token=no_auth' \
  --data-raw '{"model":"{\"version\":1,\"model_name\":\"autodl-tmp-glm-4-9b-chat\",\"model_description\":\"autodl-tmp-glm-4-9b-chat\",\"context_length\":2048,\"model_lang\":[\"en\",\"zh\"],\"model_ability\":[\"generate\",\"chat\"],\"model_family\":\"glm4-chat\",\"model_specs\":[{\"model_uri\":\"/root/autodl-tmp/glm-4-9b-chat\",\"model_size_in_billions\":9,\"model_format\":\"pytorch\",\"quantizations\":[\"none\"]}],\"prompt_style\":{\"style_name\":\"CHATGLM3\",\"system_prompt\":\"\",\"roles\":[\"user\",\"assistant\"],\"intra_message_sep\":\"\",\"inter_message_sep\":\"\",\"stop\":[\"<endoftext>\",\"<|user|>\",\"<|observation|>\"],\"stop_token_ids\":[151329,151336,151338]}}","persist":true}'

You can then register the model by running this script [1].
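Note that the --data-raw body is doubly encoded: the model definition is itself a JSON string nested inside the outer JSON object. Hand-escaping that is error-prone, so as a sketch (prompt_style omitted for brevity; the model_uri path is an example from above and must point at your local weights) the payload can be generated in Python:

```python
import json

# Model definition matching the registration payload above; adjust
# model_uri to the actual location of your local model weights.
model_def = {
    "version": 1,
    "model_name": "autodl-tmp-glm-4-9b-chat",
    "model_description": "autodl-tmp-glm-4-9b-chat",
    "context_length": 2048,
    "model_lang": ["en", "zh"],
    "model_ability": ["generate", "chat"],
    "model_family": "glm4-chat",
    "model_specs": [{
        "model_uri": "/root/autodl-tmp/glm-4-9b-chat",
        "model_size_in_billions": 9,
        "model_format": "pytorch",
        "quantizations": ["none"],
    }],
}

# The endpoint expects the model definition as a JSON *string* inside the
# outer JSON object, hence the nested json.dumps.
body = json.dumps({"model": json.dumps(model_def), "persist": True})
print(body)
```

The printed string can then be pasted into the curl command's --data-raw argument.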


9963KK commented Jan 23, 2025

How do I call an online API?


dosubot bot commented Jan 23, 2025

To call an online API for text embedding, you can use the logic in the following snippet:

from typing import List

# BaseResponse, EMBEDDING_MODEL, logger, and the helper functions below
# come from chatchat's server utilities.
def embed_texts(
        texts: List[str],
        embed_model: str = EMBEDDING_MODEL,
        to_query: bool = False,
) -> BaseResponse:
    try:
        if embed_model in list_online_embed_models():  # use the online API
            config = get_model_worker_config(embed_model)
            worker_class = config.get("worker_class")
            embed_model = config.get("embed_model")
            worker = worker_class()
            if worker_class.can_embedding():
                params = ApiEmbeddingsParams(texts=texts, to_query=to_query, embed_model=embed_model)
                resp = worker.do_embeddings(params)
                return BaseResponse(**resp)
        # Fall through when the model is not online or cannot embed
        return BaseResponse(code=500, msg=f"Model {embed_model} does not support embeddings")
    except Exception as e:
        logger.error(e)
        return BaseResponse(code=500, msg=f"Error during text embedding: {e}")

In this function, the selected embedding model is first checked against list_online_embed_models(). If it is an online model, the model's configuration is fetched, the corresponding worker class is instantiated, and its do_embeddings method is called to perform the embedding.
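The dispatch flow can be seen in isolation with stub classes. This is a minimal sketch, not chatchat's real code: FakeWorker, ONLINE_MODELS, and embed_online are hypothetical stand-ins for the worker class, the online-model registry, and the lookup-check-delegate logic above.

```python
from typing import Dict, List

class FakeWorker:
    """Hypothetical stand-in for an online-embedding worker class."""

    @classmethod
    def can_embedding(cls) -> bool:
        return True

    def do_embeddings(self, texts: List[str]) -> dict:
        # A real worker would call the provider's API; dummy vectors here.
        return {"code": 200, "data": [[0.0, 0.0, 0.0] for _ in texts]}

# Registry mapping model names to worker classes (stands in for
# list_online_embed_models / get_model_worker_config).
ONLINE_MODELS: Dict[str, type] = {"fake-embed": FakeWorker}

def embed_online(texts: List[str], model: str) -> dict:
    # Mirror embed_texts: look up the worker, verify it supports
    # embedding, then delegate to its do_embeddings method.
    worker_class = ONLINE_MODELS.get(model)
    if worker_class is None or not worker_class.can_embedding():
        return {"code": 404, "msg": f"{model} does not support online embedding"}
    return worker_class().do_embeddings(texts)
```

Registering a new provider then only means adding a worker class to the registry; the dispatch code stays unchanged.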

In addition, you can use the ApiRequest class to simplify API calls. It provides synchronous and asynchronous request methods such as get and post, and supports retries and streaming. A simple example:

from chatchat.webui_pages.utils import ApiRequest

api = ApiRequest()

response = api.post("/your/api/endpoint", json={"key": "value"})
if response:
    data = response.json()
    print(data)
else:
    print("Request failed")

In this way you can interact with the API more conveniently and handle any errors that occur [1][2].
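The retry behavior mentioned above can be sketched independently of chatchat. This is an illustrative helper, not the real ApiRequest implementation (which lives in chatchat.webui_pages.utils); with_retries and its parameters are hypothetical names:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def with_retries(call: Callable[[], T], retries: int = 3, delay: float = 0.0) -> T:
    """Invoke `call` up to `retries` times, re-raising the last error."""
    last_exc: Optional[Exception] = None
    for _ in range(retries):
        try:
            return call()
        except Exception as exc:  # a production client would narrow this
            last_exc = exc
            time.sleep(delay)
    assert last_exc is not None
    raise last_exc
```

A request that fails transiently (e.g. while the Xinference service is still starting up) then succeeds on a later attempt instead of failing outright.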
