Loading chatglm-6b-int4-qe throws an error #35

Open
ystyle opened this issue Mar 26, 2023 · 2 comments
ystyle commented Mar 26, 2023

The directory layout is as follows:

lxy52@YSTYLE-PC MINGW64 /d/Code/Python/ChatGLM-6B (main)
$ tree -d -L 2
.
|-- ChatGLM-webui
|   |-- modules
|   |-- outputs
|   `-- scripts
|-- THUDM
|   |-- chatglm-6b
|   |-- chatglm-6b-int4
|   |-- chatglm-6b-int4-qe
|   `-- chatglm-6b-main
|-- examples
|-- limitations
|-- outputs
|   |-- markdown
|   `-- save
`-- resources

15 directories

Running python .\webui.py --model-path ..\THUDM\chatglm-6b-int4-qe\ from /d/Code/Python/ChatGLM-6B/ChatGLM-webui produces the error below; loading chatglm-6b-int4 fails the same way. Is the problem that the model path can't be given as a relative path (with ..\)? Or is it something else? A version from last week worked; it broke after I updated.

(ChatGLM) PS D:\Code\Python\ChatGLM-6B\ChatGLM-webui> python .\webui.py --model-path ..\THUDM\chatglm-6b-int4-qe\
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c -shared -o C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Kernels compiled : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Cannot load cpu kernel, don't use quantized model on cpu.
Using quantization cache
Applying quantization to glm layers
GPU memory: 8.59 GB
No compiled kernel found.
Compiling kernels : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.c -shared -o C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Kernels compiled : C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so
Traceback (most recent call last):
  File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\webui.py", line 52, in <module>
    init()
  File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\webui.py", line 24, in init
    load_model()
  File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\modules\model.py", line 61, in load_model
    prepare_model()
  File "D:\Code\Python\ChatGLM-6B\ChatGLM-webui\modules\model.py", line 42, in prepare_model
    model = model.half().quantize(4).cuda()
  File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\modeling_chatglm.py", line 1281, in quantize
    load_cpu_kernel(**kwargs)
  File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\quantization.py", line 390, in load_cpu_kernel
    cpu_kernels = CPUKernel(**kwargs)
  File "C:\Users\lxy52/.cache\huggingface\modules\transformers_modules\quantization.py", line 157, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\Application\Miniconda3\envs\ChatGLM\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "D:\Application\Miniconda3\envs\ChatGLM\lib\ctypes\__init__.py", in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.
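
For reference, the "full path with constructor syntax" hint in the final error means constructing ctypes.CDLL with an absolute path instead of going through ctypes.cdll.LoadLibrary, which resolves bare or relative names less reliably on Windows. A minimal sketch of the difference, reusing the kernel path printed in the compile log above:

import ctypes
import os

# Path to the compiled kernel, copied from the "Kernels compiled" line above.
kernel_file = r"C:\Users\lxy52\.cache\huggingface\modules\transformers_modules\quantization_kernels_parallel.so"

# What quantization.py does, per the traceback; on Windows this raises
# FileNotFoundError when the path or one of the library's DLL dependencies
# (e.g. the MinGW OpenMP runtime pulled in by gcc -fopenmp) cannot be found:
#   kernels = ctypes.cdll.LoadLibrary(kernel_file)

# The constructor syntax the error message suggests, with a resolved full path:
kernels = ctypes.CDLL(os.path.abspath(kernel_file))

Since the log shows the .so compiled successfully, the failure here is more likely a missing dependency than a missing file.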
Akegarasu (Owner) commented

try --precision fp16
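
For context: per prepare_model() in the traceback, the int4 precision path calls model.half().quantize(4).cuda(), which is what needs the CPU kernel; --precision fp16 should load the weights in half precision and skip that quantize step entirely (this reading of the flag is an inference from the traceback, not confirmed behavior). A hypothetical invocation against the non-quantized checkpoint from the directory tree above:

python .\webui.py --model-path ..\THUDM\chatglm-6b --precision fp16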


yknBugs commented Apr 8, 2023

This doesn't look like a WebUI problem; it's an issue in ChatGLM itself. See this issue in the upstream repository: THUDM/ChatGLM-6B#162

Also, does loading the pre-quantized model directly actually save VRAM for you? When I load the pre-quantized model, the initial VRAM usage does drop substantially, but each round of conversation then consumes far more VRAM, so overall I get fewer rounds of conversation than if I load the original model and quantize it at load time.
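
For readers weighing the two approaches described here, a minimal sketch of both, assuming the standard ChatGLM loading pattern (AutoModel.from_pretrained with trust_remote_code=True; the hub IDs could equally be the local paths from the directory tree above) and the quantize(4) call visible in the traceback:

from transformers import AutoModel

# Option A: load the already-quantized int4 checkpoint directly
# (smaller initial VRAM footprint).
model_a = AutoModel.from_pretrained(
    "THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()

# Option B: load the original checkpoint and quantize it at load time,
# as ChatGLM-webui's prepare_model() does in the traceback above.
model_b = AutoModel.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True).half().quantize(4).cuda()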
