Troubleshooting
You need to find out what GPU is in your computer. If you have an Nvidia GPU, select A. If you have an Apple M Series laptop, select C. If you have any other kind of GPU (AMD and Intel GPUs are not supported) or no GPU at all, select D.
ChatLLaMA was built to run LLMs that you can chat with (think ChatGPT). This is the point where you have to download the LLM that you want to chat with. There are a number of models to choose from, like LLaMA, Vicuna, Wizard, MPT, etc. These models are usually found on Hugging Face. If you are new to this, I recommend starting with one of the smallest models you can find, like TheBloke/LLaMa-7B-GGML.
To use this model, select L and then type TheBloke/LLaMa-7B-GGML. This will download the model from Hugging Face's website. It will take some time to finish. Make sure you have enough space on your hard drive; these models are very big.
In order to create a Discord bot, you have to create a Discord bot application and invite it to the Discord server where you want it to be. After you create the application, you can obtain the token for it under Bot -> Token -> Reset Token.
Here are some reasons why the installation could fail:
- Not enough space on your computer for the install (it takes dozens of gigabytes, not even including the model)
  - If this is the case, you need to free up enough space
- text-generation-webui has made an update to their installer which has broken my installer
  - If this is the case, I should have a fix up soon
- Windows paths have a length limit and the install creates a lot of long paths which can run into this limit
  - If this is the case, try installing to a location with a shorter path (C:\ for example)
- The Linux distribution being installed to does not have the required dependencies for installation
  - If this is the case, there are issues open in text-generation-webui and in its one-click installer repository that may help
- There is another Anaconda, Miniconda, Python, or PyTorch install which is conflicting with the local ChatLLaMA Miniconda environment
  - If this is the case, you could try removing those installs and trying again. There are issues open in text-generation-webui and in its one-click installer repository that may help
If you can't fix the problem and it doesn't happen when you run text-generation-webui, open up an issue. Make sure to include your entire install log, operating system, model, and GPU.
There are some basics that should be known for running a model:
- Some models only work on Nvidia GPUs
  - If you see a model labeled GPTQ, it only works on Nvidia GPUs
- Some models only work on the CPU
  - If you see a model labeled GGML or using the q4, q5, etc. nomenclature, it can only be run on the CPU
- A model running on the GPU will use the GPU's VRAM
- A model running on the CPU will use the normal computer RAM
- Some models can be too big to fit into RAM/VRAM
  - Monitor your RAM/VRAM usage as you run the model; if you see the model crash soon after reaching this limit, this might be your issue
  - This is why you should start out with the smallest model you can find, like a 4-bit quantized 7b model
- The size of a model on disk is a useful indicator of the size of the model in memory
  - The fewer parameters a model has, the smaller it will be (3b < 6b < 7b < 13b < 30b < ....etc), and quantization of a model reduces the size even further (3bit < 4bit < 8bit); a rough size estimate is sketched below
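As a back-of-the-envelope estimate (my own rule of thumb, not an exact figure), the memory needed for the weights is roughly the parameter count times the bits per weight divided by 8, plus extra for the context and activations:

```
# Rough weight-memory estimate for a 4-bit quantized 7b model (assumption:
# ~7 billion parameters at 4 bits each, ignoring context/activation overhead)
python -c "print(7e9 * 4 / 8 / 1e9, 'GB')"   # prints 3.5 GB
```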
There are some workarounds which can be done to run a model that is larger than the amount of RAM you have, like using the --gpu-memory, --cpu-memory, --auto-devices, --disk, and --pre_layer flags with text-generation-webui, which can limit memory, split memory across devices, and offload memory to other locations. Usually this comes with a large runtime cost though.
Another common issue is that when trying to run a model on the CPU, the --cpu flag may need to be used.
To add flags to ChatLLaMA, open the downloaded script with a text editor and edit the CHATLLAMA_FLAGS line with your flags. For example, if I wanted to add the --cpu flag to the Windows script, it would look like set CHATLLAMA_FLAGS=--cpu.
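Concretely, the edited line in the Windows script could look like one of the lines below (the --cpu line is the example from above; the commented multi-flag line is only an illustration of combining the memory flags, so double-check the exact values against the text-generation-webui documentation):

```
:: Run the model entirely on the CPU
set CHATLLAMA_FLAGS=--cpu

:: Illustration only: cap GPU memory at roughly 6 GiB and spill the rest to CPU RAM/disk
:: set CHATLLAMA_FLAGS=--auto-devices --gpu-memory 6 --disk
```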
Open token.cfg with a text editor and simply replace the token you have in there with your new token.
The easiest way is to run the cmd_windows/linux/macos.bat/sh file under the ChatLLaMA/oobabooga folder and execute the commands cd text-generation-webui and python download-model.py to get back to the Hugging Face download menu that was seen on installation.
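On Linux, for example, that sequence looks roughly like this (substitute cmd_windows.bat or cmd_macos.sh on the other platforms; the commands after the first one are typed inside the shell that it opens):

```
cd ChatLLaMA/oobabooga
./cmd_linux.sh             # opens a shell inside the ChatLLaMA Miniconda environment
cd text-generation-webui
python download-model.py   # brings up the same Hugging Face download menu as the installer
```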
Models can also be added manually by downloading the model from wherever and moving the model's folder to ChatLLaMA/oobabooga/text-generation-webui/models.
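For example, moving a manually downloaded model folder into place on Linux might look like this (the folder name is just a placeholder for whatever model you actually downloaded):

```
# Placeholder folder name; use the real name of the model folder you downloaded
mv ~/Downloads/some-model-folder ChatLLaMA/oobabooga/text-generation-webui/models/
```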
The easiest way is to download the latest ChatLLaMA install script, install ChatLLaMA again, and move any files that you want to keep from your previous install, like your models in ChatLLaMA/oobabooga/text-generation-webui/models.
If you really don't want to do this, there is an alternative, manual method.
ChatLLaMA depends on text-generation-webui to function, but text-generation-webui is changing constantly. This presents a problem because in order for ChatLLaMA to work, both ChatLLaMA and text-generation-webui have to be in sync. The installer solves this issue by installing text-generation-webui pinned at the latest commit which ChatLLaMA supports. So to update ChatLLaMA manually, you have to copy over the latest bot.py and update text-generation-webui to the latest commit that ChatLLaMA supports.
Updating bot.py is pretty simple. Just replace the bot.py already in ChatLLaMA/oobabooga/text-generation-webui with the latest one. To update text-generation-webui (a command sketch follows this list):
- run the cmd script located in ChatLLaMA/oobabooga
- cd to the text-generation-webui directory
- run the command git checkout main
- run the update script located in ChatLLaMA/oobabooga
- get the latest supported commit from the source code of the installation script
- open a command line in the ChatLLaMA/oobabooga/text-generation-webui directory and run the command git checkout <commit here>
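Roughly, on Linux that procedure looks like the sketch below (the exact cmd and update script names, and the commit hash, depend on your install, so treat these as placeholders to adapt rather than commands to paste blindly):

```
cd ChatLLaMA/oobabooga
./cmd_linux.sh               # opens a shell in the ChatLLaMA Miniconda environment
cd text-generation-webui
git checkout main            # move off the pinned commit
exit                         # leave the environment shell
./update_linux.sh            # run the update script in ChatLLaMA/oobabooga
# Look up the latest supported commit in the ChatLLaMA install script, then:
cd text-generation-webui
git checkout <commit here>   # pin back to the commit ChatLLaMA supports
```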