A Discord bot that can call LLMs through either Hugging Face or vLLM on Windows, combined with function calling and RAG.
Supported models:
- Qwen1.5-14B-Chat-GPTQ-Int8 (context window limited to 4096 tokens by 24 GB VRAM)
- Qwen1.5-14B-Chat-GPTQ-Int4
- Qwen1.5-7B-Chat-GPTQ-Int8
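When served through vLLM's OpenAI-compatible API, these models are addressed by name in a standard chat-completions payload. A minimal sketch, assuming a local vLLM server; the model ID and URL below are illustrative, not values taken from this repo:

```python
# Build an OpenAI-compatible /v1/chat/completions request body for a vLLM server.
# The model name and server URL are example assumptions.

def build_chat_payload(model: str, user_message: str, max_tokens: int = 512) -> dict:
    """Return a request body accepted by vLLM's OpenAI-compatible endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

payload = build_chat_payload("Qwen/Qwen1.5-7B-Chat-GPTQ-Int8", "Hello!")
# POST this as JSON to e.g. http://localhost:8000/v1/chat/completions
```

The same payload shape works for any of the quantized Qwen variants listed above; only the `model` string changes.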
Dependencies
After creating and activating a virtual environment, run either
./setup.sh
or
pip install -r requirements.txt
along with a few manual installs (see the files for details).
To make custom adjustments, edit the files above.
Environment Variables
Edit the contents of the file ".env_sample" and rename it to ".env".
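For illustration, a `.env` file for a bot like this typically holds the Discord token and the LLM endpoint. The variable names below are hypothetical; use the keys actually defined in ".env_sample":

```
# Hypothetical .env contents -- match the keys from .env_sample, not these.
DISCORD_BOT_TOKEN=your-token-here
VLLM_API_BASE=http://localhost:8000/v1
```

Keep ".env" out of version control, since it contains the bot token.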
If you are using the vLLM server rather than Hugging Face pipelines/models:
For Windows:
Since vLLM does not currently support Windows, it has to be installed through Docker:
- Install Docker Desktop.
- Build the Docker service (image) and create the container:
docker-compose create
- (Optional) Start the container using the GUI, a Discord bot command, or:
docker-compose start
For other platforms, see "docker-compose.yml" for reference.
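As a rough sketch of what such a compose file can look like: the image tag, model ID, and port are assumptions, so treat the repo's actual "docker-compose.yml" as authoritative:

```yaml
# Hypothetical compose service for a vLLM OpenAI-compatible server.
# All values are illustrative; consult the repo's docker-compose.yml.
services:
  vllm:
    image: vllm/vllm-openai:latest
    command: --model Qwen/Qwen1.5-7B-Chat-GPTQ-Int8 --max-model-len 4096
    ports:
      - "8000:8000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

The GPU reservation block is what lets the container see the NVIDIA card that the quantized models need.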
Run the bot with:
python ./main.py
- My Web Extractor
- File Operator (My Storage)
- My Code Executor
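Tools such as those above are typically exposed to the model through function-calling schemas. A minimal sketch of how one tool might be described, where the tool name, parameters, and docstrings are illustrative rather than this repo's actual definitions:

```python
# Hypothetical OpenAI-style function-calling schema for a web-extractor tool.
# The name and parameter set are illustrative assumptions only.

web_extractor_tool = {
    "type": "function",
    "function": {
        "name": "extract_web_page",
        "description": "Fetch a URL and return its main text content for RAG.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "Page to fetch."},
            },
            "required": ["url"],
        },
    },
}

# The bot would pass a list of such schemas alongside the chat messages,
# then dispatch on whichever tool name the model asks to call.
```

Each of the three tools (web extractor, file operator, code executor) would get its own schema in that list.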