fangyinc
Collaborator

Description

  1. Support llama.cpp server inference
  2. API Server support /v1/completions
  3. Support native generate function
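To exercise item 2 during review, a request to /v1/completions can be sketched as below. This is a minimal sketch, not code from this PR: it assumes the endpoint accepts an OpenAI-style completions body (`model`, `prompt`, `max_tokens`), and the base URL, port, and API key handling are hypothetical placeholders to adapt to your deployment.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the address your API server listens on.
BASE_URL = "http://127.0.0.1:8100"


def build_completion_request(prompt, model="qwen2.5-0.5b-instruct",
                             max_tokens=64, api_key=None, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/completions.

    The body field names follow the OpenAI completions format, which is an
    assumption about what this endpoint accepts.
    """
    body = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the decoded JSON response."""
    req = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes the payload easy to inspect before any server is running; call `complete("Hello")` once the server is up.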

How Has This Been Tested?

Install dependencies

pip install -e ".[llama_cpp_server]"

If you have an NVIDIA GPU and want faster inference, install with CUDA support enabled instead:

CMAKE_ARGS="-DGGML_CUDA=ON" pip install -e ".[llama_cpp_server]"

Download the model

Here we use the qwen2.5-0.5b-instruct model as an example. You can download it from Hugging Face.

wget "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_k_m.gguf?download=true" -O /tmp/qwen2.5-0.5b-instruct-q4_k_m.gguf
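A failed download can leave an HTML error page or an empty file at the target path. As a quick sanity check, the sketch below tests whether the file starts with the four-byte `GGUF` magic that GGUF files begin with. The helper name `looks_like_gguf` is made up for illustration and is not part of DB-GPT; the check says nothing about whether the rest of the file is intact.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files start with these four bytes


def looks_like_gguf(path):
    """Return True if the file exists and starts with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    model = "/tmp/qwen2.5-0.5b-instruct-q4_k_m.gguf"
    status = "ok" if looks_like_gguf(model) else "missing or not a GGUF file"
    print(model, status)
```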

Modify configuration file

In the .env configuration file, set the model type to llama_cpp_server so that DB-GPT uses llama.cpp server inference.

LLM_MODEL=qwen2.5-0.5b-instruct
LLM_MODEL_PATH=/tmp/qwen2.5-0.5b-instruct-q4_k_m.gguf
MODEL_TYPE=llama_cpp_server
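Before starting the server, the three settings above can be checked with a small standalone script. This is a sketch under stated assumptions, not how DB-GPT itself loads configuration: `parse_env` and `check_llama_cpp_server_config` are hypothetical helpers, and the parser only handles plain `KEY=VALUE` lines.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_llama_cpp_server_config(values):
    """Return a list of problems; an empty list means the basics look fine."""
    problems = []
    for key in ("LLM_MODEL", "LLM_MODEL_PATH", "MODEL_TYPE"):
        if not values.get(key):
            problems.append(f"{key} is not set")
    model_type = values.get("MODEL_TYPE")
    if model_type and model_type != "llama_cpp_server":
        problems.append(
            f"MODEL_TYPE is {model_type!r}, expected 'llama_cpp_server'"
        )
    path = values.get("LLM_MODEL_PATH")
    if path and not Path(path).is_file():
        problems.append(f"LLM_MODEL_PATH does not exist: {path}")
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.is_file() else ""
    for problem in check_llama_cpp_server_config(parse_env(text)):
        print("config problem:", problem)
```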

Start the DB-GPT server

python dbgpt/app/dbgpt_server.py

Snapshots:

Include snapshots for easier review.

Checklist:

  • My code follows the style guidelines of this project
  • I have already rebased the commits and made the commit messages conform to the project standard.
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • Any dependent changes have been merged and published in downstream modules

github-actions bot added the enhancement (New feature or request) and model (Module: model) labels on Dec 31, 2024
Collaborator

Aries-ckt left a comment

LGTM

Collaborator

csunny left a comment

LGTM

csunny merged commit 0b2af2e into eosphoros-ai:main on Jan 2, 2025
2 checks passed