📚 The doc issue
LMDeploy, a flexible and high-performance inference and serving framework tailored for large language models, now supports DeepSeek-V3. It offers both offline pipeline processing and online deployment capabilities, seamlessly integrating with PyTorch-based workflows.
Installation
git clone -b support-dsv3 https://github.com/InternLM/lmdeploy.git
cd lmdeploy
pip install -e .
Offline Inference Pipeline
from lmdeploy import pipeline, PytorchEngineConfig

if __name__ == "__main__":
    pipe = pipeline("deepseek-ai/DeepSeek-V3-FP8",
                    backend_config=PytorchEngineConfig(tp=8))
    messages_list = [
        [{"role": "user", "content": "Who are you?"}],
        [{"role": "user", "content": "Translate the following content into Chinese directly: DeepSeek-V3 adopts innovative architectures to guarantee economical training and efficient inference."}],
        [{"role": "user", "content": "Write a piece of quicksort code in C++."}],
    ]
    output = pipe(messages_list)
    print(output)
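To extract just the generated text, the batch output can be unpacked item by item. A minimal sketch, assuming the pipeline returns one response object per input conversation with a text attribute:

# Sketch: assumes each element of `output` exposes a `.text` attribute
# holding the generated reply for the corresponding conversation.
for i, resp in enumerate(output):
    print(f"--- response {i} ---")
    print(resp.text)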
Online Serving
# run
lmdeploy serve api_server deepseek-ai/DeepSeek-V3-FP8 --tp 8 --backend pytorch
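Loading a model of this size can take a while, so it may help to poll the OpenAI-compatible v1/models endpoint before sending requests. A minimal readiness-check sketch using only the Python standard library (port 23333 is the api_server default assumed throughout this example):

# Readiness-check sketch: retries until the server answers on /v1/models.
import json
import time
import urllib.request

for _ in range(60):
    try:
        with urllib.request.urlopen("http://0.0.0.0:23333/v1/models", timeout=5) as r:
            print(json.loads(r.read())["data"][0]["id"])  # served model id
            break
    except OSError:
        time.sleep(10)  # server still starting up; retry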
To access the service, you can use the official OpenAI Python package (pip install openai). Below is an example demonstrating how to use the v1/chat/completions endpoint:
from openai import OpenAI

client = OpenAI(
    api_key='YOUR_API_KEY',
    base_url="http://0.0.0.0:23333/v1"
)
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=[
        {"role": "user", "content": "Write a piece of quicksort code in C++."}
    ],
    temperature=0.8,
    top_p=0.8
)
print(response)
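The endpoint can also be consumed token by token through the standard streaming interface of the OpenAI client; a sketch reusing the client and model_name from above:

# Streaming sketch: with stream=True the client yields incremental
# chunks, each carrying a delta with newly generated text.
stream = client.chat.completions.create(
    model=model_name,
    messages=[
        {"role": "user", "content": "Write a piece of quicksort code in C++."}
    ],
    temperature=0.8,
    top_p=0.8,
    stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()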
For more information, please refer to the following link: https://github.com/InternLM/lmdeploy/tree/support-dsv3
Suggest a potential alternative/fix
No response