Skip to content

Commit c6815d8

Browse files
增加了xinference对MiniCPM-Llama3-V 2.5的推理支持和demo
1 parent ef7cfa8 commit c6815d8

File tree

1 file changed

+56
-0
lines changed

1 file changed

+56
-0
lines changed

docs/xinference_infer.md

+56
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
## about xinference
2+
xinference is a unified inference platform that provides a unified interface for different inference engines. It supports LLM, text generation, image generation, and more.but it's not bigger than swift too much.
3+
4+
## xinference install
5+
```shell
6+
pip install "xinference[all]"
7+
```
8+
9+
## quick start
10+
1. start xinference
11+
```shell
12+
xinference
13+
```
14+
2. start the web ui.
15+
3. Search for "MiniCPM-Llama3-V-2_5" in the search box.
16+
[alt text](../assets/xinferenc_demo_image/xinference_search_box.png)
17+
4. find and click the MiniCPM-Llama3-V-2_5 button.
18+
5. follow the config and launch the model.
19+
```plaintext
20+
Model engine : Transformers
21+
model format : pytorch
22+
Model size : 8
23+
quantization : none
24+
N-GPU : auto
25+
Replica : 1
26+
```
27+
6. after first click the launch button,xinference will download the model from huggingface. we should click the webui button.
28+
![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
29+
7. upload the image and chatting with the MiniCPM-Llama3-V-2_5
30+
31+
## local MiniCPM-Llama3-V-2_5 launch
32+
1. start xinference
33+
```shell
34+
xinference
35+
```
36+
2. start the web ui.
37+
3. To register a new model, follow these steps: the settings highlighted in red are fixed and cannot be changed, whereas others are customizable according to your needs. Complete the process by clicking the 'Register Model' button.
38+
![alt text](../assets/xinferenc_demo_image/xinference_register_model1.png)
39+
![alt text](../assets/xinferenc_demo_image/xinference_register_model2.png)
40+
4. After completing the model registration, proceed to 'Custom Models' and locate the model you just registered.
41+
5. follow the config and launch the model.
42+
```plaintext
43+
Model engine : Transformers
44+
model format : pytorch
45+
Model size : 8
46+
quantization : none
47+
N-GPU : auto
48+
Replica : 1
49+
```
50+
6. after first click the launch button,xinference will download the model from huggingface. we should click the chat button.
51+
![alt text](../assets/xinferenc_demo_image/xinference_webui_button.png)
52+
7. upload the image and chatting with the MiniCPM-Llama3-V-2_5
53+
54+
## FAQ
55+
1. Why can't the sixth step open the WebUI?
56+
maybe your firewall or mac os to prevent the web to open.

0 commit comments

Comments
 (0)