|
| 1 | +## about xinference |
| 2 | +xinference is a unified inference platform that provides a unified interface for different inference engines. It supports LLM, text generation, image generation, and more.but it's not bigger than swift too much. |
| 3 | + |
| 4 | +## xinference install |
| 5 | +```shell |
| 6 | +pip install "xinference[all]" |
| 7 | +``` |
| 8 | + |
| 9 | +## quick start |
| 10 | +1. start xinference |
| 11 | +```shell |
| 12 | +xinference |
| 13 | +``` |
| 14 | +2. start the web ui. |
| 15 | +3. Search for "MiniCPM-Llama3-V-2_5" in the search box. |
| 16 | +[alt text](../assets/xinferenc_demo_image/xinference_search_box.png) |
| 17 | +4. find and click the MiniCPM-Llama3-V-2_5 button. |
| 18 | +5. follow the config and launch the model. |
| 19 | +```plaintext |
| 20 | +Model engine : Transformers |
| 21 | +model format : pytorch |
| 22 | +Model size : 8 |
| 23 | +quantization : none |
| 24 | +N-GPU : auto |
| 25 | +Replica : 1 |
| 26 | +``` |
| 27 | +6. after first click the launch button,xinference will download the model from huggingface. we should click the webui button. |
| 28 | + |
| 29 | +7. upload the image and chatting with the MiniCPM-Llama3-V-2_5 |
| 30 | + |
| 31 | +## local MiniCPM-Llama3-V-2_5 launch |
| 32 | +1. start xinference |
| 33 | +```shell |
| 34 | +xinference |
| 35 | +``` |
| 36 | +2. start the web ui. |
| 37 | +3. To register a new model, follow these steps: the settings highlighted in red are fixed and cannot be changed, whereas others are customizable according to your needs. Complete the process by clicking the 'Register Model' button. |
| 38 | + |
| 39 | + |
| 40 | +4. After completing the model registration, proceed to 'Custom Models' and locate the model you just registered. |
| 41 | +5. follow the config and launch the model. |
| 42 | +```plaintext |
| 43 | +Model engine : Transformers |
| 44 | +model format : pytorch |
| 45 | +Model size : 8 |
| 46 | +quantization : none |
| 47 | +N-GPU : auto |
| 48 | +Replica : 1 |
| 49 | +``` |
| 50 | +6. after first click the launch button,xinference will download the model from huggingface. we should click the chat button. |
| 51 | + |
| 52 | +7. upload the image and chatting with the MiniCPM-Llama3-V-2_5 |
| 53 | + |
| 54 | +## FAQ |
| 55 | +1. Why can't the sixth step open the WebUI? |
| 56 | +maybe your firewall or mac os to prevent the web to open. |
0 commit comments