System Info
CUDA: 12.1
transformers: 4.46.1
Python: 3.10
OS: Windows 11
GPU: 3090
Running Xinference with Docker?
Version info
v0.16.2
The command used to start Xinference
1. pip install "xinference[transformers]"
2. xinference-local
Reproduction
1. pip install "xinference[transformers]"
2. xinference-local
3. Open http://localhost:9997 in a browser
4. Pull glm4-chat, select transformers as the Engine, and select 8-bit for quantization
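For reference, the Web UI steps above can also be driven from the command line. This is a sketch: the `xinference launch` flag names below reflect recent Xinference releases and may differ slightly in v0.16.2, so check `xinference launch --help` on your installed version.

```shell
# Install Xinference with the transformers backend
pip install "xinference[transformers]"

# Start a local Xinference server (serves http://localhost:9997 by default)
xinference-local

# In a second terminal: launch glm4-chat with the transformers engine
# and 8-bit quantization, mirroring the Web UI selections above
xinference launch --model-name glm4-chat \
    --model-engine transformers \
    --quantization 8-bit
```

These commands require a running server and model download, so they are shown as a reproduction aid rather than a tested script.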
Expected behavior
How can this be resolved? When downloading glm4-chat with 8-bit quantization selected, the download fails with: Server error: 500 - [address=127.0.0.1:4389, pid=15108] 'transfomg.word_embeddings.weight'