Open
Description
LocalAI version:
https://github.com/go-skynet/LocalAI/tree/30f120ee6a7979aa4ffbd94ccdc1e58b94155a44
Environment, CPU architecture, OS, and Version:
Darwin N7L493PWK4 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul 5 22:22:05 PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T6000 arm64
Describe the bug
When placing a model name that doesn't exist, two issues happen:
- The
curl
takes over 1 minute - It prints
all backends returned error: 23 errors occurred
To Reproduce
I am using a model name foo.bin
(chosen because it doesn't match any downloaded):
> curl http://localhost:8080/models
{"object":"list","data":[{"id":"thebloke__wizardlm-13b-v1-0-uncensored-superhot-8k-ggml__wizardlm-13b-v1.0-superhot-8k.ggmlv3.q4_k_m.bin","object":"model"}]}
> curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
"model": "foo.bin",
"prompt": "A long time ago in a galaxy far, far away",
"temperature": 0.7
}'
{"error":{"code":500,"message":"could not load model - all backends returned error: 23 errors occurred:\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unknown desc = failed loading model\n\t* could not load model: rpc error: code = Unavailable desc = error reading from server: EOF\n\t* could not load model: rpc error: code = Unknown desc = stat /models/foo.bin: no such file or directory\n\t* could not load model: rpc error: code = Unknown desc = stat /models/foo.bin: no such file or directory\n\t* could not load model: rpc error: code = Unknown desc = unsupported model type /models/foo.bin (should end with .onnx)\n\t* backend unsupported: /build/extra/grpc/huggingface/huggingface.py\n\t* backend unsupported: /build/extra/grpc/autogptq/autogptq.py\n\t* backend unsupported: /build/extra/grpc/bark/ttsbark.py\n\t* backend unsupported: /build/extra/grpc/diffusers/backend_diffusers.py\n\t* backend unsupported: /build/extra/grpc/exllama/exllama.py\n\n","type":""}}
Formatting that message
:
could not load model - all backends returned error: 23 errors occurred:
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unknown desc = failed loading model
* could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
* could not load model: rpc error: code = Unknown desc = stat /models/foo.bin: no such file or directory
* could not load model: rpc error: code = Unknown desc = stat /models/foo.bin: no such file or directory
* could not load model: rpc error: code = Unknown desc = unsupported model type /models/foo.bin (should end with .onnx)
* backend unsupported: /build/extra/grpc/huggingface/huggingface.py
* backend unsupported: /build/extra/grpc/autogptq/autogptq.py
* backend unsupported: /build/extra/grpc/bark/ttsbark.py
* backend unsupported: /build/extra/grpc/diffusers/backend_diffusers.py
* backend unsupported: /build/extra/grpc/exllama/exllama.py
Expected behavior
A desirable behavior is:
curl
returns quickly (<1 sec) when non-installed model- Response has a better error message, saying the model doesn't match any locally installed model
In other words, the current behavior is a massive error message, and takes forever to actually show up. Ideally if a model isn't installed, a response comes immediately with a concise error message