
"Illegal instruction" when trying to run the server using a precompiled docker image #272

Closed
@vmajor

Description


Expected Behavior

I am trying to execute this:

docker run --rm -it -p 8000:8000 -v /home/xxxx/models:/models -e MODEL=/models/gpt4-alpaca-lora_mlp-65B/gpt4-alpaca-lora_mlp-65B.ggmlv3.q5_1.bin ghcr.io/abetlen/llama-cpp-python:latest

and I expect the model to load and the server to start. I am using a model quantized by TheBloke according to the current latest specs of the llama.cpp GGML implementation.

Current Behavior

llama.cpp: loading model from /models/gpt4-alpaca-lora_mlp-65B/gpt4-alpaca-lora_mlp-65B.ggmlv3.q5_1.bin
Illegal instruction

Environment and Context

Linux DESKTOP-xxx 5.15.68.1-microsoft-standard-WSL2+ #2 SMP

$ python3 --version
Python 3.10.9
$ make --version
GNU Make 4.3
$ g++ --version
g++ (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
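An "Illegal instruction" crash from a precompiled binary usually means it was built with SIMD instructions (e.g. AVX2) that the host CPU, or the WSL2 virtual CPU, does not advertise. A quick way to check is to compare the flags in /proc/cpuinfo against the instruction sets llama.cpp builds commonly assume; this is a diagnostic sketch, and the `missing_flags` helper is illustrative, not part of llama-cpp-python:

```python
# Sketch: report which SIMD flags (as seen inside WSL2/Docker) are absent
# from /proc/cpuinfo. A missing flag is a likely cause of SIGILL when
# running a binary compiled with that instruction set enabled.
def missing_flags(cpuinfo_text, wanted=("avx", "avx2", "fma", "f16c")):
    """Return the entries of `wanted` that do not appear in any 'flags' line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return [f for f in wanted if f not in flags]

if __name__ == "__main__":
    with open("/proc/cpuinfo") as fh:
        print("missing:", missing_flags(fh.read()))
```

If any flag is reported missing, rebuilding the image (or llama.cpp itself) on the target machine, rather than using the prebuilt `ghcr.io` image, avoids the mismatch.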

Steps to Reproduce


docker run --rm -it -p 8000:8000 -v /home/xxxx/models:/models -e MODEL=/models/gpt4-alpaca-lora_mlp-65B/gpt4-alpaca-lora_mlp-65B.ggmlv3.q5_1.bin ghcr.io/abetlen/llama-cpp-python:latest
