
Error on chat with loaded IBM Granite 3.3 2B model from the LocalAI repository #5216

Open
@bhaskars-repo

Description


Successfully installed the IBM Granite 3.3 2B model from the LocalAI repository.

Trying to chat with the model using the following command:

curl -s http://192.168.1.25:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "ibm-granite_granite-3.3-2b-instruct",
"messages": [{"role": "user", "content": "Describe llm model using less than 50 words"}],
"temperature": 0.2
}' | jq

results in a failure; the relevant debug log output is:

1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stderr common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stderr common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stdout {"timestamp":1745071171,"level":"INFO","function":"initialize","line":574,"message":"initializing slots","n_slots":1}
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stderr set_warmup: value = 0
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stdout {"timestamp":1745071171,"level":"INFO","function":"initialize","line":583,"message":"new slot","slot_id":0,"n_ctx_slot":4096}
1:59PM INF [llama-cpp] Loads OK
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stdout {"timestamp":1745071171,"level":"INFO","function":"launch_slot_with_data","line":956,"message":"slot is processing task","slot_id":0,"task_id":0}
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stdout {"timestamp":1745071171,"level":"INFO","function":"update_slots","line":1888,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":0,"p0":0}
1:59PM DBG GRPC(ibm-granite_granite-3.3-2b-instruct-127.0.0.1:36571): stderr get_logits_ith: invalid logits id 21, reason: no logits
1:59PM ERR Server error error="rpc error: code = Unavailable desc = error reading from server: EOF" ip=192.168.1.25 latency=9.302167221s method=POST status=500 url=/v1/chat/completions
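For reference, the same request can be issued programmatically. Below is a minimal Python sketch (assumes the requests library is installed; the endpoint, model name, and payload mirror the curl call above). The error appears to be server-side (the llama-cpp backend reports "invalid logits id ... no logits" before the gRPC connection drops), so the choice of client should not matter.

import requests

# LocalAI chat completions endpoint (same host/port as the curl call above)
url = "http://192.168.1.25:8080/v1/chat/completions"

payload = {
    "model": "ibm-granite_granite-3.3-2b-instruct",
    "messages": [
        {"role": "user", "content": "Describe llm model using less than 50 words"}
    ],
    "temperature": 0.2,
}

resp = requests.post(url, json=payload, timeout=120)
print(resp.status_code)  # currently 500, matching the ERR line in the log above
print(resp.text)         # body of the error response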
