[Bug]: Librechat doesn't wait until Ollama has loaded model #3330

Open
1 task done
vlbosch opened this issue Jul 12, 2024 · 2 comments
Labels
bug Something isn't working

Comments

vlbosch commented Jul 12, 2024

What happened?

When starting a conversation with a model served via Ollama, the request sometimes fails prematurely with an ETIMEDOUT error. After resending the prompt it is answered correctly, but only because Ollama has finished loading the model by then. With larger models (like Command R Plus), it takes several retries before the model is loaded and the prompt succeeds.

Steps to Reproduce

  1. Start a chat with an Ollama-served model that is not yet loaded into memory
  2. Write a prompt
  3. Send the prompt
  4. See the generic error message appear

What browsers are you seeing the problem on?

Safari

Relevant log output

LibreChat error log:
{"cause":{"code":"ETIMEDOUT","errno":"ETIMEDOUT","message":"request to http://host.docker.internal:11434/v1/chat/completions failed, reason: read ETIMEDOUT","type":"system"},"level":"error","message":"[handleAbortError] AI response error; aborting request: Connection error.","stack":"Error: Connection error.\n    at OpenAI.makeRequest (/app/api/node_modules/openai/core.js:292:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async ChatCompletionStream._createChatCompletion (/app/api/node_modules/openai/lib/ChatCompletionStream.js:53:24)\n    at async ChatCompletionStream._runChatCompletion (/app/api/node_modules/openai/lib/AbstractChatCompletionRunner.js:314:16)"}

LibreChat debug log:
2024-07-12T05:36:03.543Z debug: [OpenAIClient] chatCompletion
{
  baseURL: "http://host.docker.internal:11434/v1",
    modelOptions.model: "gemma-2-27b-it-Q8_0_L.gguf:latest",
    modelOptions.temperature: 0.7,
    modelOptions.top_p: 1,
    modelOptions.presence_penalty: 0,
    modelOptions.frequency_penalty: 0,
    modelOptions.stop: undefined,
    modelOptions.max_tokens: undefined,
    modelOptions.user: "668d8e6941ec54b9987a6bcc",
    modelOptions.stream: true,
    // 2 message(s)
    modelOptions.messages: [{"role":"system","name":"instructions","content":"Instructions:\nAntwoord uitsluitend in het Nederla... [truncated],{"role":"user","content":"Dit is een test."}],
}
2024-07-12T05:36:03.548Z debug: Making request to http://host.docker.internal:11434/v1/chat/completions
2024-07-12T05:36:15.288Z debug: Making request to http://host.docker.internal:11434/v1/chat/completions
2024-07-12T05:36:27.455Z debug: Making request to http://host.docker.internal:11434/v1/chat/completions
2024-07-12T05:36:38.743Z warn: [OpenAIClient.chatCompletion][stream] API error
2024-07-12T05:36:38.743Z error: [handleAbortError] AI response error; aborting request: Connection error.
2024-07-12T05:36:38.747Z debug: [AskController] Request closed
2024-07-12T06:27:35.786Z debug: [AskController]

Ollama server log:
level=WARN source=server.go:570 msg="client connection closed before server finished loading, aborting load"
level=ERROR source=sched.go:480 msg="error loading llama server" error="timed out waiting for llama runner to start: context canceled"
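
The Ollama log above points at the cold start as the trigger: the model load is aborted because the client connection is closed before loading finishes. A possible workaround in the meantime (a sketch, not a LibreChat fix) is to pre-load the model so the /v1/chat/completions request never hits the cold-start window. Per the Ollama API, a generate request without a prompt just loads the model and keeps it resident for the keep_alive duration; a rough sketch using Node's built-in fetch, with the model name taken from the debug log above:

// Warm-up sketch (Node 18+ / ESM): run once before the first chat request.
// A generate call without a prompt only loads the model into memory.
await fetch("http://host.docker.internal:11434/api/generate", {
  method: "POST",
  body: JSON.stringify({
    model: "gemma-2-27b-it-Q8_0_L.gguf:latest", // model from the debug log
    keep_alive: "1h",                           // keep it resident for an hour
  }),
});

Setting OLLAMA_KEEP_ALIVE on the Ollama server should have the same effect as a default for all models.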

Screenshots

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
vlbosch added the bug label on Jul 12, 2024

vlbosch commented Jul 13, 2024

I tested a bit more. Even when the model is already loaded, a timeout still occurs with a large prompt. I think the timeout should be removed when using Ollama, and an error should only be shown when Ollama itself returns one.
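
For context, the stack trace in the error log goes through the openai Node SDK, which accepts timeout (in milliseconds) and maxRetries options when the client is constructed. A minimal sketch of what a relaxed timeout could look like, assuming LibreChat exposed such an option for custom endpoints (the values below are illustrative, not an existing LibreChat setting):

import OpenAI from "openai";

// Sketch only: these are standard openai-node client options, not an existing
// LibreChat setting. A long timeout covers Ollama's cold model load, and
// maxRetries: 0 lets Ollama's own error surface instead of the silent
// re-requests visible in the debug log above.
const client = new OpenAI({
  baseURL: "http://host.docker.internal:11434/v1",
  apiKey: "ollama",         // Ollama ignores the key, but the SDK requires one
  timeout: 10 * 60 * 1000,  // 10 minutes
  maxRetries: 0,
});

const stream = await client.chat.completions.create({
  model: "gemma-2-27b-it-Q8_0_L.gguf:latest",
  messages: [{ role: "user", content: "Dit is een test." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}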


vlbosch commented Aug 21, 2024

@danny-avila Did you have a chance to look into this issue? It would be great if timeouts for custom endpoints could be changed and/or disabled completely. Thanks!
