Ollama: Specify keep_alive via settings (#17906)
notpeter authored Sep 16, 2024
1 parent e66ea9e commit 67f149a
Showing 2 changed files with 6 additions and 2 deletions.
crates/language_model/src/provider/ollama.rs (4 additions, 2 deletions)

@@ -4,7 +4,7 @@ use gpui::{AnyView, AppContext, AsyncAppContext, ModelContext, Subscription, Tas
 use http_client::HttpClient;
 use ollama::{
     get_models, preload_model, stream_chat_completion, ChatMessage, ChatOptions, ChatRequest,
-    ChatResponseDelta, OllamaToolCall,
+    ChatResponseDelta, KeepAlive, OllamaToolCall,
 };
 use schemars::JsonSchema;
 use serde::{Deserialize, Serialize};

@@ -42,6 +42,8 @@ pub struct AvailableModel {
     pub display_name: Option<String>,
     /// The Context Length parameter to the model (aka num_ctx or n_ctx)
     pub max_tokens: usize,
+    /// The number of seconds to keep the connection open after the last request
+    pub keep_alive: Option<KeepAlive>,
 }

 pub struct OllamaLanguageModelProvider {

@@ -156,7 +158,7 @@ impl LanguageModelProvider for OllamaLanguageModelProvider {
                 name: model.name.clone(),
                 display_name: model.display_name.clone(),
                 max_tokens: model.max_tokens,
-                keep_alive: None,
+                keep_alive: model.keep_alive.clone(),
             },
         );
     }
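The diff imports a `KeepAlive` type from the `ollama` crate without showing its definition. Based on the documentation change below, which says the value may be either an integer number of seconds or a duration string, here is a minimal sketch of what such a type could look like. This is hypothetical: the real definition lives in the `ollama` crate and is not part of this diff.

```rust
use serde::{Deserialize, Serialize};

// Hypothetical sketch only; the actual `KeepAlive` type in the `ollama`
// crate may differ. The untagged representation lets serde accept either
// a bare integer (seconds) or a duration string such as "5m" or "1h",
// matching the settings format described in the docs change below.
#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(untagged)]
pub enum KeepAlive {
    /// e.g. `"keep_alive": 120`
    Seconds(i64),
    /// e.g. `"keep_alive": "5m"`
    Duration(String),
}
```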
docs/src/assistant/configuration.md (2 additions, 0 deletions)

@@ -152,6 +152,8 @@ Depending on your hardware or use-case you may wish to limit or increase the con
 
 If you specify a context length that is too large for your hardware, Ollama will log an error. You can watch these logs by running: `tail -f ~/.ollama/logs/ollama.log` (macOS) or `journalctl -u ollama -f` (Linux). Depending on the memory available on your machine, you may need to adjust the context length to a smaller value.
 
+You may also optionally specify a `keep_alive` value for each available model. This can be an integer (seconds) or a duration string such as "5m", "10m", "1h", or "1d". For example, `"keep_alive": "120s"` allows the remote server to unload the model (freeing up GPU VRAM) after 120 seconds.
+
 ### OpenAI {#openai}
 
 1. Visit the OpenAI platform and [create an API key](https://platform.openai.com/account/api-keys)
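With this change, `keep_alive` can be set per model in Zed's `settings.json`. A sketch of what that could look like, assuming the `language_models` / `available_models` layout documented elsewhere in configuration.md; the model name and token count here are placeholders:

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          "name": "llama3.1:latest",
          "display_name": "Llama 3.1",
          "max_tokens": 32768,
          "keep_alive": "10m"
        }
      ]
    }
  }
}
```

Per the documentation change above, a bare integer such as `"keep_alive": 120` (seconds) should also be accepted.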
