Closed
Description
For more reliable measurements, we need to preload the Ollama model before sending prompts to it. We also need to clean up (unload the model) afterwards.
Tasks:
- Check how to preload models - https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-pre-load-a-model-to-get-faster-response-times
- Check how to unload models - https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately
- Check how to query if a model is loaded - https://www.reddit.com/r/ollama/comments/1cex92f/possible_to_show_currently_loaded_models_via_api/
  - The newest version has `ollama ps` to check all loaded models. However, if we use an empty prompt request to trigger model preloading, we can be sure that once the API answers this request, the model is indeed loaded (see here).
- Implement it into the evaluation run to preload ollama models when they should be used
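
The preload and cleanup steps above could be sketched roughly as follows. This is a minimal illustration based on the linked FAQ entries, not the project's actual implementation: a request to `/api/generate` without a prompt loads the model, and the same request with `keep_alive` set to `0` unloads it. The helper names (`preload_payload`, `unload_payload`, `post_generate`, `loaded_model`), the default URL, and the use of `"stream": False` to get a single JSON reply are assumptions made for this example.

```python
import json
import urllib.request
from contextlib import contextmanager

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def preload_payload(model: str) -> dict:
    """Body that loads `model` without generating text (no prompt given)."""
    return {"model": model, "stream": False}


def unload_payload(model: str) -> dict:
    """Body that asks Ollama to unload `model` right away (keep_alive of 0)."""
    return {"model": model, "stream": False, "keep_alive": 0}


def post_generate(payload: dict, base_url: str = OLLAMA_URL) -> dict:
    """POST the payload to /api/generate and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


@contextmanager
def loaded_model(model: str, send=post_generate):
    """Preload `model` on entry and unload it on exit, even if the body fails.

    `send` is injectable so the flow can be exercised without a server.
    """
    send(preload_payload(model))  # returns only after the API has answered
    try:
        yield
    finally:
        send(unload_payload(model))
```

A hypothetical evaluation run would wrap its prompts in `with loaded_model("some-model"):` so that model load time is excluded from the measured requests and the memory is released afterwards.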
Metadata
Labels
enhancement: New feature or request