Description
The current llms.py module has the following pain points:
- Multiple wrapper classes - Separate `OpenAIClientWrapper`, `AnthropicClientWrapper`, and `BedrockClientWrapper` classes with duplicated logic
- Manual provider detection - A custom `MODEL_NAME_MAP` dictionary that needs manual updates for new models
- Different response parsing for each provider
- Maintenance burden - Adding a new provider requires implementing a new wrapper class and integrating it into the core application code (see the sketch after this list)
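For context, here is a simplified sketch of the pattern these points describe. The class and dictionary names come from the issue, but the bodies shown are illustrative, not the actual module code:

```python
# Hypothetical sketch of the current pattern: one wrapper class per provider,
# each with its own client construction and response parsing.
import openai
import anthropic


class OpenAIClientWrapper:
    def __init__(self, api_key: str):
        self._client = openai.AsyncOpenAI(api_key=api_key)

    async def chat(self, model: str, messages: list[dict]) -> str:
        resp = await self._client.chat.completions.create(model=model, messages=messages)
        return resp.choices[0].message.content  # OpenAI-shaped response


class AnthropicClientWrapper:
    def __init__(self, api_key: str):
        self._client = anthropic.AsyncAnthropic(api_key=api_key)

    async def chat(self, model: str, messages: list[dict]) -> str:
        resp = await self._client.messages.create(model=model, max_tokens=1024, messages=messages)
        return resp.content[0].text  # Anthropic-shaped response


# Manual provider routing that has to be edited for every new model name
# (entries here are illustrative).
MODEL_NAME_MAP = {"gpt-4o": "openai", "claude-3-5-sonnet": "anthropic"}
```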
Solution
Replace the custom wrapper classes with a unified `LLMClient` abstraction layer backed by LiteLLM (see the sketch after this list), which:
- Supports many LLM providers out of the box
- Handles provider detection automatically from model names
- Normalizes responses across all providers
- Provides built-in model metadata via `get_model_info()`
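A minimal sketch of what that abstraction could look like, assuming a static-method style interface (the `LLMClient` name is from this proposal; the exact signatures are an assumption):

```python
from typing import Any

import litellm


class LLMClient:
    """Provider-agnostic LLM access, delegating to LiteLLM (sketch)."""

    @staticmethod
    async def create_chat_completion(model: str, messages: list[dict], **kwargs: Any):
        # LiteLLM infers the provider from the model name (e.g. "gpt-4o",
        # "claude-3-5-sonnet", "gemini/gemini-pro") and returns an
        # OpenAI-compatible response object for every provider.
        return await litellm.acompletion(model=model, messages=messages, **kwargs)

    @staticmethod
    def get_model_info(model: str) -> dict:
        # LiteLLM's built-in metadata: provider, max tokens, cost per token, etc.
        return litellm.get_model_info(model)
```

Because LiteLLM normalizes every provider to the OpenAI response format, callers can always read `response.choices[0].message.content` regardless of the backend.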
The second goal is to separate application code from LLM-specific logic:
- Provider API changes are isolated - When OpenAI, Anthropic, or Bedrock update their APIs, LiteLLM absorbs the change. We just update the `litellm` dependency, and the rest of the codebase (`extraction.py`, `summarization.py`, `memory_strategies.py`, etc.) remains untouched. Previously, each wrapper class needed individual fixes; now any such fixes stay confined to the LLM layer.
- Zero-effort provider expansion - Adding support for new LLM providers requires no application code changes. Examples:

```python
# Gemini - just use it
await LLMClient.create_chat_completion(model="gemini/gemini-pro", messages=...)

# Cohere - just use it
await LLMClient.create_chat_completion(model="command-r-plus", messages=...)

# Local Ollama - just use it
await LLMClient.create_chat_completion(
    model="ollama/llama3", messages=..., api_base="http://localhost:11434"
)
```
With the old design, each of these would require a new `*ClientWrapper` class.
- Future enhancement / swappability - The abstraction should allow replacing LiteLLM or adding a different library without touching application code (one way to keep that seam explicit is sketched below).
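Purely as an illustration of that last point, application code could be typed against a small interface so that LiteLLM stays an implementation detail. The `ChatCompletionBackend` name and layout below are hypothetical:

```python
from typing import Any, Protocol

import litellm


class ChatCompletionBackend(Protocol):
    """The only surface application code depends on."""

    async def complete(self, model: str, messages: list[dict], **kwargs: Any) -> Any: ...


class LiteLLMBackend:
    """Current implementation; a replacement library would get its own backend class."""

    async def complete(self, model: str, messages: list[dict], **kwargs: Any) -> Any:
        return await litellm.acompletion(model=model, messages=messages, **kwargs)


# extraction.py, summarization.py, etc. would accept a ChatCompletionBackend,
# so swapping libraries means adding a backend, not editing callers.
```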