Releases
v0.4.0
[0.4.0] - 2025-08-28
Added
🧠 SLM-First Architecture (New)
Policy-Driven Routing Engine : Intelligent routing between Small Language Models (SLMs) and Large Language Models (LLMs)
Task Classification System : Automatic categorization of requests for optimal model selection
Task-aware routing with capability matching
Pattern recognition and keyword analysis for task classification
Confidence-Based Quality Control : Adaptive learning system for model performance tracking
crates/runtime/src/routing/confidence.rs : Confidence monitoring and threshold management
Real-time quality assessment with configurable confidence thresholds
Automatic fallback on low-confidence responses
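The classification and fallback behavior listed above can be pictured with a small, std-only sketch. It is not the code in crates/runtime/src/routing; every name here (classify, ModelResponse, Route, route_slm_first, the keyword lists, and the Summarization and Other variants) is an assumption made for illustration. Only TaskType::CodeGeneration appears in these notes.

```rust
/// Hypothetical task categories; only `CodeGeneration` is named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskType {
    CodeGeneration,
    Summarization,
    Other,
}

/// Classify a prompt by simple keyword matching (illustrative keyword lists).
pub fn classify(prompt: &str) -> TaskType {
    const CODE: [&str; 4] = ["fn ", "function", "implement", "compile"];
    const SUMMARY: [&str; 3] = ["summarize", "summary", "tl;dr"];
    let p = prompt.to_lowercase();
    if CODE.iter().any(|k| p.contains(*k)) {
        TaskType::CodeGeneration
    } else if SUMMARY.iter().any(|k| p.contains(*k)) {
        TaskType::Summarization
    } else {
        TaskType::Other
    }
}

/// A model answer together with a self-reported confidence in [0.0, 1.0].
#[derive(Debug, Clone, PartialEq)]
pub struct ModelResponse {
    pub text: String,
    pub confidence: f64,
}

/// Which model produced the final answer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Slm,
    LlmFallback,
}

/// Try the small model first; fall back to the large model when the
/// small model's confidence is below `threshold`.
pub fn route_slm_first<S, L>(
    prompt: &str,
    threshold: f64,
    slm: S,
    llm: L,
) -> (Route, ModelResponse)
where
    S: Fn(&str) -> ModelResponse,
    L: Fn(&str) -> ModelResponse,
{
    let first = slm(prompt);
    if first.confidence >= threshold {
        (Route::Slm, first)
    } else {
        (Route::LlmFallback, llm(prompt))
    }
}
```

In this sketch the models are passed in as closures so the routing decision can be exercised without any real backend; a production router would presumably also record the outcome to adapt its thresholds over time.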
⚡ Performance & Reliability
Thread-Safe Operations : Full async/await support with proper concurrency handling
Error Recovery : Graceful fallback mechanisms with exponential backoff retry logic
Runtime Configuration : Dynamic policy updates and threshold adjustments without restart
Comprehensive Logging : Detailed audit trail of routing decisions and performance metrics
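As a rough illustration of the exponential backoff retry logic mentioned under Error Recovery, the sketch below doubles the delay after each failed attempt and caps it at a maximum. It is synchronous and std-only for brevity, whereas the runtime itself is described as async; backoff_delay and retry_with_backoff are hypothetical names, not functions from the project.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before the retry that follows failed attempt `attempt` (0-based):
/// `base * 2^attempt`, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    // On arithmetic overflow, fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Call `op` until it succeeds or `max_attempts` calls have failed.
/// `op` receives the 0-based attempt number and always runs at least once.
pub fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

With a base of 100 ms and a cap of 1 s, the delays in this sketch grow as 100, 200, 400, 800 ms and then stay at 1 s. An async variant would replace the blocking sleep with the runtime's timer.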
Improved
Routing & Model Management
Model Catalog Integration : Deep integration with existing model catalog for SLM selection
Resource Management : Intelligent resource allocation and constraint handling
Load Balancing : Multiple strategies for distributing requests across available models
Scheduler Integration : Seamless integration with the existing agent scheduler
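The notes do not say which load-balancing strategies are available. Purely as an example of what one such strategy could look like, here is a thread-safe round-robin picker; the RoundRobin type and the model names used with it are assumptions, not part of the project's API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin selector over a fixed, non-empty list of model names.
pub struct RoundRobin {
    models: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    /// Returns `None` for an empty list so `pick` can never index out of bounds.
    pub fn new(models: Vec<String>) -> Option<Self> {
        if models.is_empty() {
            None
        } else {
            Some(Self { models, next: AtomicUsize::new(0) })
        }
    }

    /// Pick the next model in rotation; safe to call from several threads
    /// because the counter is advanced atomically.
    pub fn pick(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.models.len();
        &self.models[i]
    }
}
```

Round-robin ignores model load and capability; strategies that weigh resource constraints would need more state than a single counter.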
Developer Experience
Comprehensive Testing : Complete test coverage for all routing components with mock implementations
Documentation : Extensive design documents and implementation guides
Configuration Validation : Enhanced validation of routing policies and model configurations
Fixed
Module Exports : Fixed routing module structure in crates/runtime/src/routing/mod.rs
Added missing pub mod config; and pub mod policy; declarations
Added corresponding pub use statements for proper re-exports
Task Type Updates : Replaced deprecated TaskType::TextGeneration with TaskType::CodeGeneration
Updated routing engine references throughout codebase
Fixed task type usage in test modules and policy evaluation
Import Resolution : Resolved compilation errors in routing components
Updated ModelLogger constructor calls to match current API
Fixed import paths in test modules for proper dependency resolution
Code Quality : Applied clippy suggestions and resolved all warnings
Improved code patterns and removed unused imports
Enhanced error handling and async operation safety
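For readers unfamiliar with the Module Exports fix listed above, the pattern is a pub mod declaration per submodule plus a pub use re-export, so callers can name a type from the parent module. The sketch below uses inline modules so it compiles on its own; in the real crate config and policy would be separate files, and the type names RoutingConfig and RoutingPolicy are placeholders, since the notes do not name the re-exported items.

```rust
pub mod routing {
    // In the real crate these would be `pub mod config;` and `pub mod policy;`
    // pointing at config.rs and policy.rs. Inline bodies keep the sketch self-contained.
    pub mod config {
        /// Placeholder type; the actual contents of config.rs are not given in the notes.
        #[derive(Debug, Default, Clone, PartialEq)]
        pub struct RoutingConfig {
            pub confidence_threshold: f64,
        }
    }

    pub mod policy {
        /// Placeholder type; the actual contents of policy.rs are not given in the notes.
        #[derive(Debug, Default, Clone, PartialEq)]
        pub struct RoutingPolicy {
            pub name: String,
        }
    }

    // Re-exports: callers can write `routing::RoutingConfig`
    // instead of `routing::config::RoutingConfig`.
    pub use self::config::RoutingConfig;
    pub use self::policy::RoutingPolicy;
}
```

Without the pub mod lines the submodules are not compiled into the crate at all, and without the pub use lines downstream code has to spell out the full submodule path.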
Performance Improvements
Routing Throughput : Optimized routing decision performance with efficient policy evaluation
Memory Efficiency : Reduced memory overhead in confidence monitoring and statistics tracking
Async Operations : Enhanced async runtime efficiency for concurrent request handling
Configuration Loading : Optimized configuration parsing and validation performance
Breaking Changes
Routing API : New routing engine interface with SLM-first architecture
Task Classification : Updated task type enumeration with CodeGeneration replacing TextGeneration
Configuration Schema : Enhanced routing configuration structure with policy-driven settings