v0.4.0

@jaschadub released this 28 Aug 20:42

[0.4.0] - 2025-08-28

Added

🧠 SLM-First Architecture (New)

  • Policy-Driven Routing Engine: Intelligent routing between Small Language Models (SLMs) and Large Language Models (LLMs)
  • Task Classification System: Automatic categorization of requests for optimal model selection
    • Task-aware routing with capability matching
    • Pattern recognition and keyword analysis for task classification
  • Confidence-Based Quality Control: Adaptive learning system for model performance tracking
    • crates/runtime/src/routing/confidence.rs: Confidence monitoring and threshold management
    • Real-time quality assessment with configurable confidence thresholds
    • Automatic fallback on low-confidence responses
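As a rough illustration of the flow described above (keyword-based task classification followed by a confidence check that falls back from the SLM to an LLM), here is a minimal sketch. The names `classify`, `select_tier`, and `ModelTier`, the keyword lists, and the variants other than `CodeGeneration` are assumptions made for this example; they are not taken from the actual routing engine.

```rust
/// Hypothetical task categories. Only `CodeGeneration` is named in these
/// release notes; the other variants are placeholders for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskType {
    CodeGeneration,
    Summarization,
    Other,
}

/// Which class of model should produce the final answer (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Slm,
    Llm,
}

/// Classify a prompt by simple keyword matching (illustrative keyword lists).
pub fn classify(prompt: &str) -> TaskType {
    let p = prompt.to_lowercase();
    if ["fn ", "function", "implement", "code"].iter().any(|k| p.contains(*k)) {
        TaskType::CodeGeneration
    } else if ["summarize", "summary", "tl;dr"].iter().any(|k| p.contains(*k)) {
        TaskType::Summarization
    } else {
        TaskType::Other
    }
}

/// SLM-first decision: keep the SLM's answer when its confidence reaches the
/// threshold, otherwise fall back to the LLM.
pub fn select_tier(slm_confidence: f64, threshold: f64) -> ModelTier {
    if slm_confidence >= threshold {
        ModelTier::Slm
    } else {
        ModelTier::Llm
    }
}
```

With a threshold of 0.8, `select_tier(0.9, 0.8)` keeps the SLM response, while `select_tier(0.5, 0.8)` triggers the fallback.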

⚡ Performance & Reliability

  • Thread-Safe Operations: Full async/await support with proper concurrency handling
  • Error Recovery: Graceful fallback mechanisms with exponential backoff retry logic
  • Runtime Configuration: Dynamic policy updates and threshold adjustments without restart
  • Comprehensive Logging: Detailed audit trail of routing decisions and performance metrics
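The retry behavior mentioned under Error Recovery can be pictured roughly as follows. This is a synchronous, standard-library-only sketch; the runtime itself is described as async, and `backoff_delay` and `retry_with_fallback` are hypothetical names chosen for this example rather than functions from the crate.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Saturating arithmetic avoids overflow on large attempts.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Try `primary` up to `max_attempts` times, sleeping with exponential
/// backoff between failures, then use `fallback` if every attempt failed.
pub fn retry_with_fallback<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut primary: impl FnMut() -> Result<T, E>,
    fallback: impl FnOnce() -> T,
) -> T {
    for attempt in 0..max_attempts {
        match primary() {
            Ok(value) => return value,
            // Only sleep if another attempt will follow.
            Err(_) if attempt + 1 < max_attempts => {
                std::thread::sleep(backoff_delay(base, max, attempt));
            }
            Err(_) => {}
        }
    }
    fallback()
}
```

With a 100 ms base and a 1 s cap, the delays run 100 ms, 200 ms, 400 ms, 800 ms, and then stay at 1 s.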

Improved

Routing & Model Management

  • Model Catalog Integration: Deep integration with existing model catalog for SLM selection
  • Resource Management: Intelligent resource allocation and constraint handling
  • Load Balancing: Multiple strategies for distributing requests across available models
  • Scheduler Integration: Seamless integration with the existing agent scheduler
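The notes do not say which load-balancing strategies are included. As an assumption, the sketch below shows two common ones, round-robin and least-loaded, to give a sense of what "multiple strategies" could look like. `RoundRobin` and `least_loaded` are hypothetical names, not part of the documented API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Round-robin selector that is safe to share across threads.
pub struct RoundRobin {
    next: AtomicUsize,
}

impl RoundRobin {
    pub fn new() -> Self {
        Self { next: AtomicUsize::new(0) }
    }

    /// Return the next model name in rotation, or `None` for an empty list.
    pub fn pick<'a>(&self, models: &'a [&'a str]) -> Option<&'a str> {
        if models.is_empty() {
            return None;
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % models.len();
        Some(models[i])
    }
}

impl Default for RoundRobin {
    fn default() -> Self {
        Self::new()
    }
}

/// Least-loaded strategy: index of the model with the fewest in-flight
/// requests (the first such index on ties), or `None` for an empty slice.
pub fn least_loaded(in_flight: &[usize]) -> Option<usize> {
    in_flight
        .iter()
        .enumerate()
        .min_by_key(|&(_, &n)| n)
        .map(|(i, _)| i)
}
```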

Developer Experience

  • Comprehensive Testing: Complete test coverage for all routing components with mock implementations
  • Documentation: Extensive design documents and implementation guides
  • Configuration Validation: Enhanced validation of routing policies and model configurations

Fixed

  • Module Exports: Fixed routing module structure in crates/runtime/src/routing/mod.rs
    • Added missing pub mod config; and pub mod policy; declarations
    • Added corresponding pub use statements for proper re-exports
  • Task Type Updates: Replaced deprecated TaskType::TextGeneration with TaskType::CodeGeneration
    • Updated routing engine references throughout codebase
    • Fixed task type usage in test modules and policy evaluation
  • Import Resolution: Resolved compilation errors in routing components
    • Updated ModelLogger constructor calls to match current API
    • Fixed import paths in test modules for proper dependency resolution
  • Code Quality: Applied clippy suggestions and resolved all warnings
    • Improved code patterns and removed unused imports
    • Enhanced error handling and async operation safety
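For readers unfamiliar with the module-export fix, the sketch below mimics the shape of the change using inline modules so it compiles on its own. In the real crate, `config` and `policy` are presumably separate files declared from crates/runtime/src/routing/mod.rs. The struct names `RoutingConfig` and `RoutingPolicy` and their fields are placeholders, since the notes do not list the re-exported items.

```rust
// Inline stand-in for crates/runtime/src/routing/mod.rs.
pub mod routing {
    // The previously missing declarations.
    pub mod config {
        /// Placeholder type; the real contents of `config` are not shown
        /// in the release notes.
        #[derive(Debug, Default, PartialEq)]
        pub struct RoutingConfig {
            pub confidence_threshold: f64,
        }
    }

    pub mod policy {
        /// Placeholder type; the real contents of `policy` are not shown
        /// in the release notes.
        #[derive(Debug, Default, PartialEq)]
        pub struct RoutingPolicy {
            pub prefer_slm: bool,
        }
    }

    // Re-exports so callers can write `routing::RoutingConfig` instead of
    // `routing::config::RoutingConfig`.
    pub use self::config::RoutingConfig;
    pub use self::policy::RoutingPolicy;
}
```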

Performance Improvements

  • Routing Throughput: Optimized routing decision performance with efficient policy evaluation
  • Memory Efficiency: Reduced memory overhead in confidence monitoring and statistics tracking
  • Async Operations: Enhanced async runtime efficiency for concurrent request handling
  • Configuration Loading: Optimized configuration parsing and validation performance

Breaking Changes

  • Routing API: New routing engine interface with SLM-first architecture
  • Task Classification: Updated task type enumeration with CodeGeneration replacing TextGeneration
  • Configuration Schema: Enhanced routing configuration structure with policy-driven settings
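Downstream code that stored or matched on the old task type name may need a small shim when upgrading. The following is one possible approach, not a documented migration path. `parse_task_type` is a hypothetical helper, the string names are assumed, and the enum is reduced to the single variant named in these notes.

```rust
/// Reduced stand-in for the updated enum; the real `TaskType` has other
/// variants that are not listed in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskType {
    CodeGeneration,
}

/// Hypothetical helper: map a stored task-type name to the new enum,
/// accepting the removed `TextGeneration` name as an alias.
pub fn parse_task_type(name: &str) -> Option<TaskType> {
    match name {
        "CodeGeneration" | "TextGeneration" => Some(TaskType::CodeGeneration),
        _ => None,
    }
}
```

Whether old `TextGeneration` values should map onto `CodeGeneration` or be rejected outright depends on how your application used them, so treat the alias as a starting point.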