feat: Implement OpenAI-compatible Batch Processing API #25
RichardAtCT wants to merge 6 commits into main from
Conversation
Fixes #19. The application was attempting to import MCPServerConfig from src.models, but it is actually defined in src.mcp_client as a dataclass. Updated the import statement to import from the correct module.
Changes:
- Remove MCPServerConfig from the src.models import
- Add MCPServerConfig to the src.mcp_client import
Also fixes an import issue that would have caused the container to crash after the MCPServerConfig fix: the import was missing the src. prefix.
Fixes incorrect parameter usage in rate_limit_endpoint decorators.
The decorator expects an endpoint category string (e.g., "general",
"chat") but was being called with numeric limit= parameters.
Changes:
- Replace @rate_limit_endpoint(limit=X) with @rate_limit_endpoint("general")
- Affects 9 endpoints: /v1/tools/* and /v1/mcp/* endpoints
- Also includes the tool_manager.py constants import fix
All import and decorator issues are now resolved. The Docker container starts successfully, and all endpoints are properly configured.
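As a sketch of the contract this fix restores: the decorator takes a single endpoint-category string, not a numeric limit. Only the rate_limit_endpoint name and the category strings come from the commit; CATEGORY_LIMITS and its values are assumptions for illustration.

```python
import functools

# Hypothetical per-category limits; the real project presumably reads these
# from its configuration rather than a module-level dict.
CATEGORY_LIMITS = {"general": 60, "chat": 20}

def rate_limit_endpoint(category: str):
    """Decorator that takes an endpoint *category string*, not a numeric limit."""
    if category not in CATEGORY_LIMITS:
        raise ValueError(f"Unknown rate-limit category: {category!r}")
    limit = CATEGORY_LIMITS[category]

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # A real limiter would count calls here; this sketch only
            # demonstrates the decorator's signature.
            return func(*args, **kwargs)
        wrapper.rate_limit = limit  # attached for introspection in this sketch
        return wrapper
    return decorator

# Correct usage (the fix): pass a category string.
@rate_limit_endpoint("general")
def list_tools():
    return ["tool-a", "tool-b"]

# The buggy form, @rate_limit_endpoint(limit=60), raises a TypeError because
# the decorator accepts no keyword argument named "limit".
```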
Add comprehensive batch processing support with file-based persistence:
Core Features:
- OpenAI-compatible /v1/batches API endpoints
- Asynchronous background processing with FastAPI BackgroundTasks
- File-based persistence (survives server restarts)
- Sequential request processing for predictable resource usage
- JSONL format for input and output files
- Complete status tracking (validating → in_progress → completed)
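To make the JSONL input format concrete, here is a sketch of building one request line per batch item. The custom_id/method/url/body shape follows OpenAI's batch input format; the model name and helper function are placeholders, not code from this PR.

```python
import json

def make_batch_line(custom_id: str, model: str, prompt: str) -> str:
    """Serialize one batch request as a single JSONL line
    (custom_id/method/url/body is the OpenAI batch input shape)."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

lines = [
    make_batch_line("request-1", "example-model", "Hello"),
    make_batch_line("request-2", "example-model", "World"),
]
# Write this to input.jsonl and upload it via POST /v1/files.
jsonl = "\n".join(lines)
```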
New Modules:
- src/batch_manager.py: Batch job lifecycle management
- src/file_storage.py: JSONL file upload/download handling
- src/models.py: Batch-related Pydantic models (BatchRequest, BatchJob, etc.)
API Endpoints:
- POST /v1/files: Upload JSONL batch input files
- POST /v1/batches: Create batch jobs from uploaded files
- GET /v1/batches/{batch_id}: Retrieve batch status
- GET /v1/batches: List all batch jobs
- POST /v1/batches/{batch_id}/cancel: Cancel running batches
- GET /v1/files/{file_id}: Get file metadata
- GET /v1/files/{file_id}/content: Download file content
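An illustrative client-side sketch of the endpoint flow above, building (but not sending) the HTTP requests. The base URL is a placeholder, and the create-batch body fields (input_file_id, endpoint, completion_window) follow OpenAI's batch-create API; this PR's server may accept a subset.

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # placeholder; point at your deployment

def build_create_batch_request(input_file_id: str) -> urllib.request.Request:
    """Build the POST /v1/batches request for a previously uploaded file."""
    body = json.dumps({
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
    }).encode()
    return urllib.request.Request(
        f"{BASE}/v1/batches",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_status_request(batch_id: str) -> urllib.request.Request:
    """Build the GET /v1/batches/{batch_id} polling request."""
    return urllib.request.Request(f"{BASE}/v1/batches/{batch_id}", method="GET")

# Typical flow: POST /v1/files, then build_create_batch_request(file_id),
# poll build_status_request(batch_id) until the status is "completed",
# then download results via GET /v1/files/{file_id}/content.
```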
Additional:
- tests/test_batch_basic.py: Basic workflow tests
- examples/batch_example.py: Complete usage example
- Updated .env.example with batch configuration
- Updated README.md with batch API documentation
Configuration:
- 7-day file retention with automatic cleanup
- 100MB max file size
- 50,000 max requests per batch
- Configurable via environment variables
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Resolve conflicts in src/main.py imports section. Applied black formatting to batch-related files. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
Claude finished @RichardAtCT's task

Code Review: OpenAI-Compatible Batch Processing API

I've completed my review of this comprehensive batch processing implementation. This is a well-structured and thoughtfully designed feature. Here's my detailed feedback:

✅ Strengths

Excellent Architecture & Design
Robust Implementation
Enhance batch API with robust validation and timeout protection.

File Validation:
- Validate file extension (.jsonl required)
- Check MIME type (warn if incorrect, don't reject)
- Validate JSONL content format (first line must be valid JSON)
- Validate UTF-8 encoding
- Reject empty files
- Clear error messages for all validation failures

Request Timeout Handling:
- Add 5-minute timeout for individual batch requests
- Graceful timeout error handling with specific error codes
- Timeout errors return HTTP 408 status
- Separate timeout tracking in error responses

Configuration:
- BATCH_REQUEST_TIMEOUT_SECONDS constant (300s default)
- Configurable via environment variables

Testing:
- test_file_extension_validation: Reject non-.jsonl files
- test_empty_file_validation: Reject empty files
- test_invalid_jsonl_format: Validate JSON format
- test_non_utf8_file: Validate UTF-8 encoding

Error Codes:
- File validation: HTTP 400 with descriptive messages
- Request timeout: HTTP 408 with timeout_error type
- Processing errors: HTTP 500 with processing_error type

🤖 Generated with [Claude Code](https://claude.com/claude-code)
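A sketch of the validation checks and timeout guard this commit describes. The helper names are mine; only the individual checks, the 300-second default, and the error-type/status-code mapping come from the commit message.

```python
import asyncio
import json

BATCH_REQUEST_TIMEOUT_SECONDS = 300  # 5-minute default, per the commit

def validate_batch_file(filename: str, data: bytes) -> None:
    """Raise ValueError (mapped to HTTP 400 upstream) for invalid input files."""
    if not filename.endswith(".jsonl"):
        raise ValueError("File must have a .jsonl extension")
    if not data:
        raise ValueError("File is empty")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError("File must be UTF-8 encoded")
    try:
        json.loads(text.splitlines()[0])  # first line must be valid JSON
    except json.JSONDecodeError:
        raise ValueError("First line is not valid JSON")

async def run_with_timeout(coro, timeout: float = BATCH_REQUEST_TIMEOUT_SECONDS):
    """Run one batch request with a timeout; timeouts map to HTTP 408 upstream."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        return {"error": {"type": "timeout_error", "code": 408}}
```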
Claude finished @RichardAtCT's task

PR Review: OpenAI-Compatible Batch Processing API

I've conducted a comprehensive review of this batch processing implementation. This is a high-quality, well-architected feature that successfully achieves its MVP goals while following best practices. Review Tasks:

✅ Major Strengths

Excellent Architecture & Design
Robust Implementation
Strong Test Coverage
Summary
Implements OpenAI's /v1/batches API for asynchronous batch processing of multiple chat completion requests. Closes #21
Features Implemented
Core Functionality
- OpenAI-compatible /v1/batches API endpoints

New Modules
- src/batch_manager.py - Batch job lifecycle management
- src/file_storage.py - JSONL file upload/download handling
- src/models.py - Batch-related Pydantic models

API Endpoints
- POST /v1/files - Upload JSONL batch input files
- POST /v1/batches - Create batch jobs from uploaded files
- GET /v1/batches/{batch_id} - Retrieve batch status and details
- GET /v1/batches - List all batch jobs
- POST /v1/batches/{batch_id}/cancel - Cancel running batches
- GET /v1/files/{file_id} - Get file metadata
- GET /v1/files/{file_id}/content - Download file content (results)

Implementation Details

Architecture Decisions (MVP Scope)
- File-based persistence in ./batch_storage/

Configuration
Testing

New Tests
- tests/test_batch_basic.py - Core workflow tests

Example Usage
- examples/batch_example.py - Complete working example

Documentation
- Updated .env.example

Test Plan
Breaking Changes
None - This is a new feature addition with no changes to existing APIs.
🤖 Generated with Claude Code