feat: Add GPT-5 model support with optimized parameters #9
base: main
Conversation
- Update LiteLLM translator to handle GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano)
- Exclude unsupported parameters (temperature, top_p, logprobs) for GPT-5 models
- Add GPT-5 pricing notes and documentation
- Update CLAUDE.md with GPT-5 usage examples
- Add GPT-5 model detection (gpt-5, gpt-5-mini, gpt-5-nano)
- Implement reasoning_effort and verbosity parameters for GPT-5
- Add accurate GPT-5 pricing and cost calculation
- Update token and cost tracking using actual LiteLLM response data
- Add comprehensive test suite with 18 GPT-5-specific tests
- Update CLAUDE.md and README.md with GPT-5 documentation

GPT-5 models automatically use optimized translation parameters:

- reasoning_effort="low" (faster processing, adequate quality)
- verbosity="low" (concise output without explanations)
- Temperature/top_p excluded (not supported by GPT-5)

Pricing (per 1M tokens):

- gpt-5: $1.25/$10.00 (input/output)
- gpt-5-mini: $0.25/$2.00 (recommended)
- gpt-5-nano: $0.05/$0.40 (most economical)

Test results: 121/121 tests passing (18 new + 103 existing)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
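The parameter selection described in this commit could be sketched as follows. The helper name `build_completion_params` and the exact dict layout are illustrative assumptions, not the project's actual API; only the parameter names and values come from the commit message.

```python
def build_completion_params(model_name: str, temperature: float = 0.3) -> dict:
    """Return per-model LiteLLM completion kwargs (illustrative helper)."""
    if model_name.startswith("gpt-5"):
        # GPT-5 rejects temperature/top_p/logprobs; use the new knobs instead.
        return {
            "model": model_name,
            "reasoning_effort": "low",  # faster processing, adequate quality
            "verbosity": "low",         # concise output without explanations
        }
    # Non-GPT-5 models keep the caller's configured temperature.
    return {"model": model_name, "temperature": temperature}
```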
💡 Codex Review
Here are some automated review suggestions for this pull request.
```python
if request.model == ModelType.OLLAMA:
    return f"ollama/{model_name}"
elif request.model == ModelType.OPENAI:
    # Check if it's a GPT-5 model
    if model_name.startswith("gpt-5"):
        self._is_gpt5 = True
    return model_name  # OpenAI models use their names directly
```
LiteLLMTranslator sets _is_gpt5 to True when a GPT‑5 model is requested, but the flag is never cleared when a different model is used later. If a translator instance handles a GPT‑5 request followed by GPT‑4 (or any non‑GPT‑5) request, it will still send reasoning_effort/verbosity and omit temperature, parameters that the non‑GPT‑5 models do not support. This can cause subsequent translations to fail with invalid-parameter errors or run without the caller’s configured temperature. The GPT‑5 detection should be recomputed for each request (e.g., reset the flag to False before checking or derive it locally) to avoid leaking state across translations.
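One way to address this, per the review's suggestion to derive the flag locally: compute GPT-5 detection fresh on every call instead of caching it on the instance. The function and class names below are illustrative, not the repository's actual code.

```python
def resolve_model(model_type: str, model_name: str) -> tuple[str, bool]:
    """Return (litellm model string, is_gpt5), recomputed per request.

    Because is_gpt5 is derived here rather than stored on the translator,
    a GPT-4 request after a GPT-5 request cannot inherit stale state.
    """
    is_gpt5 = model_name.startswith("gpt-5")
    if model_type == "ollama":
        return f"ollama/{model_name}", False
    return model_name, is_gpt5
```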
- Remove reasoning_effort and verbosity parameters that caused GPT-5 to produce only reasoning tokens without output content
- Use max_completion_tokens instead of max_tokens for GPT-5 models
- Increase token limit (2x) for GPT-5 to accommodate both reasoning and output
- Add debug logging for troubleshooting translation issues
- Update .gitignore to exclude local translation files and macOS files

This fixes an issue where GPT-5 models would consume the entire token budget on internal reasoning without producing any translated output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
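The token-parameter switch in this commit could be sketched like so. The 2x factor comes from the commit message; the helper name is an assumption for illustration.

```python
def token_limit_kwargs(model_name: str, max_tokens: int) -> dict:
    """Pick the token-limit kwarg appropriate for the model family."""
    if model_name.startswith("gpt-5"):
        # GPT-5 reasoning tokens count against the completion budget,
        # so double the limit to leave room for actual output.
        return {"max_completion_tokens": max_tokens * 2}
    return {"max_tokens": max_tokens}
```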
Add a modern, minimalist Electron-based graphical interface for Tinbox with the following features:

**Frontend (Electron + React + TypeScript)**

- Drag-and-drop file upload with visual feedback
- Real-time translation progress via WebSocket
- Batch processing queue with concurrent job management
- Settings modal with secure API key storage (Electron Store)
- Minimalist design with Tailwind CSS
- Full TypeScript type safety

**Backend (FastAPI)**

- REST API server with endpoints for translation, cost estimation, and job management
- WebSocket support for real-time progress updates
- Job queue system with configurable concurrency limits
- Model and language listing endpoints
- API key validation

**Key Features**

- Supports PDF, DOCX, and TXT files
- Cost estimation before translation
- Real-time progress bars and status updates
- Persistent settings and API keys
- Clean, modern UI with smooth animations

**Installation**

- Python: `uv pip install -e ".[gui,pdf,docx]"`
- Node: `cd electron && npm install`
- Run: `npm run dev`

See QUICKSTART_GUI.md and electron/README.md for usage instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
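The concurrency-limited job queue described for the backend could be sketched with plain asyncio. This is a minimal illustration of the pattern, not the PR's actual implementation; names and structure are assumptions.

```python
import asyncio

async def run_jobs(jobs, max_concurrent: int = 2) -> list:
    """Run async job callables, at most max_concurrent at a time."""
    sem = asyncio.Semaphore(max_concurrent)
    results = []

    async def worker(job):
        async with sem:  # blocks when the concurrency limit is reached
            results.append(await job())

    await asyncio.gather(*(worker(j) for j in jobs))
    return results
```

In the real server each job would be a translation task reporting progress over the WebSocket; here the jobs are just awaitables.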
Summary
This PR adds complete support for OpenAI's GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano) with automatic parameter optimization and accurate cost tracking.
Changes
Core Implementation
- Use `reasoning_effort="low"` and `verbosity="low"` instead of temperature

Pricing (per 1M tokens)

| Model | Input | Output |
| --- | --- | --- |
| gpt-5 | $1.25 | $10.00 |
| gpt-5-mini | $0.25 | $2.00 |
| gpt-5-nano | $0.05 | $0.40 |
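The per-model cost calculation could be computed from the per-1M-token prices listed in this PR. A `calculate_model_cost()` function is named in the file list for `src/tinbox/core/cost.py`, but the signature and dict layout below are assumptions for illustration.

```python
# (input, output) USD per 1M tokens, from the pricing listed in this PR
GPT5_PRICING = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}

def calculate_model_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for a request against a GPT-5 model."""
    price_in, price_out = GPT5_PRICING[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```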
Files Changed
- `src/tinbox/core/cost.py`: Added `calculate_model_cost()` function and GPT-5 pricing
- `src/tinbox/core/translation/litellm.py`: Updated parameter handling and cost tracking
- `tests/test_gpt5_support.py`: Comprehensive test suite with 18 GPT-5-specific tests
- `tests/test_core/test_translation/test_litellm.py`: Updated cost assertions
- `CLAUDE.md`: Enhanced documentation with GPT-5 usage examples
- `README.md`: Added GPT-5 models section with pricing table

Usage Examples
Testing
Technical Details
GPT-5 models don't support the `temperature`, `top_p`, or `logprobs` parameters used by GPT-4. Instead, they use:

- `reasoning_effort`: Controls thinking depth (set to "low" for faster translation)
- `verbosity`: Controls output conciseness (set to "low" to avoid explanations)

The implementation automatically detects GPT-5 models and applies the correct parameters while excluding unsupported ones.