Conversation

@tristan-mcinnis
Summary

This PR adds complete support for OpenAI's GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano) with automatic parameter optimization and accurate cost tracking.

Changes

Core Implementation

  • GPT-5 Detection: Automatically detects GPT-5 models and applies appropriate parameters
  • Parameter Optimization: Uses reasoning_effort="low" and verbosity="low" instead of temperature
  • Cost Tracking: Implements accurate model-specific pricing for all three GPT-5 variants
  • Backward Compatibility: Maintains full compatibility with existing models (GPT-4, Claude, Ollama, etc.)

Pricing (per 1M tokens)

  • gpt-5: $1.25 / $10.00 (input/output) - Highest quality
  • gpt-5-mini: $0.25 / $2.00 (input/output) - Recommended for most translations
  • gpt-5-nano: $0.05 / $0.40 (input/output) - Most economical
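The rates above map naturally onto a small lookup table. A minimal sketch of what the `calculate_model_cost()` helper (added in `src/tinbox/core/cost.py`) might look like; the actual signature and structure in the PR may differ:

```python
# Illustrative sketch: GPT-5 pricing lookup, rates in USD per 1M tokens.
GPT5_PRICING = {
    "gpt-5": (1.25, 10.00),      # (input_rate, output_rate)
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}

def calculate_model_cost(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a request for a GPT-5 variant."""
    input_rate, output_rate = GPT5_PRICING[model_name]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

For example, a gpt-5-mini call with 1M input and 1M output tokens would cost $0.25 + $2.00 = $2.25.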

Files Changed

  • src/tinbox/core/cost.py: Added calculate_model_cost() function and GPT-5 pricing
  • src/tinbox/core/translation/litellm.py: Updated parameter handling and cost tracking
  • tests/test_gpt5_support.py: Comprehensive test suite with 18 GPT-5-specific tests
  • tests/test_core/test_translation/test_litellm.py: Updated cost assertions
  • CLAUDE.md: Enhanced documentation with GPT-5 usage examples
  • README.md: Added GPT-5 models section with pricing table

Usage Examples

# Recommended for most use cases
tinbox --model openai:gpt-5-mini --to es document.pdf

# For highest quality
tinbox --model openai:gpt-5 --to fr complex_report.pdf

# For maximum speed/economy
tinbox --model openai:gpt-5-nano --to de simple_doc.txt

Testing

  • ✅ 121/121 tests passing (18 new GPT-5 tests + 103 existing)
  • ✅ All GPT-5 variants tested (gpt-5, gpt-5-mini, gpt-5-nano)
  • ✅ Parameter handling verified (reasoning_effort, verbosity)
  • ✅ Cost calculation accuracy validated
  • ✅ Backward compatibility confirmed

Technical Details

GPT-5 models don't support the temperature, top_p, or logprobs parameters used by GPT-4. Instead, they use:

  • reasoning_effort: Controls thinking depth (set to "low" for faster translation)
  • verbosity: Controls output conciseness (set to "low" to avoid explanations)

The implementation automatically detects GPT-5 models and applies the correct parameters while excluding unsupported ones.
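The routing logic described above can be sketched roughly as follows (the function name `build_completion_params` and the default temperature are illustrative, not the PR's actual code):

```python
def build_completion_params(model_name: str, temperature: float = 0.3) -> dict:
    """Route request parameters by model family.

    GPT-5 models reject temperature/top_p/logprobs, so those are
    replaced with reasoning_effort and verbosity controls.
    """
    params = {"model": model_name}
    if model_name.startswith("gpt-5"):
        # GPT-5: thinking-depth and conciseness knobs instead of sampling ones.
        params["reasoning_effort"] = "low"
        params["verbosity"] = "low"
    else:
        # GPT-4 and other models keep the traditional sampling parameter.
        params["temperature"] = temperature
    return params
```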

tristan-mcinnis and others added 2 commits September 29, 2025 19:22
- Update LiteLLM translator to handle GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano)
- Exclude unsupported parameters (temperature, top_p, logprobs) for GPT-5 models
- Add GPT-5 pricing notes and documentation
- Update CLAUDE.md with GPT-5 usage examples
- Add GPT-5 model detection (gpt-5, gpt-5-mini, gpt-5-nano)
- Implement reasoning_effort and verbosity parameters for GPT-5
- Add accurate GPT-5 pricing and cost calculation
- Update token and cost tracking using actual LiteLLM response data
- Add comprehensive test suite with 18 GPT-5-specific tests
- Update CLAUDE.md and README.md with GPT-5 documentation

GPT-5 models automatically use optimized translation parameters:
- reasoning_effort="low" (faster processing, adequate quality)
- verbosity="low" (concise output without explanations)
- Temperature/top_p excluded (not supported by GPT-5)

Pricing (per 1M tokens):
- gpt-5: $1.25/$10.00 (input/output)
- gpt-5-mini: $0.25/$2.00 (recommended)
- gpt-5-nano: $0.05/$0.40 (most economical)

Test results: 121/121 tests passing (18 new + 103 existing)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 85 to 91
        if request.model == ModelType.OLLAMA:
            return f"ollama/{model_name}"
        elif request.model == ModelType.OPENAI:
            # Check if it's a GPT-5 model
            if model_name.startswith("gpt-5"):
                self._is_gpt5 = True
            return model_name  # OpenAI models use their names directly

P1: Reset GPT-5 flag per request

LiteLLMTranslator sets _is_gpt5 to True when a GPT‑5 model is requested, but the flag is never cleared when a different model is used later. If a translator instance handles a GPT‑5 request followed by GPT‑4 (or any non‑GPT‑5) request, it will still send reasoning_effort/verbosity and omit temperature, parameters that the non‑GPT‑5 models do not support. This can cause subsequent translations to fail with invalid-parameter errors or run without the caller’s configured temperature. The GPT‑5 detection should be recomputed for each request (e.g., reset the flag to False before checking or derive it locally) to avoid leaking state across translations.
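One way to apply the suggested fix is to assign the flag unconditionally at the start of each request, so a non-GPT-5 model clears it rather than inheriting stale state. An illustrative sketch (not the PR's actual class):

```python
class TranslatorSketch:
    """Illustrative: recompute the GPT-5 flag on every request."""

    def __init__(self) -> None:
        self._is_gpt5 = False

    def prepare_model(self, model_name: str) -> str:
        # Direct boolean assignment means a GPT-4 request after a GPT-5 one
        # resets the flag to False instead of keeping the leaked True.
        self._is_gpt5 = model_name.startswith("gpt-5")
        return model_name
```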


tristan-mcinnis and others added 2 commits November 13, 2025 23:19
- Remove reasoning_effort and verbosity parameters that caused GPT-5 to produce only reasoning tokens without output content
- Use max_completion_tokens instead of max_tokens for GPT-5 models
- Increase token limit (2x) for GPT-5 to accommodate both reasoning and output
- Add debug logging for troubleshooting translation issues
- Update .gitignore to exclude local translation files and macOS files

This fixes an issue where GPT-5 models would consume the entire token budget on internal reasoning without producing any translated output.
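The token-budget change described in this commit could look roughly like the following (a hedged sketch with an illustrative helper name; the real code in `litellm.py` may differ):

```python
def token_params_for(model_name: str, max_tokens: int) -> dict:
    """Choose the token-limit parameter for a model.

    GPT-5 spends part of its budget on internal reasoning tokens, so the
    limit is doubled and passed as max_completion_tokens; other models
    keep the conventional max_tokens parameter.
    """
    if model_name.startswith("gpt-5"):
        return {"max_completion_tokens": max_tokens * 2}
    return {"max_tokens": max_tokens}
```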

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add a modern, minimalist Electron-based graphical interface for Tinbox with the following features:

**Frontend (Electron + React + TypeScript)**
- Drag-and-drop file upload with visual feedback
- Real-time translation progress via WebSocket
- Batch processing queue with concurrent job management
- Settings modal with secure API key storage (Electron Store)
- Minimalist design with Tailwind CSS
- Full TypeScript type safety

**Backend (FastAPI)**
- REST API server with endpoints for translation, cost estimation, and job management
- WebSocket support for real-time progress updates
- Job queue system with configurable concurrency limits
- Model and language listing endpoints
- API key validation
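The job queue with configurable concurrency could be sketched with an asyncio semaphore (illustrative only; the FastAPI server's actual queue implementation may differ):

```python
import asyncio

class JobQueue:
    """Run translation jobs with a cap on how many execute concurrently."""

    def __init__(self, max_concurrent: int = 2) -> None:
        self._sem = asyncio.Semaphore(max_concurrent)
        self.completed: list[str] = []

    async def run_job(self, job_id: str) -> None:
        async with self._sem:  # waits here once the concurrency limit is hit
            await asyncio.sleep(0)  # stand-in for the real translation work
            self.completed.append(job_id)

async def main() -> list[str]:
    queue = JobQueue(max_concurrent=2)
    await asyncio.gather(*(queue.run_job(f"job-{i}") for i in range(5)))
    return queue.completed
```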

**Key Features**
- Supports PDF, DOCX, and TXT files
- Cost estimation before translation
- Real-time progress bars and status updates
- Persistent settings and API keys
- Clean, modern UI with smooth animations

**Installation**
- Python: `uv pip install -e ".[gui,pdf,docx]"`
- Node: `cd electron && npm install`
- Run: `npm run dev`

See QUICKSTART_GUI.md and electron/README.md for usage instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>