🎮 Real-time game subtitle translator with AI-powered OCR. Context-aware translation for 20+ languages. Free offline models + dirt cheap APIs. Perfect for gaming in foreign languages!

tomkam1702/OCR-Translator


Game-Changing Translator

Copyright © 2025 Tomasz Kamiński


Discover why this is the world's first AI-OCR game translator with our comprehensive showcase featuring a video clip, competitive comparisons, and all the innovative features! 🚀

Overview

Game-Changing Translator is a powerful desktop application that automatically captures text from any area of your screen, performs optical character recognition (OCR), and translates the text in real time. It creates floating overlay windows that can be positioned anywhere on your screen, making it perfect for translating games, videos, PDFs, or any application with text that you can't easily copy and paste.

This project was inspired by a family member who was learning French by playing games with French subtitles and needed real-time translation. I hope it will be useful to gamers and non-gamers alike for casual on-screen translation, whether you're learning a new language through entertainment or simply need to understand content in a foreign language.

This application was developed, or rather vibe-coded, with the support of the following AI models: Claude 3.7 Sonnet, Claude Sonnet 4 and Gemini 2.5 Pro.

Game-Changing Translator Gallery

🎬 See the Magic in Action!

🎮 Featured Game Demonstrations

  • 🧙‍♂️ The Witcher 3: Revolutionary AI OCR and natural Polish-to-English translation
  • ⚔️ Kingdom Come: Deliverance II: Czech-to-English translation
  • 🌌 Star Wars: The Old Republic: French-to-English translation

🚀 What These Demos Showcase:

  • ✨ Gemini AI OCR Technology handling complex backgrounds and stylised fonts that break traditional OCR engines
  • 🧠 Context-aware translation maintaining narrative coherence across different game dialogues
  • ⚡ Real-time performance with floating overlays that work seamlessly over any game interface
  • 🎯 Superior accuracy where competitors fail on challenging subtitle scenarios
  • 🌍 Multi-language support demonstrating various language pairs in gaming contexts

Experience the world's first AI-OCR game translator that actually works with difficult gaming content across multiple game genres!


Key Features

🚀 NEW in Version 3.7.0

  • 🎯 DeepL Translation with Context - A Game-Changing First! 🌟: Revolutionary context-aware DeepL translation that intelligently remembers up to 3 previous subtitles for dramatically improved grammar, accuracy, and natural conversational flow ✨
    • 🎁 Completely FREE Feature - Context usage doesn't count towards DeepL's character limits! Unlimited context-aware translations at no extra cost 💰
    • 📊 Dedicated logging system in DeepL_Translation_Long_Log.txt for detailed translation tracking and quality verification
    • 🎮 Perfect for dialogue-heavy games and narrative content where pronouns, grammatical agreement and meaning depend on previous context
  • Updated Gemini Model Support: Latest Gemini models with enhanced performance characteristics
    • Gemini 2.5 Flash (gemini-2.5-flash-preview-09-2025): Good translation quality with lower latency
    • Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite-preview-09-2025): Optimised for OCR operations with reduced latency
    • Gemini 2.0 Flash nevertheless remains the recommended choice for translation quality, and its output cost is more than six times lower than that of Gemini 2.5 Flash
  • Experimental Line Break Preservation: Optional feature to maintain original line breaks in OCR and translation
    • Works best with Gemini models for consistent results
    • Less reliable with GPT models and DeepL
    • Useful for specific dialogue scenes but recommended to keep disabled for most scenarios
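
The "remembers up to 3 previous subtitles" behaviour described above can be pictured as a small sliding window. The following is a minimal sketch, not the application's actual implementation: the class name `SubtitleContext` and its methods are hypothetical, and how the context string is finally handed to the translation engine is left out.

```python
from collections import deque


class SubtitleContext:
    """Hypothetical helper: keeps the most recent subtitles as translation context."""

    def __init__(self, max_items=3):
        # A deque with maxlen drops the oldest entry automatically when full
        self._history = deque(maxlen=max_items)

    def add(self, subtitle):
        subtitle = subtitle.strip()
        # Skip empty text and immediate repeats (the same subtitle captured twice)
        if subtitle and (not self._history or self._history[-1] != subtitle):
            self._history.append(subtitle)

    def as_context(self):
        # Oldest subtitle first, one subtitle per line
        return "\n".join(self._history)


ctx = SubtitleContext(max_items=3)
for line in [
    "Where is she?",
    "She left an hour ago.",
    "Did she say anything?",
    "Only that she'd be back.",
]:
    ctx.add(line)

# Only the three most recent subtitles remain in the window
print(ctx.as_context())
```

A bounded deque keeps the window logic to a single line, and the repeat check matters for OCR-driven input, where the same subtitle is typically captured several times while it stays on screen.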

🚀 NEW in Version 3.6.0

  • OpenAI Integration - More Choice, More Power!: The application now includes support for OpenAI's powerful language models, offering a flexible and high-quality alternative to existing AI services.

    • Three Versatile Models: Access to a curated selection of OpenAI's best models:
      • GPT-5 Nano
      • GPT-4.1 Mini
      • GPT-4.1 Nano
    • Dual Functionality: All supported OpenAI models can be used for both high-quality Translation and advanced, vision-capable OCR, giving you the freedom to choose the best tool for the job.

    OpenAI Support Interface

    Experience the intuitive OpenAI integration interface - easily switch between models and configure both OCR and translation services with the same or different powerful API providers

    • Be sure to check out the comparison between OpenAI and Gemini models in the updated user manual.

🚀 NEW in Version 3.5.8

  • Auto-Update System: Complete auto-update functionality accessible through the "Check for Updates" button in the About tab
    • GitHub API integration for checking latest releases automatically
    • Simple one-click update process with progress tracking
    • Safe update application with user file preservation
    • Staging system for secure update downloads and installation
  • Enhanced Typography Customisation: Target Window Font Type selection allowing users to choose any Windows system font for translation display
    • Expanded beyond the previous default Arial font limitation
    • Access to complete Windows font library for personalised translation appearance
    • Improved readability options to suit individual preferences and visual requirements
  • Advanced Opacity Controls: Independent transparency management for translation windows
    • Separate background and text opacity sliders providing independent control (0.0 to 1.0 range)
    • Replaced previous fixed 0.85 opacity preset with flexible user-defined transparency levels
    • Enhanced visual integration allowing translation windows to blend seamlessly with any background content
  • Improved Subtitle Display: Enhanced formatting of conversational subtitles with proper line breaks for better readability in dialogue-heavy content
    • Previous format: - How are you? - Fine.
    • New format with proper line separation:
          - How are you?
          - Fine.                 
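
One way to obtain the new format shown above is to break the text wherever a dash follows sentence-ending punctuation. This is only a sketch under that assumption; the function name `split_dialogue` and the exact rule are hypothetical and may differ from what the application does.

```python
import re

# A dash that follows sentence-ending punctuation plus whitespace starts a new speaker turn
_TURN_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=-\s)")


def split_dialogue(subtitle):
    """Put each speaker turn on its own line, but only if the text starts with a dash."""
    text = subtitle.strip()
    if not text.startswith("- "):
        return text  # not a dialogue subtitle, leave it untouched
    return _TURN_BOUNDARY.sub("\n", text)


print(split_dialogue("- How are you? - Fine."))
```

Requiring punctuation before the dash keeps mid-sentence dashes such as "- Wait - what?" on a single line.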
      

🚀 NEW in Version 3.5.7

  • Persian Language Support: Added comprehensive Persian language support to both Gemini API and Google Translate translation services

  • Right-to-Left (RTL) Text Processing: Enhanced text display formatting for RTL languages with improved punctuation positioning and proper text flow

    Persian Subtitles Example

    Experience seamless Persian subtitle translation with proper RTL text formatting - perfect for Persian media content and language learning

🚀 NEW in Version 3.5.3

  • Multiple Gemini Models Available: Flexible model selection for OCR and translation operations with dynamic configuration
    • Gemini 2.5 Flash-Lite: Recommended for speed - optimal for fast-changing subtitles (< 1 second display time)

    • Gemini 2.0 Flash: Recommended for quality - superior OCR accuracy and most idiomatic translations for longer subtitles (1+ seconds)

    • Comprehensive OCR Testing: Curious which models handle challenging subtitles most effectively and consistently?

      Spanish OCR Test Sample

      View detailed test results showing Gemini 2.0 models' superior accuracy across multilingual content

    • Dynamic Model Configuration: Models available for selection are controlled by resources/gemini_models.csv for easy customization

    • Performance-Optimized Selection: Choose different models for OCR vs translation based on your specific needs

  • Enhanced API Integration: Migration from google.generativeai to google.genai library for improved performance and stability
  • Optimized Threading: Enhanced OCR and translation thread performance for faster processing and reduced latency
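
To illustrate how a CSV file can drive the list of selectable models, here is a minimal sketch. The column names (`api_name`, `display_name`, `ocr`, `translation`), the placeholder model names and the yes/no flags are all assumptions made for this example; the real layout of resources/gemini_models.csv may be different.

```python
import csv
import io

# Hypothetical file contents, for illustration only
SAMPLE_CSV = """api_name,display_name,ocr,translation
example-fast-model,Example Fast Model,yes,yes
example-quality-model,Example Quality Model,yes,yes
example-text-model,Example Text Model,no,yes
"""


def load_models(csv_file, task):
    """Return (api_name, display_name) pairs for models enabled in the given task column."""
    reader = csv.DictReader(csv_file)
    return [
        (row["api_name"], row["display_name"])
        for row in reader
        if row.get(task, "").strip().lower() == "yes"
    ]


# Separate lists allow a different model to be offered for OCR and for translation
ocr_models = load_models(io.StringIO(SAMPLE_CSV), "ocr")
translation_models = load_models(io.StringIO(SAMPLE_CSV), "translation")
print(ocr_models)
print(translation_models)
```

Keeping the model list in a data file means a new model can be offered by editing one row rather than changing code.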

🚀 NEW in Version 3.0.0

  • Gemini OCR - Premium Text Recognition: Revolutionary AI-powered OCR that delivers exceptional accuracy for challenging subtitle scenarios where traditional OCR engines struggle
    • Superior OCR quality with outstanding cost-to-quality ratio using advanced Gemini models

    • Challenging Screenshot Examples: See the dramatic difference in quality:

      OCR Comparison Example 1

      Tesseract OCR Result: ~ Trust me, OD tite WE loca mS
      Gemini OCR Result: Trust me, Oakmonters know a newcomer when they see one. We locals can tell.

      OCR Comparison Example 2

      Tesseract OCR Result: ' Paulie: Driv: show, Tom. Next stop's Bi the motel. 7 jj ie
      Gemini OCR Result: Paulie: Drive before the cops show, Tom. Next stop's Bill at the motel.

    • Professional Results: Handles low-contrast text, stylized fonts, and dynamic backgrounds that confuse traditional OCR

    • Cost-Effective Excellence: ~$0.00004 per subtitle screenshot - 37.5 times cheaper than Google Cloud Vision API while delivering superior results

    • Unique Gaming Translation Solution: First-of-its-kind integration of premium AI OCR with real-time game subtitle translation

    • Special Cost Estimation: Dedicated API Usage tab with rough cost estimates and comprehensive usage monitoring

  • Extended Context Window: Expanded sliding history window from 2 to 5 previous subtitles for enhanced translation quality
    • Better Context Awareness: Improved narrative coherence and grammatical consistency across longer conversations
    • Enhanced Asian Language Support: Extended context particularly beneficial for languages that rely heavily on contextual understanding
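
The cost figures quoted above can be turned into a rough session estimate. The sketch below uses only the two numbers from the text (about $0.00004 per screenshot and the 37.5 times ratio); the session length and capture rate are assumptions chosen for illustration, and the function name `session_cost` is hypothetical.

```python
GEMINI_COST_PER_SCREENSHOT = 0.00004  # USD, approximate figure quoted above
PRICE_RATIO = 37.5                    # "37.5 times cheaper", as quoted above

# Per-image price implied for the comparison service by the two figures above
implied_comparison_cost = GEMINI_COST_PER_SCREENSHOT * PRICE_RATIO


def session_cost(hours, screenshots_per_minute, cost_per_screenshot):
    """Rough OCR cost in USD for a play session at a steady capture rate."""
    return hours * 60 * screenshots_per_minute * cost_per_screenshot


# Assumed workload: 2 hours of play with 20 OCR requests per minute
print(f"Gemini OCR: ${session_cost(2, 20, GEMINI_COST_PER_SCREENSHOT):.4f}")
print(f"Comparison: ${session_cost(2, 20, implied_comparison_cost):.4f}")
```

Under these assumed numbers the session involves 2,400 screenshots, which comes to roughly $0.10 with Gemini OCR against about $3.60 at the implied comparison price. Actual costs depend on image size, model pricing and how often the subtitle area changes.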

Core Features

  • Screen Area Selection: Define custom regions for text capture and translation display
  • Real-time Translation: Automatically detects and translates text as it changes
  • Multiple Translation Engines:
    • Gemini 2.5 Flash-Lite API
    • MarianMT (offline neural machine translation)
    • DeepL API
    • Google Translate API
  • Multilingual User Interface: Full support for English and Polish interface languages
  • Floating Overlays: Translucent, movable windows that stay on top of other applications
  • Customizable Appearance: Adjust colours, fonts, and transparency
  • Image Preprocessing: Various modes to improve OCR accuracy
  • Hotkey Support: Control the application without switching windows
  • Translation Caching: Reduce API calls and improve performance
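
Translation caching can be sketched as a small least-recently-used store: a repeated subtitle is answered from memory instead of triggering another API call. This is an illustrative sketch only; the class `TranslationCache`, its key layout and the stand-in `fake_translate` function are hypothetical and not taken from the application's code.

```python
from collections import OrderedDict


class TranslationCache:
    """Hypothetical LRU cache keyed by (text, source language, target language, engine)."""

    def __init__(self, max_entries=1000):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    def get_or_translate(self, text, source_lang, target_lang, engine, translate_fn):
        key = (text.strip(), source_lang, target_lang, engine)
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as recently used
            return self._entries[key]
        result = translate_fn(text)            # the only place a real API call would happen
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result


calls = []


def fake_translate(text):
    # Stand-in for a real engine call; records each invocation
    calls.append(text)
    return text.upper()


cache = TranslationCache(max_entries=2)
cache.get_or_translate("bonjour", "fr", "en", "demo", fake_translate)
cache.get_or_translate("bonjour", "fr", "en", "demo", fake_translate)
print(len(calls))  # the second lookup is served from the cache
```

Including the language pair and engine in the key avoids returning a translation that was produced under different settings.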

Ready-to-Use Compiled Version

🎮 Perfect for Gamers and Non-Technical Users!

If you want to start using Game-Changing Translator immediately without installing Python or dealing with dependencies, we've prepared a ready-to-use compiled version for you:

Quick Start Options

📥 Download from Releases

  1. Download both files.
  2. Run the .exe installer file to unpack and install the application to your preferred folder.
  3. Install Tesseract OCR (one-time requirement).
  4. Launch the application by running GameChangingTranslator.exe from your installation folder.
  5. Experience premium AI OCR and enhanced context translation! 🤖

Need Help?

📖 Installation Guides:

Apart from the one-time Tesseract OCR installation, the compiled versions include everything you need - no Python installation required!

Installation

Prerequisites

  • Windows operating system
  • Tesseract OCR installed
  • Python 3.7 or newer

Setup

  1. Clone this repository:

    git clone https://github.com/tomkam1702/OCR-Translator.git
    
  2. Install required Python packages:

    pip install -r requirements.txt
    
  3. Run the application:

    python main.py
    

For detailed installation instructions, see the Installation Guide.

Quick Start

  1. Launch the application
  2. Click "Select Source Area (OCR)" and drag to select the text area you want to translate
  3. Click "Select Target Area (Translation)" and drag to select where you want the translation to appear
  4. Configure your preferred translation method in the Settings tab
  5. Click "Start" to begin translation
  6. Press the ~ key to toggle translation on/off

For more detailed usage instructions, see the User Manual.

Documentation

Development Status

This project is considered feature-complete. Small changes may or may not be made in the future, but generally no active development is planned. If you wish to add features or make changes, the best approach is to fork the repository and develop it further yourself.

Licence

This project is free software, licensed under the GNU General Public License version 3 (GPLv3).

You can:

  • Use the software for any purpose
  • Change the software to suit your needs
  • Share the software and your changes with others

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY. See the LICENCE file for complete details.

Acknowledgments

Contributing

Please note that this project is considered feature-complete. If you wish to make substantial changes, please consider forking the repository instead.

⚠️ FORKING NOTICE: This project requires attribution to the original author. Please read ATTRIBUTION.md before forking or using this code.
