A sophisticated, local-first desktop translation tool for Linux that provides real-time, context-aware translation of on-screen text. Built with PyQt6, Tesseract OCR, and LibreTranslate, LiveTrans brings professional AR-style translation capabilities to your desktop—entirely offline and privacy-respecting.
No cloud. No data leaks. Just pure local translation.
- Real-Time Translation: Seamlessly translate text visible on your screen in a fraction of a second
- Context-Aware: Groups spatially-related words before translation to preserve meaning (critical for languages like Japanese)
- Multi-Language Support: English, French, German, Spanish, Japanese, and more
- Privacy-First: All processing happens locally—your screen data never leaves your machine
- Universal Compatibility: Works on any Linux window (VS Code, browsers, PDF readers, terminals, etc.)
- Hierarchical Layout Analysis: YOLO-like spatial mapping clusters words into semantic blocks
- Recursive Feedback Prevention: Intelligent "blink-and-capture" prevents the overlay from translating itself
- Optimized Image Processing: Grayscale conversion and 2x upscaling for pixel-perfect OCR accuracy
- Multi-Threaded Pipeline: Separate UI and worker threads keep your interface buttery-smooth
- Smart Caching: Translations are cached to minimize redundant processing and CPU usage
- Coordinate Remapping: Translatable text boxes land exactly where the original text appears
First, update your package manager and install the required system libraries:
```bash
sudo apt-get update
sudo apt-get install -y python3-venv python3-dev build-essential cmake \
  tesseract-ocr libtesseract-dev tesseract-ocr-eng tesseract-ocr-fra \
  tesseract-ocr-deu tesseract-ocr-spa tesseract-ocr-jpn
```

LiveTrans requires LibreTranslate running as a background service. Start it in a separate terminal before launching the application.
Quick Start (Japanese & English only):

```bash
libretranslate --load-only ja,en --port 5000
```

This will download the required language models on first run.

Recommended (Multi-Language Support): For broader language support, use:

```bash
libretranslate --load-only ja,en,zh,ko,fr,es --port 5000
```

This loads models for: Japanese, English, Chinese, Korean, French, and Spanish.
Note: The first run will download language models (may take a few minutes depending on your internet speed). Subsequent runs will start quickly.
1. Clone or navigate to the project directory:

   ```bash
   cd /path/to/Linux-Translator
   ```

2. Create a Python virtual environment:

   ```bash
   python3 -m venv venv
   ```

3. Activate the virtual environment:

   ```bash
   source venv/bin/activate
   ```

4. Install Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```
Ensure LibreTranslate is running in a separate terminal before launching the application:

```bash
# Terminal 1: Start LibreTranslate (runs indefinitely)
libretranslate --load-only ja,en,zh,ko,fr,es --port 5000
```

Once LibreTranslate is active and the virtual environment is activated, start LiveTrans in another terminal:

```bash
# Terminal 2: Start LiveTrans
source venv/bin/activate
python src/main.py
```

A translucent overlay window will appear on your screen, ready to capture and translate on-screen text.
1. Ensure LibreTranslate is Running:
   - Keep it running in a separate terminal (it will run until you stop it)
   - You should see output like "WARNING: Running in debug mode. This is not recommended for production."

2. Launch the Application:

   ```bash
   source venv/bin/activate
   python src/main.py
   ```

3. Position the Overlay:
   - Click and drag to move the translation window
   - Click and drag the edges to resize it to the area you want to monitor

4. Select Source & Target Languages:
   - Use the dropdown menus to choose your source language (e.g., Japanese)
   - Choose your target language (e.g., English)

5. Watch the Magic:
   - The application continuously captures your selected area
   - Text is detected via OCR (Tesseract)
   - Translations appear in real-time within the overlay
   - Hover over translations to see the original text (if applicable)

6. Stop the Application:
   - Close the overlay window or press `Ctrl+C` in the terminal
   - Keep LibreTranslate running in its terminal for future use (or stop it with `Ctrl+C` when done)
```
┌─────────────────────────────────────┐
│  Main Thread (UI / PyQt6)           │
│  • Manages ViewportWindow           │
│  • Receives mouse drag/resize input │
│  • Draws translated overlay         │
│  • Emits frame_captured signal      │
└────────────────┬────────────────────┘
                 │  (signals)
                 ▼
┌─────────────────────────────────────┐
│  Worker Thread (OCR/Translation)    │
│  • Awaits frame_captured signal     │
│  • Performs Tesseract OCR           │
│  • Groups words by spatial layout   │
│  • Calls LibreTranslate             │
│  • Emits finished signal            │
└─────────────────────────────────────┘
```
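The two-thread split can be sketched framework-agnostically with Python's `threading` and `queue` modules. This is a minimal illustration, not the app's actual PyQt6 signal/slot code; `process_frame` is a hypothetical stand-in for the OCR + translation work:

```python
import queue
import threading

def process_frame(frame):
    # Hypothetical stand-in for Tesseract OCR + LibreTranslate work.
    return f"translated:{frame}"

def worker(frames: queue.Queue, results: queue.Queue):
    # Mirrors the worker thread: block until the UI hands over a frame,
    # process it, and emit the result back (None acts as a shutdown signal).
    while (frame := frames.get()) is not None:
        results.put(process_frame(frame))

frames, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(frames, results), daemon=True)
t.start()

# Main thread plays the UI role: emit "frame_captured", await "finished".
frames.put("frame-1")
print(results.get())  # translated:frame-1
frames.put(None)      # shut the worker down
t.join()
```

The key property, as in the real app, is that the producing thread never blocks on OCR or translation; it only enqueues frames and consumes finished results.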
- PyQt6 window displaying the translation overlay
- Handles mouse interactions for window movement/resizing
- Triggers the "blink" mechanism to prevent feedback loops
- Leverages Tesseract's `image_to_data()` for word-level detection
- Performs hierarchical layout analysis by grouping words via block/paragraph/line IDs
- Applies pre-processing: grayscale conversion and 2x cubic upscaling
- Remaps coordinates from the upscaled image back to the original dimensions
- Manages LibreTranslate API calls
- Caches translations to avoid redundant processing
- Joins words contextually (no spaces for CJK languages, spaces for others)
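The contextual join can be sketched as follows. This is a minimal illustration; `join_words` and the no-space language set are assumptions, not the module's actual names:

```python
# Languages written without spaces between words (illustrative set only).
NO_SPACE_LANGS = {"ja", "zh"}

def join_words(words: list[str], lang: str) -> str:
    """Join OCR'd words into one translation unit.

    Scripts written without word spacing are concatenated directly;
    space-delimited languages get a single space between words.
    """
    sep = "" if lang in NO_SPACE_LANGS else " "
    return sep.join(words)

print(join_words(["日", "本", "語"], "ja"))   # 日本語
print(join_words(["hello", "world"], "en"))   # hello world
```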
- Orchestrates the multi-threaded workflow
- Manages signal/slot connections between UI and Worker threads
- Implements the "blink-and-capture" mechanism
To prevent the overlay from translating itself (recursive feedback):

1. UI sets `hide_temp = True` → overlay becomes invisible
2. UI repaints the screen (now clean, showing only source text)
3. mss takes a screenshot of the pure original text
4. UI sets `hide_temp = False` → overlay reappears

All of this happens in ~50 milliseconds: imperceptible to humans, but clean for the OCR engine.
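A simplified sketch of that sequence, where `overlay` and `grab_screen` are hypothetical stand-ins for the PyQt6 window and the mss capture call (the real app coordinates this through Qt's event loop):

```python
def blink_and_capture(overlay, grab_screen):
    """Hide the overlay, grab a clean frame, then restore the overlay.

    `overlay` needs hide()/show() methods; `grab_screen` returns a frame.
    """
    overlay.hide()         # hide_temp = True: overlay vanishes
    # In the real app the UI repaints here before the screenshot is taken.
    frame = grab_screen()  # capture sees only the original source text
    overlay.show()         # hide_temp = False: overlay reappears
    return frame

class FakeOverlay:
    """Test double standing in for the PyQt6 overlay window."""
    visible = True
    def hide(self): self.visible = False
    def show(self): self.visible = True

ov = FakeOverlay()
frame = blink_and_capture(ov, lambda: "clean-frame")
print(frame, ov.visible)  # clean-frame True
```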
Instead of treating text as individual words, we group them by:
- block_num: Major content blocks
- par_num: Paragraphs within blocks
- line_num: Lines within paragraphs
This ensures that Japanese text "日本語" is translated as a single semantic unit, not three separate characters with different meanings.
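That grouping can be sketched over the dict shape pytesseract's `image_to_data(..., output_type=Output.DICT)` returns. `group_words` is a hypothetical helper and the sample dict is hand-made:

```python
from collections import defaultdict

def group_words(data: dict) -> dict:
    """Group OCR words by (block, paragraph, line) so each line is
    translated as one semantic unit rather than word by word."""
    groups = defaultdict(list)
    for i, word in enumerate(data["text"]):
        if word.strip():  # skip Tesseract's empty placeholder entries
            key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
            groups[key].append(word)
    return dict(groups)

# Hand-made sample in pytesseract's Output.DICT shape.
sample = {
    "text":      ["Hello", "world", "", "Bye"],
    "block_num": [1, 1, 1, 2],
    "par_num":   [1, 1, 1, 1],
    "line_num":  [1, 1, 1, 1],
}
print(group_words(sample))
# {(1, 1, 1): ['Hello', 'world'], (2, 1, 1): ['Bye']}
```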
```
Original Screen Capture
        ▼
[Grayscale Conversion]  →  Remove color noise
        ▼
[2x Cubic Upscaling]    →  Sharpen character edges
        ▼
[Tesseract OCR]         →  Detect words + coordinates
        ▼
[Coordinate Remap]      →  Convert back to original scale (÷ 2)
        ▼
[Layout Analysis]       →  Group by block/paragraph/line
        ▼
[LibreTranslate]        →  Translate grouped words
        ▼
[Cache & Display]       →  Show translation in overlay
```
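The remap step is plain arithmetic: boxes detected on the 2x-upscaled image are divided by the same factor to land back on the original capture. A minimal sketch (`remap_box` is a hypothetical helper, not the module's actual name):

```python
def remap_box(box: tuple[int, int, int, int], scale: float = 2.0) -> tuple[int, ...]:
    """Map an (x, y, w, h) box from the upscaled image back to
    original-capture coordinates by dividing by the upscale factor."""
    return tuple(round(v / scale) for v in box)

# A word detected at (200, 100) sized 80x40 on the 2x image
# sits at (100, 50) sized 40x20 on the original capture.
print(remap_box((200, 100, 80, 40)))  # (100, 50, 40, 20)
```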
OCR Languages (via Tesseract):
- English (`eng`)
- French (`fra`)
- German (`deu`)
- Spanish (`spa`)
- Japanese (`jpn`)
- Additional languages can be installed
Translation Languages (via LibreTranslate):
- English, French, German, Spanish, Japanese, Chinese, Korean, Russian, Portuguese, Italian, Dutch, Polish, Turkish, and more
Edit the source files to customize:
- Overlay opacity and colors in `viewport.py`
- OCR processing parameters (upscale factor, grayscale) in `ocr.py`
- Translation cache size and source mapping in `translator.py`
- Capture interval and thread priorities in `main.py`
- ✅ Zero Cloud Dependency: All processing is local
- ✅ No Data Transmission: Your screen data never leaves your machine
- ✅ Self-Hosted Translation Engine: LibreTranslate runs locally on `localhost:5000`
- ✅ Open Source: Inspect the code yourself
- ✅ Offline-Capable: After models are downloaded, works entirely offline
Data Flow: Screenshots → Tesseract OCR (local) → LibreTranslate (local) → UI Display. No external API calls.
| Metric | Value |
|---|---|
| Frame Capture Latency | ~20-50ms |
| OCR Processing | ~50-150ms (depends on text density) |
| Translation Round-Trip | ~100-300ms (cached: <1ms) |
| UI Responsiveness | Real-time (separate thread) |
| Memory Footprint | ~150-300 MB (varies by overlay size) |
| CPU Usage | Minimal when idle; scales with text density |
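The sub-millisecond cached path comes from skipping the HTTP round-trip entirely. A minimal sketch of such a cache using `functools.lru_cache` (the keying scheme and `fake_libretranslate` are illustrative assumptions, not the module's actual code):

```python
from functools import lru_cache

def fake_libretranslate(text, source, target):
    # Hypothetical stand-in for the real POST to the local LibreTranslate server.
    return f"[{source}->{target}] {text}"

@lru_cache(maxsize=1024)
def translate_cached(text: str, source: str, target: str) -> str:
    # First call per (text, source, target) performs the "request";
    # repeats are served from the in-process cache in well under 1 ms.
    return fake_libretranslate(text, source, target)

translate_cached("日本語", "ja", "en")  # miss: performs the "request"
translate_cached("日本語", "ja", "en")  # hit: returned from cache
print(translate_cached.cache_info())    # shows hits=1, misses=1
```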
```bash
libretranslate --load-only ja,en,zh,ko,fr,es --port 5000
```

Ensure this command is running in a separate terminal. If the port is already in use:

```bash
libretranslate --load-only ja,en,zh,ko,fr,es --port 5001
```

(Then update the port in `translator.py` accordingly.)

Ensure you installed the system dependencies:

```bash
sudo apt-get install tesseract-ocr libtesseract-dev
```

- Increase the upscale factor in `ocr.py` (default is 2x)
- Ensure the captured area has good contrast
- Try adjusting the grayscale threshold
- Check that PyQt6 is installed: `pip install PyQt6`
- Verify your display server supports the windowing system
- Verify LibreTranslate is running: `curl http://localhost:5000/` (should return an HTML response)
- Check that the port matches your configuration
- Review logs in the console for error messages
- Ensure required language models are loaded
- Reduce the size of the monitored area
- Disable features temporarily to isolate the bottleneck
- Check system RAM availability
- Ensure LibreTranslate isn't overloaded with requests
- Vertical Text Support: Toggle for vertical OCR (ideal for manga/traditional Japanese literature)
- Smart Inpainting: Replace black boxes with OpenCV inpaint/blur using surrounding colors
- Alpha-Blended Boxes: Transparent or rounded-corner styling for modern aesthetics
- Custom Styling: User-configurable fonts, colors, and background effects
- Multi-Window Support: Translate multiple windows simultaneously
- Hotkey Bindings: Keyboard shortcuts for language switching and pause/resume
To extend LiveTrans:
- Set up your development environment (see Installation)
- Review the architecture in Architecture Overview
- Make your changes in the appropriate module (`ocr.py`, `translator.py`, `viewport.py`, etc.)
- Test with the provided test suite: `python -m pytest tests/`
- Submit improvements or bug reports!
See LICENSE for details.
- Tesseract OCR: Open-source optical character recognition engine
- LibreTranslate: Free, open-source machine translation API
- mss: Lightweight screen capture library
- PyQt6: Cross-platform GUI toolkit
Encountered an issue? Here's how to debug:
1. Enable verbose logging:

   ```bash
   python src/main.py --verbose
   ```

2. Check dependency versions:

   ```bash
   pip list
   tesseract --version
   ```

3. Inspect the logs: Check console output for detailed error messages

4. Review test cases: See `tests/` for examples of OCR and translation workflows
Happy translating! 🌍