Solve math problems from educational videos using OCR + AI!
VideoMath Tutor is a smart Chrome Extension + Python backend that captures paused video frames, extracts math expressions using OCR (Pix2Tex or Tesseract), and solves them using Together AI for detailed step-by-step solutions.
- ⏸ Pause any video to auto-capture math problems
- 🔍 OCR via Pix2Tex CLI (preferred) or Tesseract fallback
- 🧠 AI-powered solving using Together AI (Mixtral / LLaMA)
- 🧾 Render LaTeX in an elegant KaTeX popup
- ✂ Copy, 🌐 Search, and ✅ Solve directly from overlay
- 💡 Built-in hint engine for learning context
- 🧲 Toggle extension on/off anytime
- User pauses a video
- Content script captures the video frame
- Frame sent to FastAPI backend at
/ocr/single - OCR returns LaTeX (Pix2Tex or Tesseract)
- User sees formatted math + options (solve, copy, hints)
- On clicking Solve,
/solvesends it to Together AI - Solution is cleaned and shown inline 🎯
# Clone the repo
git clone https://github.com/jayjain4554/VideoMath-Tutor.git
cd VideoMath-Tutor/backendCreate virtual environment
python -m venv venv
source venv/bin/activate # (Windows: venv\Scripts\activate)Install dependencies
pip install -r requirements.txtRun server
uvicorn main:app --reload🧠 Note:
- Ensure you have
Tesseractinstalled and in PATH - Install Pix2Tex CLI for better OCR accuracy
- Add your Together AI API key to the environment (
TOGETHER_API_KEY)
Server runs at
http://127.0.0.1:8000
- Go to
chrome://extensions - Enable Developer Mode
- Click Load Unpacked
- Select the
extension/folder from this repo
📌 Then click the extension icon and Activate Extension from popup
VideoMath-Tutor/
├── backend/
│ ├── main.py # FastAPI app with OCR + Together AI solve
│ ├── requirements.txt # Python dependencies
│ ├── .env # TOGETHER_API_KEY (secure)
│ └── ocr_engine/ # Optional: Pix2Tex local installation
│
├── extension/
│ ├── content.js # Core script to capture & display results
│ ├── manifest.json # Chrome extension setup
│ ├── popup.html/.js/.css # UI to toggle extension
│ └── katex.min.js/.css # For math rendering
The backend uses the Together AI Inference API to solve LaTeX equations via models like:
- 🔸
mistralai/Mixtral-8x7B-Instruct-v0.1(default) - ✳️ Easy to upgrade to LLaMA 3 or GPT-NeoX
It returns clean, step-by-step solutions that are parsed into human-readable output using a custom LaTeX cleaner.
- 📢 Add LaTeX-to-speech for accessibility
- 🤖 Plug into Wolfram Alpha API for verified math
- 📈 Build a learning dashboard and insights tracker
- 🔍 Interactive step-by-step explanation viewer
Made with ❤️ by Jay Jain Feel free to contribute, suggest, or reach out for improvements.
Learn math from videos — smarter than ever. 🧠🎬➕




