The goal of this project is to create a learning-based system that takes an image of a math formula and returns the corresponding LaTeX code. As a physics student, I often find myself writing down LaTeX code from a reference image. I wanted to streamline my workflow and began looking into solutions, but besides the freemium Mathpix I could not find anything ready-to-use that runs locally. That's why I decided to create it myself.
- Download/Clone this repository
- For now you need to install the Python dependencies specified in `requirements.txt` (look further down)
- Download the `weights.pth` file from my Google Drive and place it in the `checkpoints` directory
The `pix2tex.py` file offers a quick way to get the model prediction for an image. First, copy the formula image to the clipboard, for example with a snipping tool (on Windows, the built-in `Win`+`Shift`+`S`). Next, call the script with `python pix2tex.py`. It will print the predicted LaTeX code for that image and also copy it to your clipboard.
Note: As of right now it works best with fairly small images, so don't zoom in all the way before taking a picture. Also, multiline equations are not working properly, as they are underrepresented in the training data.
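To illustrate the note about image size, here is a minimal sketch of shrinking a clipboard image before it is passed to a model. It is not the code in `pix2tex.py`: the helper names, the use of Pillow's `ImageGrab.grabclipboard()`, and the 672x192 size limit are all assumptions chosen for illustration.

```python
def fit_within(width, height, max_width=672, max_height=192):
    """Return (w, h) scaled down to fit a box, preserving the aspect ratio.

    The default box size is a hypothetical placeholder, not a value taken
    from this project. Images that already fit are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Never upscale: the scale factor is capped at 1.0.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def grab_clipboard_image():
    """Fetch an image from the clipboard and shrink it if it is too large.

    Assumes Pillow is installed; it is imported lazily so that fit_within
    can be used without it.
    """
    from PIL import Image, ImageGrab

    img = ImageGrab.grabclipboard()
    # grabclipboard() may return None or a list of file names instead.
    if not isinstance(img, Image.Image):
        raise RuntimeError("clipboard does not contain an image")
    new_size = fit_within(*img.size)
    if new_size != img.size:
        img = img.resize(new_size)
    return img
```

For example, `fit_within(1344, 384)` halves both dimensions, while an image that is already small enough keeps its size.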
We need paired data for the network to learn. Luckily, there is a lot of LaTeX code on the internet, e.g. on Wikipedia and arXiv. We also use the formulae from the im2latex-100k dataset.
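As a rough illustration of how formulae might be collected from LaTeX sources, the sketch below pulls math out of a source string with a regular expression. This is an assumption about one possible approach, not the extraction code used in this repository; `extract_formulae` is a hypothetical helper, and it only handles `$...$`, `$$...$$`, `\[...\]` and the `equation`/`align` environments.

```python
import re

# One pattern with four alternatives, tried in this order at each position:
# environments, \[ ... \], $$ ... $$, and finally inline $ ... $.
MATH_PATTERN = re.compile(
    r"\\begin\{(?P<env>equation|align)\*?\}(?P<a>.+?)\\end\{(?P=env)\*?\}"
    r"|\\\[(?P<b>.+?)\\\]"
    r"|\$\$(?P<c>.+?)\$\$"
    r"|(?<!\\)\$(?P<d>(?:\\.|[^$\\])+?)\$",
    re.DOTALL,
)


def extract_formulae(tex_source):
    """Return the unique formulae found in tex_source, in order of appearance."""
    seen = set()
    formulae = []
    for match in MATH_PATTERN.finditer(tex_source):
        body = (match.group("a") or match.group("b")
                or match.group("c") or match.group("d"))
        # Collapse runs of whitespace so identical formulae compare equal.
        body = " ".join(body.split())
        if body and body not in seen:
            seen.add(body)
            formulae.append(body)
    return formulae


if __name__ == "__main__":
    sample = r"Energy $E = mc^2$ and \[ a^2 + b^2 = c^2 \] for \$5."
    print(extract_formulae(sample))
```

A regular expression like this does not handle comments, macros, or nested environments, so real sources would need further cleaning.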
Latin Modern Math, GFSNeohellenicMath.otf, Asana Math, XITS Math, Cambria Math
- PyTorch (tested on v1.7.0)
- Python 3.7+ & dependencies (`requirements.txt`): `pip install -r requirements.txt`
To render the math in many different fonts, we use XeLaTeX to generate a PDF and finally convert it to a PNG. For the last step we need some third-party tools:
- XeLaTeX
- ImageMagick with Ghostscript
- Node.js to run KaTeX
- `de-macro` >= 1.4
- Python 3.7+ & dependencies (`requirements.txt`)
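The render step (XeLaTeX to PDF, then PDF to PNG) could look roughly like the sketch below. This is not the project's dataset generation code: the template, the function names, the density of 200, and the exact command-line options are assumptions. It also assumes that `xelatex` and ImageMagick's `convert` are on the `PATH` (ImageMagick 7 names the binary `magick`) and that the chosen math font is installed.

```python
import subprocess
from pathlib import Path

# Minimal document: unicode-math selects the math font under XeLaTeX.
TEMPLATE = r"""\documentclass[12pt]{article}
\usepackage{amsmath}
\usepackage{unicode-math}
\setmathfont{<FONT>}
\pagestyle{empty}
\begin{document}
\[ <FORMULA> \]
\end{document}
"""


def build_document(formula, font="Latin Modern Math"):
    """Return the LaTeX source for one formula typeset in the given math font."""
    return TEMPLATE.replace("<FONT>", font).replace("<FORMULA>", formula)


def build_commands(tex_path, density=200):
    """Return the (xelatex, convert) command lists for a .tex file."""
    tex_path = Path(tex_path)
    pdf_path = tex_path.with_suffix(".pdf")
    png_path = tex_path.with_suffix(".png")
    xelatex = [
        "xelatex",
        "-interaction=nonstopmode",
        "-halt-on-error",
        f"-output-directory={tex_path.parent}",
        str(tex_path),
    ]
    # -density sets the rasterization resolution; -trim crops the margins.
    convert = ["convert", "-density", str(density), str(pdf_path),
               "-trim", "+repage", str(png_path)]
    return xelatex, convert


def render(formula, font, workdir, name="formula"):
    """Write the .tex file, run both tools, and return the PNG path."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    tex_path = workdir / f"{name}.tex"
    tex_path.write_text(build_document(formula, font), encoding="utf-8")
    for command in build_commands(tex_path):
        subprocess.run(command, check=True)
    return tex_path.with_suffix(".png")
```

Calling `render(r"\frac{a}{b}", "XITS Math", "out")` would then write `out/formula.png`, provided the external tools and the font are available.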
- support handwritten formulae
- reduce model size
- create a standalone application
Contributions of any kind are welcome.
Code taken and modified from lucidrains, rwightman, im2markup, arxiv_leaks