Converting handwritten formulas to LaTeX
-
References:
-
Download the dataset of 85 different handwritten math symbols from Kaggle -> folder 'extracted_images'
-
Create a dataset of mathematical formulas with the generator
-
Extracting 23 symbols ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-', '+', '=', 'leq', 'neq', 'geq', 'alpha', 'beta', 'lambda', 'lt', 'gt', 'x', 'y'] from the Kaggle dataset
-
Normalization of symbols -> folder normalized
• Removing the border
• Scale to at most 40 × 40
• Center the mass in a 48 × 48 image
• Subtract mean and divide by standard deviation
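
The four normalization steps above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: all function names are hypothetical, the nearest-neighbour resize is a stand-in for whatever resampling the project uses, and the input is assumed to be a 2-D float array with ink as positive values on a zero background (invert the image first if it is dark ink on white). Treating "at most 40 × 40" as "longer side becomes exactly 40" is also an assumption. The mean and std are assumed to come from train_images_mean.npy and train_images_std.npy.

```python
import numpy as np

def crop_to_content(img, threshold=0.0):
    """Remove the empty border: keep the bounding box of pixels above threshold."""
    mask = img > threshold
    if not mask.any():
        return img
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def scale_to_fit(img, max_side=40):
    """Nearest-neighbour resize so the longer side is max_side pixels (aspect ratio kept)."""
    h, w = img.shape
    scale = max_side / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    row_idx = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    col_idx = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[np.ix_(row_idx, col_idx)]

def center_by_mass(img, canvas_side=48):
    """Paste img into a zero canvas so its center of mass sits near the canvas center."""
    h, w = img.shape
    total = img.sum()
    if total == 0:
        cy, cx = (h - 1) / 2, (w - 1) / 2
    else:
        cy = (np.arange(h) * img.sum(axis=1)).sum() / total
        cx = (np.arange(w) * img.sum(axis=0)).sum() / total
    center = (canvas_side - 1) / 2
    # Clamp the offsets so the symbol always stays fully inside the canvas.
    top = min(max(int(round(center - cy)), 0), canvas_side - h)
    left = min(max(int(round(center - cx)), 0), canvas_side - w)
    canvas = np.zeros((canvas_side, canvas_side), dtype=np.float32)
    canvas[top:top + h, left:left + w] = img
    return canvas

def standardize(img, mean, std):
    """Subtract the training-set mean and divide by the training-set std."""
    std = np.where(np.asarray(std) == 0, 1.0, std)  # avoid division by zero
    return (img - mean) / std

def normalize_symbol(img, mean, std):
    """Run the four steps in order on one symbol image."""
    return standardize(center_by_mass(scale_to_fit(crop_to_content(img))), mean, std)
```

In use, `mean` and `std` would be loaded once with `np.load` and passed to `normalize_symbol` for every symbol image. Whether those files hold scalars or per-pixel arrays is not stated here; the sketch works with either.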
-
Combining symbols into simple formulas -> folder formulas
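
One simple way to combine normalized symbols is to place them side by side and record the matching LaTeX string. The sketch below assumes equally tall symbol images (for example 48 × 48) and a plain left-to-right layout. The names `LATEX_TOKENS`, `to_latex` and `compose_formula` are hypothetical, and the project's generator may lay out symbols differently.

```python
import numpy as np

# Symbol names that need a different LaTeX token; all others map to themselves.
LATEX_TOKENS = {
    "leq": r"\leq", "neq": r"\neq", "geq": r"\geq",
    "alpha": r"\alpha", "beta": r"\beta", "lambda": r"\lambda",
    "lt": "<", "gt": ">",
}

def to_latex(symbols):
    """Turn a list of symbol names into a space-separated LaTeX string."""
    return " ".join(LATEX_TOKENS.get(name, name) for name in symbols)

def compose_formula(symbol_images):
    """Concatenate equally tall symbol images horizontally into one formula image."""
    return np.concatenate(symbol_images, axis=1)
```

For example, `to_latex(["2", "x", "+", "alpha", "leq", "7"])` gives `2 x + \alpha \leq 7`, and three 48 × 48 images passed to `compose_formula` give a 48 × 144 image.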
-
Available files: source
• train_images_std.npy
• train_images_mean.npy
• Latex/Latex.py
• Seq2SeqModel/Seq2SeqModel.py
-
Generate LaTeX sequences from filenames for the sequence-to-sequence model -> output files: oseq_n.npy, iseq_n.npy, files.json
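
As an illustration of what such sequence files might contain, the sketch below encodes a token list into two padded integer sequences in the usual teacher-forcing layout: a decoder input that starts with a start token, and a target that ends with an end token. Everything here is an assumption: the special tokens, the id assignment, the function name `encode`, and the reading of iseq as decoder input and oseq as target. The step that recovers the tokens from a filename is left out because the filename format is not described here.

```python
import numpy as np

# Hypothetical vocabulary: three special tokens followed by the 23 symbols.
SPECIAL = ["<PAD>", "<GO>", "<END>"]
SYMBOLS = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-', '+', '=',
           'leq', 'neq', 'geq', 'alpha', 'beta', 'lambda', 'lt', 'gt', 'x', 'y']
TOKEN_TO_ID = {tok: i for i, tok in enumerate(SPECIAL + SYMBOLS)}
PAD, GO, END = (TOKEN_TO_ID[t] for t in SPECIAL)

def encode(tokens, max_len):
    """Return (decoder_input, target) as int arrays padded to max_len."""
    ids = [TOKEN_TO_ID[t] for t in tokens]
    if len(ids) + 1 > max_len:
        raise ValueError("max_len is too small for this formula")
    padding = [PAD] * (max_len - len(ids) - 1)
    decoder_input = np.array([GO] + ids + padding, dtype=np.int64)
    target = np.array(ids + [END] + padding, dtype=np.int64)
    return decoder_input, target
```

Stacking the arrays for all formulas and saving them with `np.save` would then give one file per sequence type.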
-
-> output files: drive
-
Train and test the seq2seq model: https://github.com/Wikunia/HE2LaTeX/blob/master/Seq2Seq.ipynb -> Accuracy on test set ~ 62%
-
To be continued