adding some more edits to code to handle synth data
tumble-weed committed Jun 29, 2020
1 parent ad2e125 commit 85cef8a
Showing 17 changed files with 453 additions and 20 deletions.
53 changes: 52 additions & 1 deletion README.md
# Adapting-OCR
PyTorch implementation of our paper [Adapting OCR with limited labels](http://cdn.iiit.ac.in/cdn/cvit.iiit.ac.in/images/ConferencePapers/2020/AdaptingOCR_Deepayan_DAS2020_final.pdf)

![](images/QualResults.png)

## Dependency

* This work was tested with PyTorch 1.2.0, CUDA 9.0, Python 3.6, and Ubuntu 16.04.
* Requirements are listed in `env.txt`.
* Create the environment from that file with `conda create -n pytorch1.4 --file env.txt`.
* Activate it with `source activate pytorch1.4`.

## Training

* Supervised training

`python -m train --name exp1 --path path/to/data`

* Main arguments
  * `--name`: creates a directory where checkpoints will be stored
  * `--path`: path to the dataset
  * `--imgdir`: directory name of the dataset


* Semi-supervised training

`python -m train_semi_supervised --name exp1 --path path --source_dir src_dirname --target_dir tgt_dirname --schedule --noise --alpha=1`

* Main arguments
  * `--name`: creates a directory where checkpoints will be stored
  * `--path`: path to the datasets
  * `--source_dir`: labelled data directory on which the OCR model was trained
  * `--target_dir`: unlabelled data directory to which we want to adapt the OCR model
  * `--percent`: percentage of unlabelled data to include in self-training
  * `--schedule`: enables the STLR (slanted triangular learning rate) scheduler during training
  * `--train_on_pred`: treats top predictions as targets (pseudo-labels)
  * `--noise`: adds Gaussian noise to images during training
  * `--alpha`: set to 1 to enable the mixup criterion
  * `--combine_scoring`: also takes into account the scores output by a language model

**Note**: `--combine_scoring` works only with line images, not word images.
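The `--alpha` flag above enables a mixup-style criterion. As an illustrative sketch only (function names and data layout are hypothetical, not this repo's API): mixup blends two inputs with a weight drawn from a Beta(α, α) distribution and takes the same convex combination of the two targets' losses.

```python
import random

def mixup_batch(x1, x2, alpha=1.0, lam=None):
    """Blend two flattened inputs element-wise with weight lam ~ Beta(alpha, alpha)."""
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    return mixed, lam

def mixup_loss(loss1, loss2, lam):
    """Combine the losses w.r.t. the two original targets with the same weight."""
    return lam * loss1 + (1 - lam) * loss2
```

With `alpha=1`, Beta(1, 1) is uniform on [0, 1], so the blend weight is sampled uniformly each batch.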

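The `--schedule` flag enables an STLR scheduler. A minimal sketch of the slanted triangular learning rate formula from Howard & Ruder's ULMFiT (the default hyperparameters below are illustrative; the repo's actual values may differ): the rate climbs linearly for a short `cut_frac` of training, then decays linearly toward `lr_max / ratio`.

```python
def stlr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular LR at step t: short linear warm-up, long linear decay."""
    cut = int(total_steps * cut_frac)  # step at which the peak is reached
    if t < cut:
        p = t / cut                    # rising phase: 0 -> 1
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # falling phase: 1 -> 0
    return lr_max * (1 + p * (ratio - 1)) / ratio
```

The rate peaks at `lr_max` exactly at step `cut` and starts/ends at `lr_max / ratio`.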
* Data
  * Use [trdg](https://github.com/Belval/TextRecognitionDataGenerator) to generate synthetic data. The script for data generation is included in `scrips/generate_data.sh`.
  * Download two different fonts and keep the data for each font in the source and target directories.
  * Use one of the fonts to train a model from scratch in a supervised manner.
  * Then fine-tune the trained model on the target data using semi-supervised learning.
  * A sample lexicon is provided in `words.txt`. Download a different lexicon as needed.
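The `--percent` and `--train_on_pred` flags together describe a self-training loop: run the current model on the unlabelled target images, keep only the most confident predictions as pseudo-labels, and retrain on them. A sketch of the selection step (the function name and the `(image_id, predicted_text, confidence)` layout are assumptions for illustration, not this repo's interfaces):

```python
def select_pseudo_labels(predictions, percent=50):
    """Keep the top `percent`% most confident predictions as pseudo-labels.

    predictions: list of (image_id, predicted_text, confidence) tuples.
    """
    ranked = sorted(predictions, key=lambda p: p[2], reverse=True)
    k = max(1, int(len(ranked) * percent / 100))
    return ranked[:k]
```

Thresholding on confidence rank rather than a fixed cutoff keeps the pseudo-labelled set size predictable as the model's calibration drifts during adaptation.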





156 changes: 156 additions & 0 deletions env.txt
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
_tflow_select=2.3.0=mkl
absl-py=0.9.0=py37hc8dfbb8_1
astor=0.7.1=py_0
blas=2.12=openblas
brotlipy=0.7.0=py37h8f50634_1000
bzip2=1.0.8=h516909a_0
c-ares=1.15.0=h516909a_1001
ca-certificates=2019.9.11=hecc5488_0
cairo=1.16.0=hfb77d84_1002
certifi=2019.9.11=py37_0
cffi=1.12.3=py37h2e261b9_0
chardet=3.0.4=py37hc8dfbb8_1006
cloudpickle=1.3.0=py_0
cryptography=2.8=py37h72c5cf5_1
cudatoolkit=10.0.130=0
cycler=0.10.0=py_1
cytoolz=0.10.1=py37h7b6447c_0
dask-core=2.10.1=py_0
dbus=1.13.6=he372182_0
decorator=4.4.1=py_0
expat=2.2.5=he1b5a44_1003
ffmpeg=4.1.3=h167e202_0
fire=0.2.1=py_0
fontconfig=2.13.1=h86ecdb6_1001
freetype=2.10.0=he983fc9_1
gast=0.2.2=py_0
gettext=0.19.8.1=hc5be6a0_1002
giflib=5.1.9=h516909a_0
glib=2.58.3=h6f030ca_1002
gmp=6.1.2=hf484d3e_1000
gnutls=3.6.5=hd3a4fd2_1002
graphite2=1.3.13=hf484d3e_1000
grpcio=1.23.0=py37he9ae1f9_0
gst-plugins-base=1.14.5=h0935bb2_0
gstreamer=1.14.5=h36ae1b5_0
h5py=2.10.0=nompi_py37h513d04c_102
harfbuzz=2.4.0=h9f30f68_3
hdf5=1.10.5=nompi_h3c11f04_1103
icu=64.2=he1b5a44_1
idna=2.9=py_1
intel-openmp=2019.4=243
jasper=1.900.1=h07fcdf6_1006
jpeg=9c=h14c3975_1001
kiwisolver=1.1.0=py37hc9558a2_0
krb5=1.16.3=h05b26f9_1001
lame=3.100=h14c3975_1001
libblas=3.8.0=12_openblas
libcblas=3.8.0=12_openblas
libclang=8.0.1=hc9558a2_0
libcurl=7.65.3=hda55be3_0
libedit=3.1.20181209=hc058e9b_0
libffi=3.2.1=hd88cf55_4
libgcc-ng=9.1.0=hdf63c60_0
libgfortran-ng=7.3.0=hdf63c60_0
libiconv=1.15=h516909a_1005
liblapack=3.8.0=12_openblas
liblapacke=3.8.0=12_openblas
libllvm8=8.0.1=hc9558a2_0
libopenblas=0.3.7=h6e990d7_1
libpng=1.6.37=hed695b0_0
libprotobuf=3.8.0=h8b12597_0
libsodium=1.0.17=h516909a_0
libssh2=1.8.2=h22169c7_2
libstdcxx-ng=9.1.0=hdf63c60_0
libtiff=4.0.10=h57b8799_1003
libuuid=2.32.1=h14c3975_1000
libwebp=1.0.2=h576950b_1
libxcb=1.13=h14c3975_1002
libxkbcommon=0.8.4=h516909a_0
libxml2=2.9.9=hea5a465_1
libxslt=1.1.32=h31b3aaa_1004
lxml=4.4.1=py37h7ec2d77_0
lz4-c=1.8.3=he1b5a44_1001
markdown=3.2.1=py_0
matplotlib=3.1.1=py37_1
matplotlib-base=3.1.1=py37he7580a8_1
mkl=2019.4=243
mkl-service=2.0.2=py37h7b6447c_0
mkl_fft=1.0.14=py37h516909a_1
mkl_random=1.0.4=py37hf2d7682_0
mock=3.0.5=py37hc8dfbb8_1
ncurses=6.1=he6710b0_1
nettle=3.4.1=h1bed415_1002
networkx=2.4=py_0
ninja=1.9.0=py37hfd86e86_0
nspr=4.20=hf484d3e_1000
nss=3.45=he751ad9_0
numpy=1.16.4=py37h99e49ec_0
numpy-base=1.16.4=py37h2f8d375_0
olefile=0.46=py37_0
opencv=4.1.1=py37ha799480_1
openh264=1.8.0=hdbcaa40_1000
openssl=1.1.1c=h516909a_0
opt_einsum=3.2.0=py_0
pcre=8.41=hf484d3e_1003
pillow=6.2.1=py37h34e0f95_0
pip=19.2.2=py37_0
pixman=0.38.0=h516909a_1003
protobuf=3.8.0=py37he1b5a44_2
pthread-stubs=0.4=h14c3975_1001
pycparser=2.19=py37_0
pyopenssl=19.1.0=py_1
pyparsing=2.4.2=py_0
pyqt=5.9.2=py37hcca6a23_4
pysocks=1.7.1=py37hc8dfbb8_1
python=3.7.4=h265db76_1
python-dateutil=2.8.0=py_0
python-levenshtein=0.12.0=py37h516909a_1001
python_abi=3.7=1_cp37m
pytorch=1.4.0=py3.7_cuda10.0.130_cudnn7.6.3_0
pywavelets=1.1.1=py37h7b6447c_0
pyzmq=19.0.1=py37hac76be4_0
qt=5.9.7=h0c104cb_3
readline=7.0=h7b6447c_5
requests=2.23.0=pyh8c360ce_2
scipy=1.4.1=py37habc2bb6_0
setuptools=41.0.1=py37_0
sip=4.19.8=py37hf484d3e_1000
six=1.12.0=py37_0
speechrecognition=3.6.3=py37_1000
sqlite=3.29.0=h7b6447c_0
tensorflow-base=1.15.0=mkl_py37he1670d9_0
tensorflow-estimator=1.15.1=pyh2649769_0
termcolor=1.1.0=py_2
tk=8.6.9=hed695b0_1003
toolz=0.10.0=py_0
torchfile=0.1.0=py_0
torchvision=0.5.0=py37_cu100
tornado=6.0.3=py37h516909a_0
tqdm=4.35.0=py_0
urllib3=1.25.9=py_0
visdom=0.1.8.9=0
websocket-client=0.57.0=py37hc8dfbb8_1
werkzeug=0.16.1=py_0
wheel=0.33.4=py37_0
wrapt=1.12.1=py37h8f50634_1
x264=1!152.20180806=h14c3975_0
xorg-kbproto=1.0.7=h14c3975_1002
xorg-libice=1.0.10=h516909a_0
xorg-libsm=1.2.3=h84519dc_1000
xorg-libx11=1.6.8=h516909a_0
xorg-libxau=1.0.9=h14c3975_0
xorg-libxdmcp=1.1.3=h516909a_0
xorg-libxext=1.3.4=h516909a_0
xorg-libxrender=0.9.10=h516909a_1002
xorg-renderproto=0.11.1=h14c3975_1002
xorg-xextproto=7.3.0=h14c3975_1002
xorg-xproto=7.0.31=h14c3975_1007
xz=5.2.4=h14c3975_4
zeromq=4.3.2=he1b5a44_2
zlib=1.2.11=h7b6447c_3
zstd=1.4.0=h3b9ef0a_0