adding some more edits to code to handle synth data
tumble-weed committed Jun 29, 2020
1 parent ad2e125 commit 85cef8a
Showing 17 changed files with 453 additions and 20 deletions.
53 changes: 52 additions & 1 deletion README.md
# Adapting-OCR
PyTorch implementation of our paper [Adapting OCR with limited labels](http://cdn.iiit.ac.in/cdn/cvit.iiit.ac.in/images/ConferencePapers/2020/AdaptingOCR_Deepayan_DAS2020_final.pdf)

![](images/QualResults.png)

## Dependency

* This work was tested with PyTorch 1.2.0, CUDA 9.0, Python 3.6, and Ubuntu 16.04.
* Requirements are listed in `env.txt`.
* Create the environment from that file with `conda create -n pytorch1.4 --file env.txt`.
* Activate it with `source activate pytorch1.4`.

## Training

* Supervised training

`python -m train --name exp1 --path path/to/data`

* Main arguments
  * `--name`: creates a directory where checkpoints will be stored
  * `--path`: path to the dataset
  * `--imgdir`: directory name of the dataset


* Semi-supervised training

`python -m train_semi_supervised --name exp1 --path path --source_dir src_dirname --target_dir tgt_dirname --schedule --noise --alpha=1`

* Main arguments
  * `--name`: creates a directory where checkpoints will be stored
  * `--path`: path to the datasets
  * `--source_dir`: labelled data directory on which the OCR model was trained
  * `--target_dir`: unlabelled data directory to which we want to adapt the OCR model
  * `--percent`: percentage of unlabelled data to include in self-training
  * `--schedule`: enables the STLR (slanted triangular learning rate) scheduler during training
  * `--train_on_pred`: treats top predictions as targets (pseudo-labels)
  * `--noise`: adds Gaussian noise to images during training
  * `--alpha`: set to 1 to enable the mixup criterion
  * `--combine_scoring`: also takes into account the scores output by a language model

**Note**: `--combine_scoring` works only with line images, not word images.
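The `--alpha` flag above enables a mixup-style criterion. As an illustrative sketch only (function names and data layout are hypothetical, not this repo's API): mixup blends two inputs with a weight drawn from a Beta(α, α) distribution and takes the same convex combination of the two targets' losses.

```python
import random

def mixup_batch(x1, x2, alpha=1.0, lam=None):
    """Blend two flattened inputs element-wise with weight lam ~ Beta(alpha, alpha)."""
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    return mixed, lam

def mixup_loss(loss1, loss2, lam):
    """Combine the losses w.r.t. the two original targets with the same weight."""
    return lam * loss1 + (1 - lam) * loss2
```

With `alpha=1`, Beta(1, 1) is uniform on [0, 1], so the blend weight is sampled uniformly each batch.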

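The `--schedule` flag enables an STLR scheduler. A minimal sketch of the slanted triangular learning rate formula from Howard & Ruder's ULMFiT (the default hyperparameters below are illustrative; the repo's actual values may differ): the rate climbs linearly for a short `cut_frac` of training, then decays linearly toward `lr_max / ratio`.

```python
def stlr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular LR at step t: short linear warm-up, long linear decay."""
    cut = int(total_steps * cut_frac)  # step at which the peak is reached
    if t < cut:
        p = t / cut                    # rising phase: 0 -> 1
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # falling phase: 1 -> 0
    return lr_max * (1 + p * (ratio - 1)) / ratio
```

The rate peaks at `lr_max` exactly at step `cut` and starts/ends at `lr_max / ratio`.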
* Data
  * Use [trdg](https://github.com/Belval/TextRecognitionDataGenerator) to generate synthetic data. The script for data generation is included in `scrips/generate_data.sh`.
  * Download two different fonts and keep the data for each font in the source and target directories.
  * Use one of the fonts to train a model from scratch in a supervised manner.
  * Then fine-tune the trained model on the target data using semi-supervised learning.
  * A sample lexicon is provided in `words.txt`. Download a different lexicon as needed.
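The `--percent` and `--train_on_pred` flags together describe a self-training loop: run the current model on the unlabelled target images, keep only the most confident predictions as pseudo-labels, and retrain on them. A sketch of the selection step (the function name and the `(image_id, predicted_text, confidence)` layout are assumptions for illustration, not this repo's interfaces):

```python
def select_pseudo_labels(predictions, percent=50):
    """Keep the top `percent`% most confident predictions as pseudo-labels.

    predictions: list of (image_id, predicted_text, confidence) tuples.
    """
    ranked = sorted(predictions, key=lambda p: p[2], reverse=True)
    k = max(1, int(len(ranked) * percent / 100))
    return ranked[:k]
```

Thresholding on confidence rank rather than a fixed cutoff keeps the pseudo-labelled set size predictable as the model's calibration drifts during adaptation.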





156 changes: 156 additions & 0 deletions env.txt
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
_tflow_select=2.3.0=mkl
absl-py=0.9.0=py37hc8dfbb8_1
astor=0.7.1=py_0
blas=2.12=openblas
brotlipy=0.7.0=py37h8f50634_1000
bzip2=1.0.8=h516909a_0
c-ares=1.15.0=h516909a_1001
ca-certificates=2019.9.11=hecc5488_0
cairo=1.16.0=hfb77d84_1002
certifi=2019.9.11=py37_0
cffi=1.12.3=py37h2e261b9_0
chardet=3.0.4=py37hc8dfbb8_1006
cloudpickle=1.3.0=py_0
cryptography=2.8=py37h72c5cf5_1
cudatoolkit=10.0.130=0
cycler=0.10.0=py_1
cytoolz=0.10.1=py37h7b6447c_0
dask-core=2.10.1=py_0
dbus=1.13.6=he372182_0
decorator=4.4.1=py_0
expat=2.2.5=he1b5a44_1003
ffmpeg=4.1.3=h167e202_0
fire=0.2.1=py_0
fontconfig=2.13.1=h86ecdb6_1001
freetype=2.10.0=he983fc9_1
gast=0.2.2=py_0
gettext=0.19.8.1=hc5be6a0_1002
giflib=5.1.9=h516909a_0
glib=2.58.3=h6f030ca_1002
gmp=6.1.2=hf484d3e_1000
gnutls=3.6.5=hd3a4fd2_1002
graphite2=1.3.13=hf484d3e_1000
grpcio=1.23.0=py37he9ae1f9_0
gst-plugins-base=1.14.5=h0935bb2_0
gstreamer=1.14.5=h36ae1b5_0
h5py=2.10.0=nompi_py37h513d04c_102
harfbuzz=2.4.0=h9f30f68_3
hdf5=1.10.5=nompi_h3c11f04_1103
icu=64.2=he1b5a44_1
idna=2.9=py_1
intel-openmp=2019.4=243
jasper=1.900.1=h07fcdf6_1006
jpeg=9c=h14c3975_1001
kiwisolver=1.1.0=py37hc9558a2_0
krb5=1.16.3=h05b26f9_1001
lame=3.100=h14c3975_1001
libblas=3.8.0=12_openblas
libcblas=3.8.0=12_openblas
libclang=8.0.1=hc9558a2_0
libcurl=7.65.3=hda55be3_0
libedit=3.1.20181209=hc058e9b_0
libffi=3.2.1=hd88cf55_4
libgcc-ng=9.1.0=hdf63c60_0
libgfortran-ng=7.3.0=hdf63c60_0
libiconv=1.15=h516909a_1005
liblapack=3.8.0=12_openblas
liblapacke=3.8.0=12_openblas
libllvm8=8.0.1=hc9558a2_0
libopenblas=0.3.7=h6e990d7_1
libpng=1.6.37=hed695b0_0
libprotobuf=3.8.0=h8b12597_0
libsodium=1.0.17=h516909a_0
libssh2=1.8.2=h22169c7_2
libstdcxx-ng=9.1.0=hdf63c60_0
libtiff=4.0.10=h57b8799_1003
libuuid=2.32.1=h14c3975_1000
libwebp=1.0.2=h576950b_1
libxcb=1.13=h14c3975_1002
libxkbcommon=0.8.4=h516909a_0
libxml2=2.9.9=hea5a465_1
libxslt=1.1.32=h31b3aaa_1004
lxml=4.4.1=py37h7ec2d77_0
lz4-c=1.8.3=he1b5a44_1001
markdown=3.2.1=py_0
matplotlib=3.1.1=py37_1
matplotlib-base=3.1.1=py37he7580a8_1
mkl=2019.4=243
mkl-service=2.0.2=py37h7b6447c_0
mkl_fft=1.0.14=py37h516909a_1
mkl_random=1.0.4=py37hf2d7682_0
mock=3.0.5=py37hc8dfbb8_1
ncurses=6.1=he6710b0_1
nettle=3.4.1=h1bed415_1002
networkx=2.4=py_0
ninja=1.9.0=py37hfd86e86_0
nspr=4.20=hf484d3e_1000
nss=3.45=he751ad9_0
numpy=1.16.4=py37h99e49ec_0
numpy-base=1.16.4=py37h2f8d375_0
olefile=0.46=py37_0
opencv=4.1.1=py37ha799480_1
openh264=1.8.0=hdbcaa40_1000
openssl=1.1.1c=h516909a_0
opt_einsum=3.2.0=py_0
pcre=8.41=hf484d3e_1003
pillow=6.2.1=py37h34e0f95_0
pip=19.2.2=py37_0
pixman=0.38.0=h516909a_1003
protobuf=3.8.0=py37he1b5a44_2
pthread-stubs=0.4=h14c3975_1001
pycparser=2.19=py37_0
pyopenssl=19.1.0=py_1
pyparsing=2.4.2=py_0
pyqt=5.9.2=py37hcca6a23_4
pysocks=1.7.1=py37hc8dfbb8_1
python=3.7.4=h265db76_1
python-dateutil=2.8.0=py_0
python-levenshtein=0.12.0=py37h516909a_1001
python_abi=3.7=1_cp37m
pytorch=1.4.0=py3.7_cuda10.0.130_cudnn7.6.3_0
pywavelets=1.1.1=py37h7b6447c_0
pyzmq=19.0.1=py37hac76be4_0
qt=5.9.7=h0c104cb_3
readline=7.0=h7b6447c_5
requests=2.23.0=pyh8c360ce_2
scipy=1.4.1=py37habc2bb6_0
setuptools=41.0.1=py37_0
sip=4.19.8=py37hf484d3e_1000
six=1.12.0=py37_0
speechrecognition=3.6.3=py37_1000
sqlite=3.29.0=h7b6447c_0
tensorflow-base=1.15.0=mkl_py37he1670d9_0
tensorflow-estimator=1.15.1=pyh2649769_0
termcolor=1.1.0=py_2
tk=8.6.9=hed695b0_1003
toolz=0.10.0=py_0
torchfile=0.1.0=py_0
torchvision=0.5.0=py37_cu100
tornado=6.0.3=py37h516909a_0
tqdm=4.35.0=py_0
urllib3=1.25.9=py_0
visdom=0.1.8.9=0
websocket-client=0.57.0=py37hc8dfbb8_1
werkzeug=0.16.1=py_0
wheel=0.33.4=py37_0
wrapt=1.12.1=py37h8f50634_1
x264=1!152.20180806=h14c3975_0
xorg-kbproto=1.0.7=h14c3975_1002
xorg-libice=1.0.10=h516909a_0
xorg-libsm=1.2.3=h84519dc_1000
xorg-libx11=1.6.8=h516909a_0
xorg-libxau=1.0.9=h14c3975_0
xorg-libxdmcp=1.1.3=h516909a_0
xorg-libxext=1.3.4=h516909a_0
xorg-libxrender=0.9.10=h516909a_1002
xorg-renderproto=0.11.1=h14c3975_1002
xorg-xextproto=7.3.0=h14c3975_1002
xorg-xproto=7.0.31=h14c3975_1007
xz=5.2.4=h14c3975_4
zeromq=4.3.2=he1b5a44_2
zlib=1.2.11=h7b6447c_3
zstd=1.4.0=h3b9ef0a_0