Skip to content
/ TATT Public
forked from mjq11302010044/TATT

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

License

Notifications You must be signed in to change notification settings

Tinyriver/TATT

 
 

Repository files navigation

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

https://arxiv.org/abs/2203.09388

Jianqi Ma, Zhetong Liang, Lei Zhang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China & OPPO Research

Recovering TextZoom samples

TATT visualization

Environment:

python pytorch cuda numpy

Other possible python packages like pyyaml, cv2, Pillow and imgaug

Main idea

The pipeline

TP Interpreter

Configure your training

Download the pretrained recognizer from:

Aster: https://github.com/ayumiymk/aster.pytorch  
MORAN:  https://github.com/Canjie-Luo/MORAN_v2  
CRNN: https://github.com/meijieru/crnn.pytorch

Unzip the codes and walk into the 'TATT_ROOT/', place the pretrained weights from recognizer in 'TATT_ROOT/'.

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Train the corresponding model (e.g. TPGSR-TSRN):

chmod a+x train_TATT.sh
./train_TATT.sh

Run the test-prefixed shell to test the corresponding model.

Adding '--go_test' in the shell file

Cite this paper:

@article{ma2021text,
title={A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution},
author={Ma, Jianqi and Zhetong, Liang and Zhang, Lei},
journal={},
year={2022}
}

About

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.9%
  • Shell 0.1%