This is the official implementation of the paper *GDB: Gated Convolutions-based Document Binarization*.
This repository also comprehensively collects the datasets that may be used for document binarization.
Below is a summary table of these datasets, along with links to download them.
- Python >= 3.7
- torch >= 1.7.0
- torchvision >= 0.8.0
 
Note: the pre-processing code is not provided yet, but it is on the way.
You can download the datasets from the links below and put them in the datasets_ori folder.
When evaluating performance on the DIBCO2019 dataset,
first gather all datasets except DIBCO2019 and place them in the img and gt folders under the datasets_ori directory.
Then crop the images and ground-truth images into 256 × 256 patches and place them in the img and gt folders under the datasets/DIBCO2019 directory.
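Since the official pre-processing script is not released yet, the cropping step above can be sketched as follows. This is a minimal NumPy-only sketch: a real pipeline would load images with OpenCV or Pillow, and the white border padding is an assumption, not a choice stated by the paper.

```python
import numpy as np

def crop_patches(img, size=256):
    """Split a grayscale image (H x W array) into non-overlapping
    size x size patches, padding the bottom/right borders with
    white (255) so every patch is full-sized (padding value is an
    assumption)."""
    h, w = img.shape
    padded = np.pad(img, ((0, -h % size), (0, -w % size)),
                    constant_values=255)
    return [padded[y:y + size, x:x + size]
            for y in range(0, padded.shape[0], size)
            for x in range(0, padded.shape[1], size)]

# Example: a 300 x 500 page is padded to 512 x 512 and yields
# a 2 x 2 grid of 256 x 256 patches.
page = np.full((300, 500), 200, dtype=np.uint8)
patches = crop_patches(page)
```

The same function would be applied to both the img and gt images so that patches stay aligned.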
Next, use Otsu's thresholding method to binarize the images
under datasets/img and place the results in the datasets/otsu folder.
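For reference, Otsu's method picks the single global threshold that maximizes the between-class variance of the gray-level histogram. A self-contained NumPy sketch of the idea:

```python
import numpy as np

def otsu_threshold(img):
    # Classic Otsu: scan all 256 candidate thresholds and keep the one
    # maximizing the between-class variance w0 * w1 * (m0 - m1)^2.
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    w0 = sum0 = 0.0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]            # pixel count of the "dark" class so far
        if w0 == 0:
            continue
        w1 = total - w0          # pixel count of the "bright" class
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Example: dark text (value 30) on a light page (value 200).
page = np.full((100, 100), 200, dtype=np.uint8)
page[20:40, 20:80] = 30
t = otsu_threshold(page)
binary = np.where(page > t, 255, 0).astype(np.uint8)
```

In practice, with OpenCV installed, the equivalent one-liner is `cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)`.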
Use the Sobel operator to process the images under datasets/img
and place the results in the datasets/sobel folder.
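Likewise, the Sobel step above produces a gradient-magnitude edge map for each image. A NumPy-only sketch (the exact normalization the paper uses is not specified here, so none is applied):

```python
import numpy as np

def sobel_magnitude(img):
    # Gradient magnitude from the two 3x3 Sobel kernels; edge-replicate
    # padding keeps the output the same size as the input.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    f = img.astype(np.float64)
    padded = np.pad(f, 1, mode="edge")
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    for i in range(3):           # accumulate each kernel tap
        for j in range(3):
            win = padded[i:i + f.shape[0], j:j + f.shape[1]]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy)

# Example: a vertical black/white step edge gives strong responses
# along the boundary columns and zero response in flat regions.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255
edges = sobel_magnitude(img)
```

With OpenCV, `cv2.Sobel(img, cv2.CV_64F, 1, 0)` and `cv2.Sobel(img, cv2.CV_64F, 0, 1)` compute gx and gy directly.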
With these preprocessing steps completed,
pass ./datasets/img as the --dataRoot argument to train.py and begin training.
To train:

```shell
python train.py
```

To test:

```shell
python test.py
```

TODO:

- Add the code for training
- Add the code for testing
- Add the code for pre-processing
- Restructure the code
- Upload the pretrained weights
- Comprehensively collate document binarization benchmark datasets
- Add the code for evaluating the performance of the model
 
This work is permitted for academic research purposes only. For commercial use, please contact the author.
If you find this work useful, please cite it as:
 
```bibtex
@article{yang2024gdb,
  title={GDB: gated convolutions-based document binarization},
  author={Yang, Zongyuan and Liu, Baolin and Xiong, Yongping and Wu, Guibin},
  journal={Pattern Recognition},
  volume={146},
  pages={109989},
  year={2024},
  publisher={Elsevier}
}
```