Showing 4 changed files with 189 additions and 158 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,149 @@ | ||
# PaddleOCR | ||
OCR algorithms with PaddlePaddle (still under development) | ||
|
||
# Introduction | ||
PaddleOCR aims to provide a rich, leading, and practical OCR toolkit that helps users train better models and put them into production. | ||
|
||
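## Features | ||
- Ultra-lightweight models | ||
- (detection model 4.1M + recognition model 4.5M = 8.6M) | ||
- Vertical text recognition | ||
- (a single model handles both horizontal and vertical text) | ||
- Long text recognition | ||
- Recognition of mixed Chinese, English, and digit text | ||
- Training code provided | ||
- Model deployment supported | ||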
|
||
|
||
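## Documentation | ||
- [Quick installation](./doc/installation.md) | ||
- [Text detection model training/evaluation/prediction](./doc/detection.md) | ||
- [Text recognition model training/evaluation/prediction](./doc/recognition.md) | ||
- [Prediction with the inference model](./doc/) | ||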
|
||
### **Quick start** | ||
|
||
Download the inference models | ||
``` | ||
# Create the directories that hold the inference models (run from the repository root) | ||
mkdir -p inference/det inference/rec | ||
# Download the detection and recognition inference models | ||
wget -P ./inference https://paddleocr.bj.bcebos.com/inference.tar | ||
``` | ||
|
||
Run text detection and recognition in sequence on the single image specified by `image_dir`: | ||
``` | ||
export PYTHONPATH=. | ||
python tools/infer/predict_eval.py --image_dir="/Demo.jpg" --det_model_dir="./inference/det/" --rec_model_dir="./inference/rec/" | ||
``` | ||
At prediction time, the `det_model_dir` and `rec_model_dir` parameters set the paths where the inference models are stored. | ||
|
||
Run text detection and recognition in sequence on all images in the folder specified by `image_dir`: | ||
``` | ||
python tools/infer/predict_eval.py --image_dir="/test_imgs/" --det_model_dir="./inference/det/" --rec_model_dir="./inference/rec/" | ||
``` | ||
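The two invocations above differ only in the value of `--image_dir`. As a minimal sketch, the command line could be assembled from Python as shown below. The helper name `build_predict_command` is hypothetical and not part of PaddleOCR; the script path and the three flags are the ones used in the commands above, and the default model directories are assumed to match the layout created in the download step.

```python
import shlex


def build_predict_command(image_dir,
                          det_model_dir="./inference/det/",
                          rec_model_dir="./inference/rec/"):
    """Return the argument list for the chained detection + recognition run.

    Hypothetical helper for illustration; it only builds the command and
    does not execute it.
    """
    return [
        "python",
        "tools/infer/predict_eval.py",
        f"--image_dir={image_dir}",
        f"--det_model_dir={det_model_dir}",
        f"--rec_model_dir={rec_model_dir}",
    ]


# image_dir may point at a single image or at a folder of images.
cmd = build_predict_command("/test_imgs/")
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run` from the repository root with `PYTHONPATH=.` set in the environment, mirroring the `export` line above.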
|
||
|
||
|
||
## Text detection algorithms | ||
|
||
PaddleOCR open-source text detection algorithms: | ||
- [x] [EAST](https://arxiv.org/abs/1704.03155) | ||
- [x] [DB](https://arxiv.org/abs/1911.08947) | ||
- [ ] [SAST](https://arxiv.org/abs/1908.05498) | ||
|
||
|
||
Results: | ||
|Model|Backbone|Hmean| | ||
|-|-|-| | ||
|EAST|[ResNet50_vd](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)|85.85%| | ||
|EAST|[MobileNetV3](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)|79.08%| | ||
|DB|[ResNet50_vd](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)|83.30%| | ||
|DB|[MobileNetV3](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)|73.00%| | ||
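Hmean in the table above is the harmonic mean of detection precision and recall (the F1 score). A minimal sketch of the computation follows; the function name and the input values are illustrative assumptions and are not taken from the PaddleOCR evaluation code or from the results in the table.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only, not measured results.
print(f"{hmean(0.9, 0.8):.4f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high Hmean by maximizing recall alone or precision alone.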
|
||
For training and using the PaddleOCR text detection algorithms, see the [documentation](./doc/detection.md). | ||
|
||
## Text recognition algorithms | ||
|
||
PaddleOCR open-source text recognition algorithms: | ||
- [x] [CRNN](https://arxiv.org/abs/1507.05717) | ||
- [x] [DTRB](https://arxiv.org/abs/1904.01906) | ||
- [ ] [Rosetta](https://arxiv.org/abs/1910.05085) | ||
- [ ] [STAR-Net](http://www.bmva.org/bmvc/2016/papers/paper043/index.html) | ||
- [ ] [RARE](https://arxiv.org/abs/1603.03915v1) | ||
- [ ] [SRN](https://arxiv.org/abs/2003.12294) (developed by Baidu) | ||
|
||
The results are shown in the table below. The accuracy metric is the average of the evaluation results on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. | ||
|
||
|Model|Backbone|ACC| | ||
|-|-|-| | ||
|Rosetta|[Resnet34_vd](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)|80.24%| | ||
|Rosetta|[MobileNetV3](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)|78.16%| | ||
|CRNN|[Resnet34_vd](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)|82.20%| | ||
|CRNN|[MobileNetV3](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|79.37%| | ||
|STAR-Net|[Resnet34_vd](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|83.93%| | ||
|STAR-Net|[MobileNetV3](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|81.56%| | ||
|RARE|[Resnet34_vd](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|84.90%| | ||
|RARE|[MobileNetV3](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|83.32%| | ||
|
||
For training and using the PaddleOCR text recognition algorithms, see the [documentation](./doc/recognition.md). | ||
|
||
## TODO | ||
**End-to-end OCR algorithm** | ||
PaddleOCR will soon open-source [End2End-PSL](https://arxiv.org/abs/1909.07808), an end-to-end OCR model developed by Baidu. Stay tuned. | ||
- [ ] End2End-PSL (coming soon) | ||
|
||
|
||
|
||
# References | ||
``` | ||
1. EAST: | ||
@inproceedings{zhou2017east, | ||
title={EAST: an efficient and accurate scene text detector}, | ||
author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun}, | ||
booktitle={Proceedings of the IEEE conference on Computer Vision and Pattern Recognition}, | ||
pages={5551--5560}, | ||
year={2017} | ||
} | ||
2. DB: | ||
@article{liao2019real, | ||
title={Real-time Scene Text Detection with Differentiable Binarization}, | ||
author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang}, | ||
journal={arXiv preprint arXiv:1911.08947}, | ||
year={2019} | ||
} | ||
3. DTRB: | ||
@inproceedings{baek2019wrong, | ||
title={What is wrong with scene text recognition model comparisons? dataset and model analysis}, | ||
author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk}, | ||
booktitle={Proceedings of the IEEE International Conference on Computer Vision}, | ||
pages={4715--4723}, | ||
year={2019} | ||
} | ||
4. SAST: | ||
@inproceedings{wang2019single, | ||
title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning}, | ||
author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming}, | ||
booktitle={Proceedings of the 27th ACM International Conference on Multimedia}, | ||
pages={1277--1285}, | ||
year={2019} | ||
} | ||
5. SRN: | ||
@article{yu2020towards, | ||
title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks}, | ||
author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui}, | ||
journal={arXiv preprint arXiv:2003.12294}, | ||
year={2020} | ||
} | ||
6. end2end-psl: | ||
@inproceedings{sun2019chinese, | ||
title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning}, | ||
author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo}, | ||
booktitle={Proceedings of the IEEE International Conference on Computer Vision}, | ||
pages={9086--9095}, | ||
year={2019} | ||
} | ||
``` |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Optional arguments | ||
|
||
The following list can be displayed with `--help`. | ||
|
||
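| FLAG | Supported scripts | Purpose | Default | Notes | | ||
| :----------------------: | :------------: | :---------------: | :--------------: | :-----------------: | | ||
| -c | ALL | Specifies the configuration file | None | **For the configuration modules, see the parameter descriptions** | | ||
| -o | ALL | Sets parameters in the configuration file | None | Values set with -o take precedence over the configuration file selected with -c. For example: `-o Global.use_gpu=false` | | ||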
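The precedence rule for `-o` can be illustrated with a small sketch: values from the configuration file are loaded first, then each `Section.key=value` override is written on top. This is not PaddleOCR's actual argument-parsing code; `parse_value` and `apply_overrides` are hypothetical helpers, and the type coercion (booleans, then integers, then floats, else string) is an assumption made for the example.

```python
import copy


def parse_value(text):
    """Best-effort conversion of an override string to bool, int, float, or str."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(config, overrides):
    """Return a copy of config with dotted 'A.b=value' overrides applied on top."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return merged


# Stand-in for values read from the file selected with -c.
file_config = {"Global": {"use_gpu": True, "epoch_num": 3000}}
merged = apply_overrides(file_config, ["Global.use_gpu=false"])
print(merged)
```

Working on a deep copy keeps the file-derived values intact, so the same base configuration can be reused with different override sets.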
|
||
|
||
## Global parameters in the configuration file | ||
|
||
| Field | Purpose | Default | Notes | | ||
| :----------------------: | :---------------------: | :--------------: | :--------------------: | | ||
| algorithm | Algorithm to use | CRNN | Selects the model; for the supported models see the [Introduction]() | | ||
| use_gpu | Whether to run on the GPU | true | \ | | ||
| epoch_num | Maximum number of training epochs | 3000 | \ | | ||
| log_smooth_window | Sliding window size | 20 | \ | | ||
| print_batch_step | Log printing interval | 10 | \ | | ||
| save_model_dir | Model save path | output/rec_CRNN | \ | | ||
| save_epoch_step | Model saving interval | 3 | \ | | ||
| eval_batch_step | Model evaluation interval | 2000 | \ | | ||
| train_batch_size_per_card | Per-card batch size for training | 256 | \ | | ||
| test_batch_size_per_card | Per-card batch size for evaluation | 256 | \ | | ||
| image_shape | Input image shape | [3, 32, 100] | \ | | ||
| max_text_length | Maximum text length | 25 | \ | | ||
| character_type | Character type | ch | en/ch; `en` uses the default dict, `ch` uses a custom dict | | ||
| character_dict_path | Dictionary path | ./ppocr/utils/ic15_dict.txt | \ | | ||
| loss_type | Loss type | ctc | Two losses are supported: ctc / attention | | ||
| reader_yml | Reader configuration file | ./configs/rec/rec_icdar15_reader.yml | \ | | ||
| pretrain_weights | Path of the pretrained model to load | ./pretrain_models/CRNN/best_accuracy | \ | | ||
| checkpoints | Path of the model parameters to load | None | Used to resume training after an interruption | | ||
| save_inference_dir | Path for saving the inference model | None | Used to save the inference model | | ||
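As a rough illustration of how `log_smooth_window` and `print_batch_step` could interact, the sketch below averages the most recent 20 loss values and emits a log entry every 10 batches. The exact smoothing used by PaddleOCR is not described here, so the moving-average behavior, the function name `smoothed_log`, and the synthetic loss values are all assumptions.

```python
from collections import deque

LOG_SMOOTH_WINDOW = 20  # default from the table above
PRINT_BATCH_STEP = 10   # default from the table above


def smoothed_log(losses, window=LOG_SMOOTH_WINDOW, step=PRINT_BATCH_STEP):
    """Return (batch_id, moving-average loss) pairs, one per `step` batches."""
    recent = deque(maxlen=window)  # keeps only the last `window` values
    entries = []
    for batch_id, loss in enumerate(losses, start=1):
        recent.append(loss)
        if batch_id % step == 0:
            entries.append((batch_id, sum(recent) / len(recent)))
    return entries


# Synthetic losses 1.0, 2.0, ..., 40.0, used only to make the windowing visible.
logged = smoothed_log([float(i) for i in range(1, 41)])
for batch_id, value in logged:
    print(batch_id, value)
```

A larger window gives steadier log output at the cost of reacting more slowly to real changes in the loss.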
|
||
|