Build mindocr online doc webpage #393

Merged
merged 5 commits into from
Jun 21, 2023
4 changes: 1 addition & 3 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest
pip install cpplint
pip install -r requirements/dev.txt
# MindSpore should be installed by following the instructions on the official website rather than from PyPI,
# which is why mindspore is excluded from requirements.txt. Does this work?
pip install "mindspore>=1.9,<=1.10"
Expand Down
28 changes: 28 additions & 0 deletions .github/workflows/docs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
name: docs
on:
push:
branches:
- main
pull_request:

permissions:
contents: write

jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements/docs.txt
- name: Build site
run: mkdocs build
- name: Deploy to gh-pages
if: github.event_name == 'push' && github.ref == 'refs/heads/main' && github.repository == 'mindspore-lab/mindocr'
run: mkdocs gh-deploy --force
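The `if:` guard on the deploy step is a conjunction of three checks, so the site is built on every push and pull request but published only from `main` of the upstream repository. A minimal Python sketch of the same condition, assuming the three values are taken from the GitHub Actions context (the function name `should_deploy` is hypothetical and not part of the workflow):

```python
def should_deploy(event_name: str, ref: str, repository: str) -> bool:
    """Mirror the workflow's `if:` expression for the gh-pages deploy step."""
    return (
        event_name == "push"                        # not a pull_request run
        and ref == "refs/heads/main"                # only the main branch
        and repository == "mindspore-lab/mindocr"   # skip forks
    )


# A push to main on the upstream repository deploys; a pull request does not.
upstream_push = should_deploy("push", "refs/heads/main", "mindspore-lab/mindocr")
pull_request = should_deploy("pull_request", "refs/heads/main", "mindspore-lab/mindocr")
fork_push = should_deploy("push", "refs/heads/main", "someone/mindocr")
```

Under this reading, forks and pull requests still run `mkdocs build`, which catches broken docs without touching the `gh-pages` branch.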
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ repos:
hooks:
# list of supported hooks: https://pre-commit.com/hooks.html
- id: check-yaml
args: ["--unsafe"]
- id: debug-statements
- id: end-of-file-fixer
- id: mixed-line-ending
Expand Down
17 changes: 5 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<div align="center">
<div align="center" markdown>

# MindOCR

Expand All @@ -25,7 +25,7 @@ English | [中文](README_CN.md)
MindOCR is an open-source toolbox for OCR development and application based on [MindSpore](https://www.mindspore.cn/en). It integrates a series of mainstream text detection and recognition algorithms and models and provides easy-to-use training and inference tools. It can accelerate the process of developing and deploying SoTA text detection and recognition models, such as DBNet/DBNet++ and CRNN/SVTR, in real-world applications, and help meet the need for image-text understanding.


<details open>
<details open markdown>
<summary> Major Features </summary>

- **Modular design**: We decoupled the OCR task into several configurable modules. Users can set up the training and evaluation pipelines and customize the data processing pipeline and model architectures easily by modifying just a few lines of code.
Expand Down Expand Up @@ -150,7 +150,7 @@ For more illustration and usage, please refer to the model training section in [

## Model List

<details open>
<details open markdown>
<summary>Text Detection</summary>

- [x] [DBNet](configs/det/dbnet/README.md) (AAAI'2020)
Expand All @@ -161,7 +161,7 @@ For more illustration and usage, please refer to the model training section in [

</details>

<details open>
<details open markdown>
<summary>Text Recognition</summary>

- [x] [CRNN](configs/rec/crnn/README.md) (TPAMI'2016)
Expand All @@ -179,23 +179,16 @@ For detailed support for MindSpore Lite and ACL inference models, please refer t

MindOCR provides a [dataset conversion tool](tools/dataset_converters) that converts OCR datasets in different formats and supports user-customized datasets. We have validated the following public OCR datasets in model training and evaluation.

<details open>
<details open markdown>
<summary>General OCR Datasets</summary>

- [x] [ICDAR2015](https://rrc.cvc.uab.es/?ch=4) [[paper](https://rrc.cvc.uab.es/files/short_rrc_2015.pdf)] [[download](docs/en/datasets/icdar2015.md)]

- [x] [Total-Text](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) [[paper](https://arxiv.org/abs/1710.10400)] [[download](docs/en/datasets/totaltext.md)]

- [x] [Syntext150k](https://github.com/aim-uofa/AdelaiDet) [[paper](https://arxiv.org/abs/2002.10200)] [[download](docs/en/datasets/syntext150k.md)]

- [x] [MLT2017](https://rrc.cvc.uab.es/?ch=8&com=introduction) [[paper](https://ieeexplore.ieee.org/abstract/document/8270168)] [[download](docs/en/datasets/mlt2017.md)] (multi-language)

- [x] [MSRA-TD500](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)) [[paper](https://ieeexplore.ieee.org/abstract/document/6247787)] [[download](docs/en/datasets/td500.md)]

- [x] [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector) [[paper](https://www.sciencedirect.com/science/article/pii/S0031320319300664)] [[download](docs/en/datasets/ctw1500.md)]

- [x] [Chinese-Text-Recognition-Benchmark](https://github.com/FudanVI/benchmarking-chinese-text-recognition) [[paper](https://arxiv.org/abs/2112.15093)] [[download](https://github.com/FudanVI/benchmarking-chinese-text-recognition#download)]

</details>

We will include more datasets for training and evaluation. This list will be continuously updated.
Expand Down
18 changes: 6 additions & 12 deletions README_CN.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<div align="center">
<div align="center" markdown>

# MindOCR

Expand All @@ -25,7 +25,7 @@
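MindOCR is an open-source OCR toolbox developed on the [MindSpore](https://www.mindspore.cn/en) framework. It integrates a series of mainstream text detection and recognition algorithms and models and provides easy-to-use training and inference tools, helping users quickly develop and apply industry SoTA text detection and recognition models, such as DBNet/DBNet++ and CRNN/SVTR, to meet the needs of image-document understanding.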


<details open>
<details open markdown>
<summary> Major Features </summary>

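- **Modular design**: MindOCR decouples the OCR task into several configurable modules. By modifying just a few lines of code, users can easily configure the full training and evaluation pipeline on customized data and models;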
Expand Down Expand Up @@ -146,7 +146,7 @@ python tools/eval.py \

## Model List

<details open>
<details open markdown>
<summary>Text Detection</summary>

- [x] [DBNet](configs/det/dbnet/README.md) (AAAI'2020)
Expand All @@ -156,8 +156,7 @@ python tools/eval.py \
- [ ] [FCENet](https://arxiv.org/abs/2104.10442) (CVPR'2021) [coming soon]
</details>

<details open>

<details open markdown>
<summary>Text Recognition</summary>

- [x] [CRNN](configs/rec/crnn/README.md) (TPAMI'2016)
Expand All @@ -175,22 +174,17 @@ python tools/eval.py \
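MindOCR provides a [dataset conversion tool](tools/dataset_converters) to support OCR datasets in different formats as well as user-customized datasets.
The public OCR datasets validated so far in model training and evaluation are listed below.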

<details open>
<details open markdown>
<summary>General OCR Datasets</summary>

- [x] [ICDAR2015](https://rrc.cvc.uab.es/?ch=4) [[paper](https://rrc.cvc.uab.es/files/short_rrc_2015.pdf)] [[download](docs/cn/datasets/icdar2015.md)]

- [x] [Total-Text](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) [[paper](https://arxiv.org/abs/1710.10400)] [[download](docs/en/datasets/totaltext.md)]

- [x] [Syntext150k](https://github.com/aim-uofa/AdelaiDet) [[paper](https://arxiv.org/abs/2002.10200)] [[download](docs/en/datasets/syntext150k.md)]

- [x] [MLT2017](https://rrc.cvc.uab.es/?ch=8&com=introduction) [[paper](https://ieeexplore.ieee.org/abstract/document/8270168)] [[download](docs/en/datasets/mlt2017.md)] (multi-language)

- [x] [MSRA-TD500](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)) [[paper](https://ieeexplore.ieee.org/abstract/document/6247787)] [[download](docs/en/datasets/td500.md)]

- [x] [SCUT-CTW1500](https://github.com/Yuliang-Liu/Curve-Text-Detector) [[paper](https://www.sciencedirect.com/science/article/pii/S0031320319300664)] [[download](docs/en/datasets/ctw1500.md)]

- [x] [Chinese-Text-Recognition-Benchmark](https://github.com/FudanVI/benchmarking-chinese-text-recognition) [[paper](https://arxiv.org/abs/2112.15093)] [[download](https://github.com/FudanVI/benchmarking-chinese-text-recognition#download)]
</details>

We will train and validate models on more datasets. This list will be continuously updated.

Expand Down
2 changes: 1 addition & 1 deletion configs/rec/rare/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ According to our experiments, the evaluation results on public benchmark dataset
| RARE | D910x4-MS1.10-G | ResNet34_vd | None | 85.19% | 3166 s/epoch | 4561 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/rare/rare_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/rare/rare_resnet34-309dc63e.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/rare/rare_resnet34-309dc63e-b65dd225.mindir) |
</div>

<details open>
<details open markdown>
<div align="center">
<summary>Detailed accuracy results for each benchmark dataset</summary>

Expand Down
2 changes: 1 addition & 1 deletion configs/rec/rare/README_CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ Table Format:
| RARE | D910x4-MS1.10-G | ResNet34_vd | None | 85.19% | 3166 s/epoch | 4561 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/rare/rare_resnet34.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/rare/rare_resnet34-309dc63e.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/rare/rare_resnet34-309dc63e-b65dd225.mindir) |
</div>

<details open>
<details open markdown>
<div align="center">
<summary>Accuracy on each benchmark dataset</summary>

Expand Down
2 changes: 1 addition & 1 deletion configs/rec/svtr/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ According to our experiments, the evaluation results on public benchmark dataset
| SVTR-Tiny | D910x4-MS1.10-G | 89.02% | 4866 s/epoch | 2968 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/svtr/svtr_tiny.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb-5cf5a130.mindir) |
</div>

<details open>
<details open markdown>
<div align="center">
<summary>Detailed accuracy results for each benchmark dataset</summary>

Expand Down
2 changes: 1 addition & 1 deletion configs/rec/svtr/README_CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ Table Format:
| SVTR-Tiny | D910x4-MS1.10-G | 89.02% | 4866 s/epoch | 2968 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/svtr/svtr_tiny.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb-5cf5a130.mindir) |
</div>

<details open>
<details open markdown>
<div align="center">
<summary>Accuracy on each benchmark dataset</summary>

Expand Down
4 changes: 1 addition & 3 deletions docs/cn/datasets/chinese_text_recognition.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/chinese_text_recognition.md) | Chinese

# Chinese Text Recognition Dataset

This document describes how to prepare datasets for Chinese text recognition.
Expand Down Expand Up @@ -103,4 +101,4 @@ eval:
...
```

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
65 changes: 65 additions & 0 deletions docs/cn/datasets/converters.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
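This document shows how to convert the annotation files of OCR datasets (excluding LMDB) into a common format for model training.

You can also refer to [`convert_datasets.sh`](https://github.com/mindspore-lab/mindocr/blob/main/tools/convert_datasets.sh), a shell script that converts the annotation files of all datasets under a given directory into the common format.

To download OCR datasets and convert their annotation formats, refer to the instructions for [Chinese text recognition](chinese_text_recognition.md), [CTW1500](ctw1500.md), [ICDAR2015](icdar2015.md), [MLT2017](mlt2017.md), [SVT](svt.md), [Syntext 150k](syntext150k.md), [TD500](td500.md), [Total Text](totaltext.md), and [SynthText](synthtext.md).

## Text Detection / End-to-End Text Detection

The converted annotation file should have the following format: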
``` text
img_61.jpg\t[{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]
```
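Each line in this format is an image name, a tab, and a JSON list of text instances. As a rough illustration of how such a line could be read back, here is a minimal Python sketch; the helper name `parse_det_line` is hypothetical and is not part of MindOCR, and the sample line drops the elided `{...}` entry so that it is valid JSON:

```python
import json


def parse_det_line(line: str):
    """Split one detection annotation line into (image_name, text instances)."""
    # The image name and the JSON payload are separated by the first tab.
    image_name, raw_instances = line.rstrip("\n").split("\t", 1)
    instances = json.loads(raw_instances)
    for inst in instances:
        # Each instance is expected to carry a transcription and a polygon.
        if "transcription" not in inst or "points" not in inst:
            raise ValueError(f"malformed instance in {image_name}: {inst}")
    return image_name, instances


sample = (
    'img_61.jpg\t[{"transcription": "MASA", '
    '"points": [[310, 104], [416, 141], [418, 216], [312, 179]]}]'
)
name, instances = parse_det_line(sample)
print(name, instances[0]["transcription"], len(instances[0]["points"]))
```

Splitting on the first tab only keeps the JSON payload intact even if a transcription itself contains a tab character.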

Taking the ICDAR2015 (ic15) dataset as an example, run the following to convert it to the required format:

``` shell
# convert training annotation
python tools/dataset_converters/convert.py \
--dataset_name ic15 \
--task det \
--image_dir /path/to/ic15/det/train/ch4_training_images \
--label_dir /path/to/ic15/det/train/ch4_training_localization_transcription_gt \
--output_path /path/to/ic15/det/train/det_gt.txt
```

``` shell
# convert testing annotation
python tools/dataset_converters/convert.py \
--dataset_name ic15 \
--task det \
--image_dir /path/to/ic15/det/test/ch4_test_images \
--label_dir /path/to/ic15/det/test/ch4_test_localization_transcription_gt \
--output_path /path/to/ic15/det/test/det_gt.txt
```

## Text Recognition
The annotation format for text recognition datasets is as follows:

```text
word_7.png fusionopolis
word_8.png fusionopolis
word_9.png Reserve
word_10.png CAUTION
word_11.png citi
```
Note that the image name and the text label are separated by `\t`.
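Because the separator is a tab rather than a space, a reader of this format should split on the first tab only. A minimal Python sketch under that assumption (the helper name `parse_rec_labels` is hypothetical, not a MindOCR function):

```python
def parse_rec_labels(text: str):
    """Return (image_name, label) pairs from tab-separated annotation text."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so labels may contain spaces.
        image_name, label = line.split("\t", 1)
        pairs.append((image_name, label))
    return pairs


sample = "word_7.png\tfusionopolis\nword_9.png\tReserve\nword_10.png\tCAUTION\n"
pairs = parse_rec_labels(sample)
for image_name, label in pairs:
    print(image_name, label)
```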

To convert the annotation file, run:
``` shell
# convert training annotation
python tools/dataset_converters/convert.py \
--dataset_name ic15 \
--task rec \
--label_dir /path/to/ic15/rec/ch4_training_word_images_gt/gt.txt \
--output_path /path/to/ic15/rec/train/ch4_training_word_images_gt/rec_gt.txt
```

``` shell
# convert testing annotation
python tools/dataset_converters/convert.py \
--dataset_name ic15 \
--task rec \
--label_dir /path/to/ic15/rec/ch4_test_word_images_gt/gt.txt \
--output_path /path/to/ic15/rec/ch4_test_word_images_gt/rec_gt.txt
```
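When several splits have to be converted, the commands above can be assembled programmatically instead of being typed by hand. The sketch below only builds the argument list and does not run anything; it uses the flags shown in this document (`--dataset_name`, `--task`, `--image_dir`, `--label_dir`, `--output_path`), while the helper name `build_convert_command` is hypothetical:

```python
import shlex


def build_convert_command(dataset_name, task, label_dir, output_path, image_dir=None):
    """Assemble the argument list for tools/dataset_converters/convert.py."""
    cmd = [
        "python", "tools/dataset_converters/convert.py",
        "--dataset_name", dataset_name,
        "--task", task,
    ]
    if image_dir is not None:
        # The detection examples pass an image directory; the recognition ones do not.
        cmd += ["--image_dir", image_dir]
    cmd += ["--label_dir", label_dir, "--output_path", output_path]
    return cmd


rec_cmd = build_convert_command(
    "ic15",
    "rec",
    label_dir="/path/to/ic15/rec/ch4_test_word_images_gt/gt.txt",
    output_path="/path/to/ic15/rec/ch4_test_word_images_gt/rec_gt.txt",
)
# shlex.join quotes each argument, so the printed line can be pasted into a shell.
print(shlex.join(rec_cmd))
```

Passing the list to `subprocess.run(rec_cmd, check=True)` from the repository root would execute it; building a list rather than a single string avoids the missing line-continuation mistakes that multi-line shell commands are prone to.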
4 changes: 1 addition & 3 deletions docs/cn/datasets/ctw1500.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/ctw1500.md) | Chinese

# SCUT-CTW1500 Datasets

## Data Download
Expand Down Expand Up @@ -52,4 +50,4 @@ python tools/dataset_converters/convert.py \

After running, there are two annotation files, `train_det_gt.txt` and `test_det_gt.txt`, under the folder `ctw1500/`.

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/icdar2015.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/icdar2015.md) | Chinese

# Dataset Download
ICDAR 2015 [paper](https://rrc.cvc.uab.es/?ch=4)

Expand Down Expand Up @@ -69,4 +67,4 @@ path-to-data-dir/
Challenge4_Test_Task3_GT.zip
```

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/mlt2017.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/mlt2017.md) | Chinese

# Dataset Download

MLT (Multi-Lingual) 2017 [paper](https://ieeexplore.ieee.org/abstract/document/8270168)
Expand Down Expand Up @@ -63,4 +61,4 @@ path-to-data-dir/

```

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/svt.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/svt.md) | Chinese

# The Street View Text Dataset (SVT)

## Data Download
Expand Down Expand Up @@ -37,4 +35,4 @@ python tools/dataset_converters/convert.py \

After running, there is a folder `cropped_images/` and an annotation file `rec_train_gt.txt` under the folder `svt1/`.

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/syntext150k.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/syntext150k.md) | Chinese

# Dataset Download

SynText150k [paper](https://arxiv.org/abs/2002.10200)
Expand All @@ -22,4 +20,4 @@ path-to-data-dir/

```

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/synthtext.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
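[English](../../en/datasets/synthtext.md) | Chinese

# Data Download

SynthText is a synthetically generated dataset in which word instances are placed in natural scene images, taking the scene layout into account.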
Expand Down Expand Up @@ -27,4 +25,4 @@ path-to-data-dir/
> The operations above produce annotation data in the same format as the original `SynthText` annotations, but filtered.


[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/td500.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/td500.md) | Chinese

# MSRA Text Detection 500 Database (MSRA-TD500)

## Data Download
Expand Down Expand Up @@ -48,4 +46,4 @@ python tools/dataset_converters/convert.py \

After running, there are two annotation files, `train_det_gt.txt` and `test_det_gt.txt`, under the folder `MSRA-TD500/`.

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)
4 changes: 1 addition & 3 deletions docs/cn/datasets/totaltext.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@
[English](../../en/datasets/totaltext.md) | Chinese

# Dataset Download

Total-Text [paper](https://arxiv.org/abs/1710.10400)
Expand Down Expand Up @@ -31,4 +29,4 @@ path-to-data-dir/

```

[Back](../../../tools/dataset_converters/README_CN.md)
[Back to dataset converters](converters.md)