Add Seq2Seq Chinese model support #289

Merged: 6 commits, May 18, 2023

zhtmike (Collaborator) commented May 15, 2023

  1. Add a Chinese model configuration file.
  2. Support loading pretrained weights for the backbone only. Use remove_prefix=True to load a pretrained model from MindOCR.
  3. Support relative dictionary paths on ModelArts.
  4. Add documentation on training with a custom dataset.
  5. Merge the Chinese dataset documentation from "Add Chinese CRNN support" #280.
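
As a rough illustration of item 2, the sketch below shows what removing a `backbone.` prefix from checkpoint parameter names could look like. It is not MindOCR's implementation: the helper `strip_prefix`, the parameter names, and the choice to drop non-backbone entries are all assumptions made for the example.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def strip_prefix(params: Dict[str, T], prefix: str = "backbone.") -> Dict[str, T]:
    """Keep only entries whose name starts with `prefix`, with the prefix removed.

    Hypothetical helper: entries without the prefix (e.g. head weights)
    are dropped, which is an assumption of this sketch.
    """
    return {
        name[len(prefix):]: value
        for name, value in params.items()
        if name.startswith(prefix)
    }


# Made-up parameter names standing in for a full-model checkpoint.
full_checkpoint = {
    "backbone.conv1.weight": 1,
    "backbone.layer1.0.bn1.gamma": 2,
    "head.fc.weight": 3,
}

backbone_only = strip_prefix(full_checkpoint)
print(sorted(backbone_only))
```

The resulting names match what a standalone backbone would expect, so the weights can be loaded without the full model.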

@zhtmike requested review from SamitHuang and hqkate on May 15, 2023 09:39
@zhtmike force-pushed the seq2seq branch 4 times, most recently from 5cb632c to 9120f86 on May 17, 2023 06:50
@zhtmike changed the title from "Seq2Seq Chinese model configure file" to "Add Seq2Seq Chinese model support" on May 17, 2023
@zhtmike marked this pull request as ready for review on May 17, 2023 07:13
@zhtmike requested a review from HaoyangLee on May 17, 2023 07:13
SamitHuang (Collaborator) left a comment

Looks good overall! Some improvements are needed before merging, as noted in the comments.

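## Dictionary Preparation

To train recognition networks for different languages such as Chinese and English, users need to configure the corresponding dictionary. Only characters that exist in the dictionary can be predicted correctly by the model. MindOCR currently provides two dictionaries, Chinese and English:
- `English dictionary`: includes upper- and lower-case English letters, digits, and punctuation. Located at `mindocr/utils/dict/en_dict.txt`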
Collaborator

The CRNN English recognition model trained so far (trained on MJ+ST) uses the default dictionary of lowercase letters and digits, so it does not use this dictionary, correct?
If this English dictionary is used on a custom dataset instead, the documentation should also explain how to adjust other hyperparameters such as num_classes and use_space_char.

Collaborator Author

Switched to the new dictionary; the explanation of num_classes and use_space_char has been updated.
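
To make the relationship between the dictionary, use_space_char, and num_classes concrete, here is a minimal sketch. It assumes a dictionary file with one character per line and treats the number of extra classes a model reserves (for example a blank or start/end tokens) as a caller-supplied value, since that depends on the model type. The function names and `num_special_tokens` are hypothetical and not part of MindOCR.

```python
from typing import Iterable, List


def build_charset(dict_lines: Iterable[str], use_space_char: bool = False) -> List[str]:
    """Return the ordered character list read from dictionary lines.

    Blank lines, a literal space entry, and duplicates are skipped; the
    space is appended only when `use_space_char` is True.
    """
    charset: List[str] = []
    seen = set()
    for line in dict_lines:
        ch = line.rstrip("\r\n")
        if ch in ("", " ") or ch in seen:
            continue
        seen.add(ch)
        charset.append(ch)
    if use_space_char:
        charset.append(" ")
    return charset


def infer_num_classes(charset: List[str], num_special_tokens: int) -> int:
    """Dictionary size plus the extra classes the model reserves (assumed input)."""
    return len(charset) + num_special_tokens


# Tiny in-memory dictionary; the trailing space entry is ignored on purpose.
sample_dict = ["a\n", "b\n", "c\n", " \n"]
charset = build_charset(sample_dict, use_space_char=True)
num_classes = infer_num_classes(charset, num_special_tokens=1)
print(charset, num_classes)
```

With a real dictionary, the same function can be fed an open file handle (opened with `encoding="utf-8"`) in place of the in-memory list.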

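- `English dictionary`: includes upper- and lower-case English letters, digits, and punctuation. Located at `mindocr/utils/dict/en_dict.txt`
- `Chinese dictionary`: includes commonly used Chinese characters, upper- and lower-case English letters, digits, and punctuation. Located at `mindocr/utils/dict/ch_dict.txt`

MindOCR does not currently support custom dictionary configuration. This feature will be released in a future version.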
Collaborator

Could a custom dictionary be used for fine-tuning by specifying character_dict_path and changing num_classes?

Collaborator Author

A tutorial on modifying the dictionary characters has been added.

```yaml
...
common:
  character_dict_path: &character_dict_path mindocr/utils/dict/en_dict.txt
```
Collaborator

The character_dict_path here differs from the one configured in crnn_resnet34.yaml, so num_classes needs to be adjusted accordingly; otherwise training may fail.

Collaborator Author

This initial configuration is for crnn_resnet34_ch.yaml from PR #280, so num_classes should be correct.

@@ -53,7 +52,12 @@ def build_backbone(name, **kwargs):
    if 'pretrained' in kwargs:
        pretrained = kwargs['pretrained']
        if not isinstance(pretrained, bool):
            load_model(backbone, pretrained)
            if remove_prefix:
Collaborator

Could the existing auto_mapping option in load_model be used to map the names automatically?

Collaborator Author

Probably not. auto_mapping looks for the closest matches, but loading a MindOCR model into the backbone requires removing the `backbone.` prefix, so the names would differ by 9 characters.


@register_backbone
def rec_resnet34(pretrained: bool = True, **kwargs):
    model = RecResNet(in_channels=3, layers=34, **kwargs)

    # load pretrained weights
    if pretrained:
        if pretrained is True:
            raise NotImplementedError
Collaborator

Suggest giving a more detailed error message, e.g. "Pretrained checkpoint for rec_resnet34 does not exist."

Collaborator Author

Updated.

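zhtmike (Collaborator Author) commented May 18, 2023

Two issues:

  1. The English and Chinese dictionaries are inconsistent. The English dictionary contains a space while the Chinese dictionary does not, so it is confusing when use_space_char should be used.
  2. num_classes should be generated automatically. Otherwise it has to be determined from the number of characters in the dictionary, use_space_char, and the model type, which is hard for users to understand.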

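SamitHuang (Collaborator) commented

  1. Dictionary files should not contain a space. Use use_space_char consistently to enable space support.
  2. num_classes can indeed be computed automatically (e.g. when it is set to empty / Null).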

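zhtmike (Collaborator Author) commented May 18, 2023

> 1. Dictionary files should not contain a space. Use use_space_char consistently to enable space support.
> 2. num_classes can indeed be computed automatically (e.g. when it is set to empty / Null).

  1. The space has been removed from the English dictionary.
  2. This involves fairly large changes; I suggest opening a separate PR.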

```
word_1657.png 你好
word_1814.png cathay
```
*Note*: Please separate image names and labels using \tag, and avoid using spaces or other delimiters.
Collaborator

This should be \tab (\t).

Collaborator Author

Fixed.
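
A small sketch of why the tab separator matters: splitting each label line on the first tab keeps labels that contain spaces intact and makes a space-separated line easy to reject. The function name `parse_label_line` and the third sample line (`word_0001.png`) are made up for illustration; this is not MindOCR's loader.

```python
from typing import Tuple


def parse_label_line(line: str) -> Tuple[str, str]:
    """Split one label-file line into (image_name, label) on the first tab."""
    image_name, sep, label = line.rstrip("\r\n").partition("\t")
    if not sep:
        # No tab found, e.g. the line used a space as the delimiter.
        raise ValueError(f"expected a tab between image name and label: {line!r}")
    return image_name, label


samples = [
    "word_1657.png\t你好\n",
    "word_1814.png\tcathay\n",
    "word_0001.png\thello world\n",  # hypothetical entry: label contains a space
]
parsed = [parse_label_line(s) for s in samples]
print(parsed)
```

Because only the first tab is treated as the delimiter, any spaces inside the label are preserved as part of the text to recognize.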

```
word_1657.png 你好
word_1814.png cathay
```
*Note*: Please separate image names and labels using \tag, and avoid using spaces or other delimiters.
Collaborator

This should be \tab (\t).

Collaborator Author

Fixed.

Collaborator

The reference link on line 19 should be changed: "#references" -> "#参考文献"

Collaborator Author

Fixed.

backbone:
  name: rec_resnet34
  pretrained: https://download.mindspore.cn/toolkits/mindocr/rare/rare_resnet34-309dc63e.ckpt
  remove_prefix: True
Collaborator

Suggest adding a comment here that explains the remove_prefix parameter; otherwise it is a bit hard for users to understand.

Collaborator Author

Comment added.

Collaborator

Should the documents updated under docs/ in tutorials/datasets/ be linked from external documentation (such as the main README or the RARE model README)? Otherwise users will not be able to find them.

Collaborator Author

The RARE model README already links to them, and the upcoming CRNN PR will point there as well.

@HaoyangLee merged commit 17a03bf into mindspore-lab:main on May 18, 2023
@zhtmike deleted the seq2seq branch on May 24, 2023 05:53