Update SVTR Tiny Model #486

Merged: 6 commits, Jul 12, 2023

Conversation

@zhtmike zhtmike commented Jul 6, 2023

  1. Update the SVTR training data to align with the official training dataset.
  2. Update the LMDB dataset generator to support (a) dropping instances with empty text; (b) dropping instances whose text length exceeds the maximum the model can handle (especially for CTC alignment); (c) label standardization (NFKD).
  3. Fix the SVTR augmentations: all random variables are now sampled in `__call__` instead of `__init__`.
  4. Update SVTR Tiny model accuracy: 89.02% -> 90.23%, FPS: 2968 -> 4560.
  5. Clear many warnings raised when the model runs on MindSpore 2.0, including legacy warnings about API changes in `nn.Dropout` and `ms_function`.
  6. Fix the SVTR convolutional kernel and support dropping the positional encoding in the SVTR backbone.
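
The three label rules in item 2 can be illustrated with a minimal sketch. This is not the MindOCR LMDB generator: `standardize_label`, `filter_samples`, and the limit `MAX_TEXT_LEN = 25` are hypothetical names and values chosen for illustration, and measuring the length after normalization is an assumption.

```python
import unicodedata
from typing import Iterable, Iterator, Tuple

# Hypothetical limit; the real value depends on the model configuration.
MAX_TEXT_LEN = 25


def standardize_label(text: str) -> str:
    """Apply NFKD normalization, e.g. full-width 'Ａ' becomes ASCII 'A'."""
    return unicodedata.normalize("NFKD", text)


def filter_samples(
    samples: Iterable[Tuple[str, str]], max_text_len: int = MAX_TEXT_LEN
) -> Iterator[Tuple[str, str]]:
    """Yield (key, label) pairs that pass the rules; drop the rest."""
    for key, text in samples:
        label = standardize_label(text)
        if len(label) == 0:
            continue  # rule (a): drop empty labels
        if len(label) > max_text_len:
            continue  # rule (b): drop labels longer than the model can handle
        yield key, label  # rule (c): keep the standardized label


if __name__ == "__main__":
    raw = [("img0", ""), ("img1", "Ａbc"), ("img2", "x" * 30), ("img3", "ﬁne")]
    print(list(filter_samples(raw)))
```

Note that NFKD decomposes accented characters into a base character plus combining marks, so a normalized label can be longer than the raw one; whether the length check should run before or after normalization is a design choice this sketch does not settle.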

@zhtmike zhtmike marked this pull request as ready for review July 7, 2023 03:09
@@ -38,7 +38,7 @@ According to our experiments, the evaluation results on public benchmark dataset

| **Model** | **Context** | **Avg Accuracy** | **Train T.** | **FPS** | **Recipe** | **Download** |
| :-----: | :-----------: | :--------------: | :----------: | :--------: | :--------: |:----------: |
-| SVTR-Tiny | D910x4-MS1.10-G | 89.02% | 4866 s/epoch | 2968 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/svtr/svtr_tiny.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-8542b3bb-5cf5a130.mindir) |
+| SVTR-Tiny | D910x4-MS1.10-G | 90.23% | 3638 s/epoch | 4560 | [yaml](https://github.com/mindspore-lab/mindocr/blob/main/configs/rec/svtr/svtr_tiny.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-950be1c3.ckpt) \| [mindir](https://download.mindspore.cn/toolkits/mindocr/svtr/svtr_tiny-950be1c3-86ece8c8.mindir) |

After the PR is merged, ask @jianyunchao to delete the former ckpt/mindir files in https://download.mindspore.cn/toolkits/mindocr/svtr/

│   ├── data.mdb
│   └── lock.mdb
└── validation
    ├── data.mdb
    └── lock.mdb
```

#### 3.1.3 Dataset Usage

Here we used the datasets under `training/` folders for training, and the union dataset `validation/` for validation. After training, we used the datasets under `evaluation/` to evaluate model accuracy.

**Training:** (total 14,442,049 samples)

Since the ST dataset size has changed, shouldn't the total number of training samples change as well?

Collaborator Author

Right, fixed.

@SamitHuang

Nice work. What led to the FPS improving from 2968 to 4560?

@zhtmike

zhtmike commented Jul 11, 2023

> Nice work. What led to the FPS improving from 2968 to 4560?

Not entirely sure. Removing some long-text samples seems to help. (CTC loss also has a maximum length limit: in the previous setting, samples that were too long did not contribute to the loss value but still cost training time.)
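
The length limit mentioned above follows from how CTC alignment works: a label can only be aligned if the number of output time steps is at least the label length plus one extra blank step for every pair of identical adjacent characters. The sketch below checks that condition. The function names and the `num_steps` values are hypothetical, and the source does not say whether MindOCR's filter accounts for repeated characters or uses a plain length cutoff.

```python
def min_ctc_steps(label: str) -> int:
    """Smallest number of time steps that admits a valid CTC alignment.

    Identical adjacent characters must be separated by a blank, so each
    such pair costs one extra step.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def fits_ctc(label: str, num_steps: int) -> bool:
    """Return True if a non-empty label can be aligned within num_steps."""
    return len(label) > 0 and min_ctc_steps(label) <= num_steps


if __name__ == "__main__":
    # num_steps is a hypothetical sequence length, not SVTR-Tiny's actual one.
    for text, num_steps in [("hello", 5), ("hello", 6), ("abc", 3)]:
        print(text, num_steps, fits_ctc(text, num_steps))
```

Dropping samples that fail such a check at dataset-generation time means they are never decoded or batched, which is consistent with the speed-up described in the reply above.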

scale_img = cv2.pyrDown(scale_img)
scale_img = cv2.resize(scale_img, (src_w, src_h), interpolation=get_interpolation())
return scale_img


class CVGaussianNoise(object):
-    def __init__(self, mean=0, var=20):
+    def __init__(self, mean=0, varience=20):

Suggested change
-    def __init__(self, mean=0, varience=20):
+    def __init__(self, mean=0, variance=20):

Did you mean `variance`? The same applies in class `SVTRDeterioration`.

Collaborator Author

Yes, fixed
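
For context, the PR description states that random variables in the augmentations should be sampled in `__call__` rather than `__init__`. The sketch below shows that pattern on a toy noise transform. It is a hypothetical stand-in, not MindOCR's `CVGaussianNoise`: the class name, the `seed` parameter, the uniform sampling range, and the use of a flat list of pixel values instead of an image array are all assumptions made to keep the example dependency-free.

```python
import random
from typing import List, Optional


class GaussianNoiseSketch:
    """Toy Gaussian-noise transform (hypothetical, for illustration only)."""

    def __init__(self, mean: float = 0.0, variance: float = 20.0,
                 seed: Optional[int] = None):
        # Only fixed hyperparameters are stored here; nothing is sampled yet.
        self.mean = mean
        self.variance = variance
        self._rng = random.Random(seed)

    def __call__(self, pixels: List[float]) -> List[float]:
        # Sample the noise strength on every call, so each image gets its
        # own value instead of one value frozen at construction time.
        var = self._rng.uniform(1.0, self.variance)
        std = var ** 0.5
        noisy = [p + self._rng.gauss(self.mean, std) for p in pixels]
        # Clip to the valid 8-bit intensity range.
        return [min(255.0, max(0.0, v)) for v in noisy]


if __name__ == "__main__":
    aug = GaussianNoiseSketch(seed=0)
    image = [128.0] * 8
    print(aug(image))
    print(aug(image))  # a second call draws fresh random values
```

If the sampling were moved into `__init__`, every image processed by the same transform instance would receive the same noise strength, which reduces the diversity the augmentation is meant to provide.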

@zhtmike zhtmike merged commit 7d20699 into mindspore-lab:main Jul 12, 2023
@zhtmike zhtmike deleted the svtr_update branch July 19, 2023 02:31