add dbnet yaml for synthtext dataset and td500 dataset #257
configs/det/dbnet/README.md
| **Model** | **Context** | **Backbone** | **Pretrained** | **Recall** | **Precision** | **F-score** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|-----------|-------------|--------------|----------------|------------|---------------|-------------|--------------|----------------|------------|--------------|
| DBNet (ours) | D910x1-MS2.0-G | ResNet-50 | SynthText | 82.47% | 87.75% | 85.03% | 24.4 s/epoch | 27.9 img/s | [yaml](db_r50_td500.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50_td500-0d12b5e8.ckpt) |
Why is the throughput for TD500 so different from that for SynthText (27.9 vs. 82.02 img/s), given that the architecture and data processing pipeline should be the same?
The batch_size is different. After adjusting num_workers, the throughput for TD500 is now 51.1 img/s.
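To illustrate why batch size alone can change the reported img/s, here is a minimal sketch. It is not MindOCR's measurement code: the helper name `throughput_img_per_s` and the step times are hypothetical, chosen only to show how a larger batch can amortize per-step overhead.

```python
def throughput_img_per_s(batch_size: int, step_time_s: float, num_devices: int = 1) -> float:
    """Images processed per second for one training step (hypothetical helper)."""
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    return batch_size * num_devices / step_time_s


# Hypothetical step times, not measurements from this PR:
# quadrupling the batch only doubles the step time, so throughput doubles.
small_batch = throughput_img_per_s(batch_size=8, step_time_s=0.4)
large_batch = throughput_img_per_s(batch_size=32, step_time_s=0.8)
print(f"small batch: {small_batch:.1f} img/s, large batch: {large_batch:.1f} img/s")
```

The same reasoning applies to num_workers: if data loading is the bottleneck, more workers shorten the step time and raise throughput without any change to the model.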
The model README needs to add data preparation instructions for SynthText and TD500.
fixed
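The weight decay here is 5e-4, which does not match the 1e-4 used in the original paper. What is the reason? (For example, does the adjusted value give a lower final loss?)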
Also, net_columns_to_net is an old parameter name; the config must be aligned with the latest ones by setting net_input_column_index and label_column_index.
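The rename requested above could be applied with a small migration helper like the sketch below. It rests on an assumption not confirmed in this thread: that the old value N meant "the first N output columns feed the network and the remaining columns are labels". The helper name `migrate_loader_columns`, the `output_columns` key, and the column names in the usage example are illustrative, not taken from this PR.

```python
from typing import Any, Dict


def migrate_loader_columns(cfg: Dict[str, Any],
                           old_key: str = "net_columns_to_net") -> Dict[str, Any]:
    """Return a copy of a dataset config with the old count replaced by index lists.

    Assumption: the old value N meant that the first N output columns are
    network inputs and all remaining output columns are labels.
    """
    new_cfg = dict(cfg)  # shallow copy so the caller's dict is not modified
    if old_key not in new_cfg:
        return new_cfg
    n = new_cfg.pop(old_key)
    num_cols = len(new_cfg["output_columns"])
    new_cfg["net_input_column_index"] = list(range(n))
    new_cfg["label_column_index"] = list(range(n, num_cols))
    return new_cfg


# Hypothetical dataset section with one network input and four label columns
old_cfg = {
    "output_columns": ["image", "binary_map", "mask", "thresh_map", "thresh_mask"],
    "net_columns_to_net": 1,
}
new_cfg = migrate_loader_columns(old_cfg)
print(new_cfg["net_input_column_index"], new_cfg["label_column_index"])
```

Writing the indices explicitly, rather than a count, also allows inputs and labels that are not contiguous in the output columns.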
The original DBNet paper doesn't share details on pretraining the network with SynthText; it only mentions this:
> For all the models, we first pre-train them with the SynthText dataset for 100k iterations.

Nor does the original repo have a config for SynthText. The "Synthetic Data for Text Localisation in Natural Images" paper lists pretraining hyperparameters, including a weight decay of 5e-4 (actually written as 5-4, which I find odd); however, they used these hyperparameters to pretrain FCRN. So I guess it is up to us to find the best combination of hyperparameters for pretraining DBNet. That said, we must be careful not to overfit the model, since there is no validation set for SynthText.
configs/det/dbnet/README.md
| **Model** | **Context** | **Backbone** | **Pretrained** | **Train T.** | **Throughput** | **Recipe** | **Download** |
|-----------|-------------|--------------|----------------|--------------|----------------|------------|--------------|
| DBNet (ours) | D910x1-MS2.0-G | ResNet-50 | ImageNet | 10470 s/epoch | 82.02 img/s | [yaml](db_r50_synthtext.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50_synthtext-40655acb.ckpt) |
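A quick consistency check between the Train T. and Throughput columns can be sketched as below. It assumes the commonly cited SynthText size of 858,750 images, which is not stated in this PR, and that throughput is simply images per epoch divided by epoch time; under those assumptions the result comes out close to the 82.02 img/s reported in the table.

```python
# Assumption: commonly cited SynthText size; not stated anywhere in this PR.
SYNTHTEXT_NUM_IMAGES = 858_750
# "Train T." value for SynthText from the table above, in seconds per epoch.
EPOCH_TIME_S = 10470


def epoch_throughput(num_images: int, epoch_time_s: float) -> float:
    """Average images per second over one full epoch."""
    return num_images / epoch_time_s


synthtext_throughput = epoch_throughput(SYNTHTEXT_NUM_IMAGES, EPOCH_TIME_S)
print(f"{synthtext_throughput:.2f} img/s")
```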
distribute is set to True in the yaml, but D910x1-MS2.0-G here indicates a single card. Please double-check standalone/distributed mode and the number of cards used.
fixed
Please add the training loss results for SynthText.
fixed
cfa13c5 to fdb3366
Motivation
Add DBNet YAML configs for the SynthText and TD500 datasets.
Related Issues and PRs
Related issue: #222