
Add pretrained ckpt loading (via local path or url) to support finetuning #252


Merged (1 commit) on May 4, 2023

Conversation

@SamitHuang (Collaborator) commented on May 4, 2023


Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure the sections below are filled in:

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

@SamitHuang changed the title from "Add pretrained ckpt loading (via local path or url) to support finetu…" to "Add pretrained ckpt loading (via local path or url) to support finetuning" on May 4, 2023
```diff
@@ -27,6 +27,7 @@ model:
   head:
     name: CTCHead
     out_channels: *num_classes
+    pretrained_ckpt: https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt  # support local path and url
```
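For context, a minimal sketch of how such a pretrained_ckpt value might be resolved and loaded with MindSpore's load_checkpoint / load_param_into_net APIs. The helper name and cache location here are assumptions for illustration, not necessarily what this PR implements:

```python
import os
import urllib.request

import mindspore as ms


def load_pretrained_ckpt(network, pretrained_ckpt: str) -> None:
    """Load weights into `network` from a local checkpoint path or a URL (illustrative)."""
    if pretrained_ckpt.startswith(("http://", "https://")):
        # Download once into a local cache; load_checkpoint reads local files only.
        cache_dir = os.path.expanduser("~/.mindspore/ckpt")
        os.makedirs(cache_dir, exist_ok=True)
        local_path = os.path.join(cache_dir, os.path.basename(pretrained_ckpt))
        if not os.path.exists(local_path):
            urllib.request.urlretrieve(pretrained_ckpt, local_path)
    else:
        local_path = pretrained_ckpt

    param_dict = ms.load_checkpoint(local_path)   # ckpt file -> {name: Parameter}
    ms.load_param_into_net(network, param_dict)   # copy matching params into the net
```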
@hadipash (Collaborator) commented on May 4, 2023

I think the pretrained parameter should be versatile, i.e. able to accept a bool, a local path, a URL, etc. Right now we have many different parameters that don't match each other and look confusing, for example: pretrained, ckpt_path, checkpoint_path, and now pretrained_ckpt.

I think we should stick to a single pretrained parameter, because in the end it doesn't matter whether the weights come from ImageNet, SynthText, or a local file; the purpose remains the same.

@SamitHuang (Collaborator, Author) replied

I see. But we cannot map a customized architecture defined in the yaml file to a pretrained model if we use pretrained and accept only a bool input.
Indeed, having pretrained (for the backbone only), pretrained_ckpt, and ckpt_load_path (for eval only) in one yaml file can be confusing. Would renaming pretrained_ckpt to initialize_from help alleviate this?

@hadipash (Collaborator) replied on May 4, 2023

For the custom architecture scenario, when a user sets pretrained to True, we can issue a warning saying that a pretrained model was not found and asking the user to specify a path to the pretrained model instead, then continue model initialization without loading pretrained weights.

Moreover, this can be handy when we want to load custom weights into, for example, the backbone: we can reuse the pretrained parameter to specify a path to the weights instead of setting it to True.
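To make the proposal concrete, here is a rough sketch of a single versatile pretrained parameter with the warning fallback described above. It builds on the load_pretrained_ckpt helper sketched earlier; the DEFAULT_CKPT_URLS registry and init_weights name are illustrative assumptions, not MindOCR's actual API:

```python
import logging
from typing import Union

logger = logging.getLogger(__name__)

# Illustrative registry mapping known architectures to released checkpoints.
DEFAULT_CKPT_URLS = {
    "crnn_vgg7": "https://download.mindspore.cn/toolkits/mindocr/crnn/crnn_vgg7-ea7e996c.ckpt",
}


def init_weights(network, arch_name: str, pretrained: Union[bool, str] = False) -> None:
    """Handle one versatile `pretrained` value: False, True, a local path, or a URL."""
    if pretrained is False:
        return  # train from scratch
    if pretrained is True:
        if arch_name not in DEFAULT_CKPT_URLS:
            # Custom architecture: warn and continue initialization without weights.
            logger.warning(
                "No default checkpoint found for '%s'; set `pretrained` to a local "
                "path or URL to load weights for a custom architecture.", arch_name
            )
            return
        pretrained = DEFAULT_CKPT_URLS[arch_name]
    # `pretrained` is now a str (path or URL); reuse the loading helper sketched above.
    load_pretrained_ckpt(network, pretrained)
```

With this dispatch, pretrained: True picks a registered default, while pretrained: path/to/weights.ckpt (or a URL) covers the custom-weights case without any extra parameter.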
