GENet accuracy reproduction issue #3

Closed
pawopawo opened this issue Aug 9, 2020 · 5 comments
Comments

@pawopawo

pawopawo commented Aug 9, 2020

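I came across your GENet and found it very interesting. I would like to reproduce the results in the paper, but the training details given there are not very clear. With batch size 1024, lr 0.5, weight decay 1e-4, 360 epochs, 5 warm-up epochs, cosine learning-rate decay, and no dropout, the GENet-normal architecture only reached an accuracy of 76.1.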

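Could you share the training strategy for the GENet-normal architecture, such as lr, batch size, weight decay, dropout rate, number of epochs, the learning-rate decay schedule, and whether warm-up was used? I look forward to your help.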

@MingLin-home
Collaborator

We will update our draft this week to include more detailed training parameters. We use cosine lr decay with 5 warm-up epochs, weight decay 4e-5, lr 0.1, and batch size 256.
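
For readers who want to try these settings, here is a minimal sketch of a warm-up plus cosine schedule using the values quoted above (lr 0.1, 5 warm-up epochs). The function name `learning_rate`, the linear shape of the warm-up, the decay to zero, and `total_epochs=360` (taken from the question, not confirmed by the maintainers) are all assumptions for illustration, not the repository's actual code.

```python
import math


def learning_rate(epoch, base_lr=0.1, warmup_epochs=5, total_epochs=360):
    """Return the learning rate at a (possibly fractional) epoch.

    Assumptions (not confirmed in this thread): the warm-up is linear
    from 0 to base_lr, the cosine decay ends at 0, and total_epochs=360
    is borrowed from the original question.
    """
    if epoch < warmup_epochs:
        # Linear warm-up: ramp from 0 up to base_lr.
        return base_lr * epoch / warmup_epochs
    # Cosine decay over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    progress = min(progress, 1.0)  # clamp if training runs past total_epochs
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 1, 5, 60, 180, 359):
        print(f"epoch {e:3d}: lr = {learning_rate(e):.5f}")
```

The weight decay (4e-5) and batch size (256) are optimizer and data-loader settings and are therefore not part of the schedule itself.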

@pawopawo
Author

pawopawo commented Sep 3, 2020

How much improvement did distillation bring to the results in the paper?

@MingLin-home
Collaborator

The main purpose of the teacher network is to help the student network escape bad local minima. There is about a 1% accuracy drop without the help of the teacher network in the early training stages.

@pawopawo
Author

pawopawo commented Sep 6, 2020

> The main purpose of the teacher network is to help the student network escape bad local minima. There is about a 1% accuracy drop without the help of the teacher network in the early training stages.

So distillation has no effect on the final accuracy, and only makes convergence faster?

@MingLin-home
Collaborator

> The main purpose of the teacher network is to help the student network escape bad local minima. There is about a 1% accuracy drop without the help of the teacher network in the early training stages.

> So distillation has no effect on the final accuracy, and only makes convergence faster?

Without a teacher network, training quickly gets stuck at around 60 epochs. With a teacher network, the accuracy keeps increasing as you train longer. It seems that which teacher network you use is not important, which is weird to us too.
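
The thread does not say which distillation loss was used. As a rough illustration of the teacher/student idea only, the sketch below implements the common temperature-softened knowledge-distillation loss (hard-label cross-entropy plus a KL term against the teacher). The function names and the values `alpha=0.5` and `temperature=4.0` are hypothetical and not taken from the GENet paper or repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=4.0):
    """Standard KD loss for one sample (assumed form, not the authors').

    (1 - alpha) * CE(student, label)
      + alpha * T^2 * KL(teacher_T || student_T)
    """
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label])
    # KL divergence between temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return (1.0 - alpha) * ce + alpha * temperature ** 2 * kl


if __name__ == "__main__":
    student = [1.0, 2.0, 0.5]
    teacher = [0.5, 3.0, 0.2]
    print(distillation_loss(student, teacher, label=1))
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted cross-entropy remains, so the teacher contributes no extra gradient at that point.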

MingLin-home added a commit that referenced this issue Oct 7, 2020
merge from idstcv to minglin-home
MingLin-home added a commit that referenced this issue Oct 7, 2020
Merge pull request #3 from idstcv/master