
self-distillation lower mAP #921

Open
4 tasks done
sisgrad opened this issue Oct 8, 2023 · 1 comment
Labels
question Further information is requested

Comments


sisgrad commented Oct 8, 2023

Before Asking

  • I have read the README carefully.

  • I want to train my custom dataset, and I have read the tutorials for training custom data carefully and organized my dataset correctly. (FYI: We recommend applying the xx_finetune.py config files when training custom datasets.)

  • I have pulled the latest code of the main branch and run it again, and the problem still exists.

Search before asking

  • I have searched the YOLOv6 issues and found no similar questions.

Question

Hello. I'm fine-tuning yolov6m on my custom dataset and decided to apply self-distillation on top of the trained checkpoint, but so far I can't get any benefit from it:

  • mAP:50 of the COCO fine-tuned checkpoint is 68%
  • mAP:50 of the best self-distillation experiment on top of this checkpoint is 65%

The gap in mAP:50:95 is about the same.

My experiments were:

  1. Used either only --distill or both --distill and --distill_feat -> results with both options are slightly worse
  2. Tried modifying the distillation temperature -> no significant changes
  3. Tried modifying lr0, lrf, weight_decay -> no significant changes
  4. Tried switching from SGD to Adam -> much larger losses and poor optimization
  5. Disabled some hard augmentations -> mAP becomes even worse
  6. Tried different pretrained= options (None / coco_checkpoint / finetuned_checkpoint) -> best results with the fine-tuned checkpoint (btw, the tutorial for training custom data doesn't make clear which option to use for distillation); a rough sketch of the launch command is below
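
For context, the distillation runs were launched with a command roughly like this sketch; the dataset/checkpoint paths and run name are placeholders, and the exact flag spellings (apart from --distill / --distill_feat) should be double-checked against tools/train.py:

```shell
# Rough sketch of a self-distillation fine-tuning run (single GPU).
# data/custom.yaml, the teacher checkpoint path and the run name are placeholders.
python tools/train.py \
    --batch 32 \
    --conf configs/yolov6m_finetune.py \
    --data data/custom.yaml \
    --epochs 300 \
    --device 0 \
    --distill \
    --teacher_model_path runs/train/exp/weights/best_ckpt.pt \
    --name yolov6m_self_distill
# adding --distill_feat additionally distills intermediate features
```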

Maybe I'm doing something wrong, or could it be that my custom dataset is simply "hard to distill"?
I can't find any existing issues here related to the distillation process.

Additional

No response

@sisgrad sisgrad added the question Further information is requested label Oct 8, 2023
Collaborator

Chilicyy commented Oct 9, 2023

Hi @sisgrad, in our experiments we use only --distill and take the coco_checkpoint as the pretrained model. The performance of self-distillation may vary across tasks, and you could also try increasing the number of training epochs to improve the metrics. Hope it helps.
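
For illustration, using the COCO checkpoint as the pretrained model means pointing the pretrained field of the finetune config at that weight file, roughly like the excerpt below (the paths are placeholders and the surrounding fields follow the shipped config files from memory):

```python
# configs/yolov6m_finetune.py (excerpt) -- paths are placeholders
model = dict(
    type='YOLOv6m',
    pretrained='./weights/yolov6m.pt',  # COCO checkpoint used to initialize the student
    # ... remaining model / solver settings left unchanged
)
```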
