Add simclr imagenet resnet50 benchmark #1162
Merged
guarin merged 72 commits into master from guarin-lig-3049-add-simclr-imagenet-resnet50-benchmark on May 10, 2023
Conversation
* Add benchmarking test directory
* Add unit tests for knn.py
* Improve docstring for `mean_topk_accuracy`
* Add tests for `mean_topk_accuracy`
* Add tests for OnlineLinearClassifier
* Add `MetricCallback` tests
* Refactor `MetricCallback` to keep track of metrics after train and validation epochs separately
* Add tests for KNNClassifier
* Add tests for `LinearClassifier`
* Pass model parameters to optimizer if model is not frozen
* Encode all views at once, as in the original paper. This should slightly improve training because the model cannot use batch norm statistics to differentiate between embeddings from view 1 and view 2; the effect was not benchmarked, so the size of the improvement is unknown (see the sketch after this list).
* Use square root learning rate scaling, as recommended by the paper for smaller batch sizes.
* Use 0.9 momentum. This was previously missing and is why the results are now much better.
* Add minimal `README.md` to benchmarks
* Add Imagenet ResNet50 SimCLR results to readme and docs
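A minimal sketch of the "encode all views at once" change, with tiny placeholder modules standing in for the ResNet50 backbone and SimCLR projection head (names and sizes here are illustrative assumptions, not the benchmark code):

```python
import torch
from torch import nn

# Placeholders for the ResNet50 backbone and SimCLR projection head.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.AdaptiveAvgPool2d(1)
)
projection_head = nn.Linear(8, 4)

def encode_views(view_0: torch.Tensor, view_1: torch.Tensor):
    # Encode both views in a single forward pass: batch norm then sees the
    # combined batch and cannot learn statistics that separate the views.
    views = torch.cat([view_0, view_1], dim=0)
    z = projection_head(backbone(views).flatten(start_dim=1))
    return z.chunk(2, dim=0)  # embeddings of view 0 and view 1

z0, z1 = encode_views(torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32))
```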
philippmwirth approved these changes on May 9, 2023
Changes
Adds the `benchmarks` directory.
Closes #569
How was it tested?
Runs
Initial run: https://tensorboard.dev/experiment/VD0GPdaoSLa5uonTp6k6dQ/#scalars

There seems to be an issue with the linear eval validation logic: top1 accuracy collapses after a couple of epochs.
This doesn't happen for train top1, but we reach at most 51%. I would expect 64% val top1 (https://github.com/facebookresearch/vissl/blob/main/MODEL_ZOO.md#simclr).
Issues:
Will check accuracy with a pretrained supervised model to see if something is off with our validation logic; would expect 76% val top1 (https://github.com/facebookresearch/vissl/blob/main/MODEL_ZOO.md#supervised)
--> Got 73.1% knn val top1 and 76.3% linear eval top1 (tensorboard).
Verified that ntx_ent_loss is correct for non-distributed training (a reference check is sketched below)
Will check if the results differ between distributed and non-distributed training to verify that loss/backprop/sync work as expected
Edit: Issues fixed in #1175
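For reference, a minimal sanity check of the NT-Xent loss in the non-distributed case: a naive implementation built from a full similarity matrix and cross-entropy, compared against lightly's `NTXentLoss`. This is a simplified sketch for illustration, not the test code from this PR:

```python
import torch
import torch.nn.functional as F
from lightly.loss import NTXentLoss

def ntxent_reference(z0, z1, temperature=0.5):
    # Naive NT-Xent: each embedding's positive is the matching view of the
    # same image; the other 2N - 2 embeddings in the batch are negatives.
    z = F.normalize(torch.cat([z0, z1], dim=0), dim=1)
    n = z0.shape[0]
    sim = z @ z.t() / temperature
    sim.fill_diagonal_(float("-inf"))  # exclude self-similarity
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])  # i <-> i + n
    return F.cross_entropy(sim, targets)

# Compare on random embeddings; both losses should agree closely.
z0, z1 = torch.randn(8, 128), torch.randn(8, 128)
print(ntxent_reference(z0, z1).item())
print(NTXentLoss(temperature=0.5)(z0, z1).item())
```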
Final Results
Ran the benchmark again. We now get 63.2% linear eval top1 accuracy, compared to the 62.8% the paper reports for the same settings. Full results:
Training ran for 100 epochs with total batch size 256 (= 2 × 128). This corresponds to the first entry in the results table in the README and docs.
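A sketch of the corresponding training setup, assuming lightly's SimCLR building blocks (the temperature and weight decay values below are assumptions; the exact benchmark script may differ):

```python
import math

import torch
import torchvision
from lightly.loss import NTXentLoss
from lightly.models.modules import SimCLRProjectionHead

# Settings from the final run: 100 epochs, total batch size 256 (2 x 128).
epochs = 100
batch_size = 256

# ResNet50 trunk without the classification head, plus the SimCLR head.
resnet = torchvision.models.resnet50()
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
projection_head = SimCLRProjectionHead(2048, 2048, 128)
criterion = NTXentLoss(temperature=0.1)  # assumption: paper's temperature

# Square root learning rate scaling with SGD momentum 0.9, as in this PR.
lr = 0.075 * math.sqrt(batch_size)  # 0.075 is the paper's sqrt-scaling base
optimizer = torch.optim.SGD(
    list(backbone.parameters()) + list(projection_head.parameters()),
    lr=lr,
    momentum=0.9,
    weight_decay=1e-6,  # assumption: common SimCLR weight decay
)
```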
