Unofficial EfficientNetV2 PyTorch implementation repository.
It contains:
- Simple implementation of the model (here)
- Pretrained model (NumPy weights converted from the official TensorFlow checkpoints)
- Training code (here)
- Tutorials (Colab EfficientNetV2-predict tutorial, Colab EfficientNetV2-finetuning tutorial)
- Experiment results
- Experiment Setup
- References
Colab Tutorial
- How to use the model on Colab? Please check the Colab EfficientNetV2-predict tutorial.
- How to train the model on Colab? Please check the Colab EfficientNetV2-finetuning tutorial.
- See how CutMix, Cutout, and MixUp work in the Colab Data augmentation tutorial.
If you just want to use the pretrained model, load it with torch.hub.load:
import torch
model = torch.hub.load('hankyul2/EfficientNetV2-pytorch', 'efficientnet_v2_s', pretrained=True, nclass=1000)
print(model)
Available model names: efficientnet_v2_{s|m|l} (ImageNet), efficientnet_v2_{s|m|l}_in21k (ImageNet21k)
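For a quick sanity check after loading, a minimal inference sketch is below. The image path, input resolution (384), and ImageNet mean/std normalization are assumptions; see the Colab EfficientNetV2-predict tutorial for the exact preprocessing.

```python
import torch
from PIL import Image
from torchvision import transforms

model = torch.hub.load('hankyul2/EfficientNetV2-pytorch', 'efficientnet_v2_s',
                       pretrained=True, nclass=1000)
model.eval()

# Assumed preprocessing: 384px center crop + ImageNet mean/std normalization.
preprocess = transforms.Compose([
    transforms.Resize(384),
    transforms.CenterCrop(384),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open('example.jpg').convert('RGB')   # hypothetical input image
x = preprocess(img).unsqueeze(0)                 # (1, 3, 384, 384)

with torch.no_grad():
    probs = model(x).softmax(dim=-1)
print(probs.argmax(dim=-1).item())               # predicted ImageNet class index
```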
If you want to finetune on CIFAR, use this repository.
- Clone this repo and install dependencies:
  git clone https://github.com/hankyul2/EfficientNetV2-pytorch.git
  pip3 install -r requirements.txt
- Train & test the model (see more examples in tmuxp/cifar.yaml):
  python3 main.py fit --config config/efficientnetv2_s/cifar10.yaml --trainer.gpus 2,3,
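If you only need a pretrained backbone with a CIFAR-sized classifier outside of main.py, the same torch.hub entry point can be reused. This is a sketch that assumes nclass sets the output dimension, as in the ImageNet example above; the repo's actual finetuning path is main.py plus the yaml configs.

```python
import torch

# Load ImageNet21k-pretrained EfficientNetV2-S with a 10-class head for CIFAR-10.
model = torch.hub.load('hankyul2/EfficientNetV2-pytorch', 'efficientnet_v2_s_in21k',
                       pretrained=True, nclass=10)
```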
| Model Name | Pretrained Dataset | Cifar10 | Cifar100 |
|---|---|---|---|
| EfficientNetV2-S | ImageNet | 98.46 (tf.dev, weight) | 90.05 (tf.dev, weight) |
| EfficientNetV2-M | ImageNet | 98.89 (tf.dev, weight) | 91.54 (tf.dev, weight) |
| EfficientNetV2-L | ImageNet | 98.80 (tf.dev, weight) | 91.88 (tf.dev, weight) |
| EfficientNetV2-S-in21k | ImageNet21k | 98.50 (tf.dev, weight) | 90.96 (tf.dev, weight) |
| EfficientNetV2-M-in21k | ImageNet21k | 98.70 (tf.dev, weight) | 92.06 (tf.dev, weight) |
| EfficientNetV2-L-in21k | ImageNet21k | 98.78 (tf.dev, weight) | 92.08 (tf.dev, weight) |
| EfficientNetV2-XL-in21k | ImageNet21k | - | - |
Note
- The results above use the following combination (a minimal PyTorch sketch of this recipe follows these notes):
  - Half precision
  - Super convergence (epoch=20)
  - AdamW (weight_decay=0.005)
  - EMA (decay=0.999)
  - CutMix (prob=1.0)
- Changes from the original paper (CIFAR):
  - We only ran 20 epochs to get the results above; running more epochs should give higher accuracy.
  - What we changed from the original setup: optimizer (SGD to AdamW), LR scheduler (cosine LR to one-cycle LR), augmentation (Cutout to CutMix), image size (384 to 224), epochs (105 to 20).
  - Important hyper-parameters (most important to least important): LR -> weight_decay -> ema_decay -> cutmix_prob -> epoch.
- You can get the same results by running tmuxp/cifar.yaml.
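For reference, here is a minimal PyTorch sketch of the AdamW + one-cycle (super-convergence) schedule with a simple EMA of the weights, using the hyper-parameters listed above. The tiny stand-in model, steps_per_epoch, dummy batches, and the label-smoothing value are assumptions for illustration only; the real configuration lives in config/efficientnetv2_s/cifar10.yaml.

```python
import torch
from torch import nn, optim

# Stand-in model; in the repo this would be the EfficientNetV2 backbone.
model = nn.Linear(32, 10)

epochs, steps_per_epoch = 20, 100                      # steps_per_epoch = len(train_loader) in practice
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)   # smoothing value is an assumption
optimizer = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.005)
scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=epochs, steps_per_epoch=steps_per_epoch)

# Minimal EMA of the weights (decay=0.999): keep a shadow copy, update after every step.
ema = {k: v.detach().clone() for k, v in model.state_dict().items()}

def ema_update(model, ema, decay=0.999):
    with torch.no_grad():
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                ema[k].mul_(decay).add_(v, alpha=1 - decay)
            else:
                ema[k].copy_(v)

for epoch in range(epochs):
    for _ in range(steps_per_epoch):
        x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))  # dummy batch
        loss = criterion(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()            # OneCycleLR is stepped per batch
        ema_update(model, ema)
```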
Cifar setup

| Category | Contents |
|---|---|
| Dataset | CIFAR10 \| CIFAR100 |
| Batch size per GPU | (s, m, l) = (256, 128, 64) |
| Train augmentation | image_size = 224, horizontal flip, random_crop (pad=4), CutMix (prob=1.0) |
| Test augmentation | image_size = 224, center_crop |
| Model | EfficientNetV2 s \| m \| l (pretrained on in1k or in21k) |
| Regularization | Dropout=0.0, Stochastic_path=0.2, BatchNorm |
| Optimizer | AdamW (weight_decay=0.005) |
| Criterion | Label smoothing (CrossEntropyLoss) |
| LR scheduler | LR: (s, m, l) = (0.001, 0.0005, 0.0003), scheduler: OneCycle Learning Rate (epoch=20) |
| GPUs & etc. | 16-bit precision, EMA (decay = 0.999, 0.9993, 0.9995), S: 2 x 3090 (batch size 512), M: 2 x 3090 (batch size 256), L: 2 x 3090 (batch size 128) |
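CutMix (prob=1.0) is the main train-time augmentation in this setup. Below is a minimal, self-contained CutMix sketch for readers who want to see the mechanics; the function name, signature, and defaults are assumptions, not the repo's implementation (see the Colab Data augmentation tutorial for that).

```python
import torch

def cutmix(images, labels, prob=1.0, alpha=1.0):
    """Paste a random rectangle from a shuffled batch and mix labels by area.

    Hypothetical helper for illustration; the repo's own implementation may differ.
    Returns (mixed images, labels_a, labels_b, lam) for a mixed loss.
    """
    if torch.rand(1).item() > prob:
        return images, labels, labels, 1.0
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(images.size(0))
    h, w = images.shape[-2:]
    cut_h, cut_w = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy, cx = torch.randint(h, (1,)).item(), torch.randint(w, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    images = images.clone()
    images[:, :, y1:y2, x1:x2] = images[idx, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (h * w)   # recompute from the actual pasted area
    return images, labels, labels[idx], lam

# Usage with label mixing in the loss:
# x, y_a, y_b, lam = cutmix(x, y)
# out = model(x)
# loss = lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)
```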
EfficientNetV2
- Title: EfficientNetV2: Smaller Models and Faster Training
- Author: Mingxing Tan, Quoc V. Le
- Publication: ICML, 2021
- Link: Paper | official tensorflow repo | other pytorch repo
- Other references: