Question about retraining shufflenetv2 #3090

Open
WZMIAOMIAO opened this issue Dec 2, 2020 · 1 comment

@WZMIAOMIAO (Contributor) commented Dec 2, 2020

First of all, thank you for this excellent project.

Environments

python: 3.7
pytorch: 1.7+cpu
torchvision: 0.8.1+cpu
system-os: ubuntu18.04

Hyperparameters

lr: 0.001
momentum: 0.9
weight_decay: 0.0001
batch_size: 16

Question introduction

Recently, I have been studying the shufflenetv2 source code you provide in torchvision.
But when I fine-tune the network (only training the fc layer; my setup is sketched after the numbers below), convergence is very slow, like this:

[epoch 0] accuracy: 0.246
[epoch 1] accuracy: 0.253
[epoch 2] accuracy: 0.28
[epoch 3] accuracy: 0.305
[epoch 4] accuracy: 0.338
[epoch 5] accuracy: 0.353
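
For reference, this is roughly how my fine-tuning run is set up (a minimal sketch; the real training script is linked at the bottom, and the number of classes here is just illustrative):

    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models

    # shufflenetv2 x1.0 with the pretrained torchvision weights
    model = models.shufflenet_v2_x1_0(pretrained=True)

    # replace the classifier head (5 classes is just an example)
    model.fc = nn.Linear(model.fc.in_features, 5)

    # freeze everything except the new fc layer
    for name, param in model.named_parameters():
        if not name.startswith("fc"):
            param.requires_grad = False

    # SGD with the hyperparameters listed above
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0001)
    loss_function = nn.CrossEntropyLoss()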

I have read this document: https://pytorch.org/docs/stable/torchvision/models.html#classification
Following it, I downloaded the weights from https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth and used the same preprocessing:

    from torchvision import transforms

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        "val": transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}
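
These transforms are applied through ImageFolder datasets and DataLoaders, roughly like this (a sketch; the directory paths are placeholders, see the linked script for the real code):

    from torch.utils.data import DataLoader
    from torchvision import datasets

    # datasets built with the transforms above ("data/train" and "data/val" are placeholder paths)
    train_set = datasets.ImageFolder("data/train", transform=data_transform["train"])
    val_set = datasets.ImageFolder("data/val", transform=data_transform["val"])

    # batch_size matches the hyperparameters listed above
    train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=16, shuffle=False)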

But with everything else unchanged, if I simply replace the model with the resnet34 provided in torchvision (only the backbone is swapped, as sketched below), I get great results, like this:

[epoch 0] accuracy: 0.968
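
For that comparison, the only change is the model construction, roughly:

    import torch.nn as nn
    from torchvision import models

    # same setup as above, only the backbone is swapped for resnet34
    model = models.resnet34(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, 5)  # 5 classes, illustrative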

Strangely, when fine-tuning shufflenetv2, if I change the learning rate from 0.001 to 0.1, I get the following results:

[epoch 0] accuracy: 0.85
[epoch 1] accuracy: 0.848
.....
[epoch 29] accuracy: 0.899

Does fine-tuning the shufflenet network really need such a large learning rate?

I don't think the preprocessing is the problem, because if I use the mobilenetv2 network under the same conditions, I get much better results. Could you help me find out what is wrong? Thank you very much.

Code

https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/blob/master/pytorch_classification/Test7_shufflenet/train.py

@WZMIAOMIAO (Contributor, Author) commented Jan 25, 2021

@vfdev-5 Could someone answer my question?
