Question about retraining shufflenetv2 #3090

Open
WZMIAOMIAO opened this issue Dec 2, 2020 · 1 comment

@WZMIAOMIAO (Contributor) commented Dec 2, 2020

First of all, thank you for this excellent project.

Environments

python: 3.7
pytorch: 1.7+cpu
torchvision: 0.8.1+cpu
system-os: ubuntu18.04

Hyperparameters

lr: 0.001
momentum: 0.9
weight_decay: 0.0001
batch_size: 16

Question introduction

Recently, I have been studying the shufflenetv2 source code you provide in torchvision.
But when I fine-tune the network (only training the fc layer; my setup is sketched after the numbers below), convergence is very slow, like this:

[epoch 0] accuracy: 0.246
[epoch 1] accuracy: 0.253
[epoch 2] accuracy: 0.28
[epoch 3] accuracy: 0.305
[epoch 4] accuracy: 0.338
[epoch 5] accuracy: 0.353
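
For reference, this is roughly how my fine-tuning run is set up (a minimal sketch; the real training script is linked at the bottom, and the number of classes here is just illustrative):

    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models

    # shufflenetv2 x1.0 with the pretrained torchvision weights
    model = models.shufflenet_v2_x1_0(pretrained=True)

    # replace the classifier head (5 classes is just an example)
    model.fc = nn.Linear(model.fc.in_features, 5)

    # freeze everything except the new fc layer
    for name, param in model.named_parameters():
        if not name.startswith("fc"):
            param.requires_grad = False

    # SGD with the hyperparameters listed above
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0001)
    loss_function = nn.CrossEntropyLoss()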

I have read this document: https://pytorch.org/docs/stable/torchvision/models.html#classification
Following it, I downloaded the weights from https://download.pytorch.org/models/shufflenetv2_x1-5666bf0f80.pth and used the same preprocessing:

    from torchvision import transforms

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        "val": transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}
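
These transforms are applied through ImageFolder datasets and DataLoaders, roughly like this (a sketch; the directory paths are placeholders, see the linked script for the real code):

    from torch.utils.data import DataLoader
    from torchvision import datasets

    # datasets built with the transforms above ("data/train" and "data/val" are placeholder paths)
    train_set = datasets.ImageFolder("data/train", transform=data_transform["train"])
    val_set = datasets.ImageFolder("data/val", transform=data_transform["val"])

    # batch_size matches the hyperparameters listed above
    train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=16, shuffle=False)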

But with everything else unchanged, if I simply replace the model with the resnet34 provided in torchvision (only the backbone is swapped, as sketched below), I get great results, like this:

[epoch 0] accuracy: 0.968
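
For that comparison, the only change is the model construction, roughly:

    import torch.nn as nn
    from torchvision import models

    # same setup as above, only the backbone is swapped for resnet34
    model = models.resnet34(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, 5)  # 5 classes, illustrative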

Strangely, when fine-tuning shufflenetv2, if I change the learning rate from 0.001 to 0.1, I get the following results:

[epoch 0] accuracy: 0.85
[epoch 1] accuracy: 0.848
.....
[epoch 29] accuracy: 0.899

Does fine-tuning the shufflenet network really need such a large learning rate?

I don't think the preprocessing is the problem, because if I use the mobilenetv2 network under the same conditions, I get much better results. Could you help me find out what is wrong? Thank you very much.

Code

https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/blob/master/pytorch_classification/Test7_shufflenet/train.py

@WZMIAOMIAO (Contributor, Author) commented Jan 25, 2021

@vfdev-5 Could someone answer my question?
