Merge branch 'main' into branch_6
Songyuanwei authored Dec 8, 2022
2 parents f937e6a + 1826034 commit 2855a55
Showing 97 changed files with 2,316 additions and 385 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -262,6 +262,7 @@ Please see [configs](./configs) for the details about model performance and pret
* Augmentation
* [AutoAugment](https://arxiv.org/abs/1805.09501)
* [RandAugment](https://arxiv.org/abs/1909.13719)
* [Repeated Augmentation](https://openaccess.thecvf.com/content_CVPR_2020/papers/Hoffer_Augment_Your_Batch_Improving_Generalization_Through_Instance_Repetition_CVPR_2020_paper.pdf)
* RandErasing (Cutout)
* CutMix
* Mixup
@@ -287,10 +288,20 @@ Please see [configs](./configs) for the details about model performance and pret
* Label Smoothing
* Stochastic Depth (depends on networks)
* Dropout (depends on networks)
* Loss
  * Cross Entropy (w/ class weight and auxiliary logit support)
  * Binary Cross Entropy (w/ class weight and auxiliary logit support)
</details>

## Notes
### What is New
- 2022/12/07
  1. Support lr warmup for all lr scheduling algorithms, not only cosine decay.
  2. Add repeated augmentation, which can be enabled by setting `--aug_repeats` to a value larger than 1 (3 is a common choice); see the example below.
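For example, enabling repeated augmentation together with warmup on a non-cosine scheduler could look roughly like this (an illustrative sketch, not a command from this commit; the model, dataset path, and scheduler values are placeholders):

```shell
# Sketch: repeat each sampled image 3 times per batch (repeated augmentation)
# and use a step_decay scheduler with the newly supported lr warmup.
python train.py --model=resnet50 --dataset=imagenet --data_dir /path/to/imagenet \
    --aug_repeats=3 --scheduler=step_decay --warmup_epochs=3
```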

- 2022/11/21
  1. Add visualization for loss and accuracy curves.
  2. Support epoch-wise lr warmup cosine decay (previously step-wise).
- 2022/11/09
1. Add 7 pretrained ViT models.
2. Add RandAugment augmentation.
15 changes: 11 additions & 4 deletions config.py
@@ -63,6 +63,8 @@ def create_parser():
group.add_argument('--drop_remainder', type=str2bool, nargs='?', const=True, default=True,
help='Determines whether or not to drop the last block whose data '
'row number is less than batch size (default=True)')
group.add_argument('--aug_repeats', type=int, default=0,
help='Number of dataset repetitions for repeated augmentation. If 0 or 1, repeated augmentation is disabled. Otherwise, repeated augmentation is enabled and 3 is a common choice. (default=0)')

# Augmentation parameters
group = parser.add_argument_group('Augmentation parameters')
@@ -157,25 +159,30 @@ def create_parser():
help='Enables the Nesterov momentum (default=False)')
group.add_argument('--filter_bias_and_bn', type=str2bool, nargs='?', const=True, default=True,
help='Filter Bias and BatchNorm (default=True)')
group.add_argument('--eps', type=float, default=1e-10,
help='Term added to the denominator to improve numerical stability (default=1e-10)')

# Scheduler parameters
group = parser.add_argument_group('Scheduler parameters')
group.add_argument('--scheduler', type=str, default='warmup_cosine_decay',
choices=['constant', 'warmup_cosine_decay', 'exponential_decay', 'step_decay', 'multi_step_decay'],
group.add_argument('--scheduler', type=str, default='cosine_decay',
choices=['constant', 'cosine_decay', 'exponential_decay', 'step_decay', 'multi_step_decay'],
help='Type of scheduler (default="cosine_decay")')
group.add_argument('--lr', type=float, default=0.001,
help='learning rate (default=0.001)')
group.add_argument('--min_lr', type=float, default=1e-6,
help='The minimum value of learning rate if scheduler supports (default=1e-6)')
group.add_argument('--warmup_epochs', type=int, default=3,
help='Warmup epochs (default=3)')
group.add_argument('--warmup_factor', type=float, default=0.0,
help='Warmup factor of learning rate (default=0.0)')
group.add_argument('--decay_epochs', type=int, default=100,
help='Decay epochs (default=100)')
group.add_argument('--decay_rate', type=float, default=0.9,
help='LR decay rate if scheduler supports')
group.add_argument('--multi_step_decay_milestones', type=list, default=[30, 60, 90],
help='list of epoch milestones for MultStepDecayLR, decay LR by decay_rate at the milestone epoch.')
group.add_argument('--stepwise_lr_sched', type=str2bool, nargs='?', const=True, default=True, help='If False, LR will be updated in the begin of each new epoch. Otherwise, update learning rate in each step. (default=False)')
help='list of epoch milestones for lr decay, which is ONLY effective for the multi_step_decay scheduler. LR will be decayed by decay_rate at each milestone epoch.')
group.add_argument('--lr_epoch_stair', type=str2bool, nargs='?', const=True, default=False,
help='If True, LR is updated at the first step of each epoch and then kept constant for the remaining steps of that epoch. Otherwise, LR is updated dynamically at every step. (default=False)')

# Loss parameters
group = parser.add_argument_group('Loss parameters')
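Taken together, the scheduler options added in config.py above could be driven from the command line roughly as follows (an illustrative sketch, not a command from this commit; the config file, scheduler, and hyper-parameter values are placeholders, and warmup_factor is assumed to be the starting fraction of the base lr):

```shell
# Sketch: step_decay scheduler with a 3-epoch warmup starting at 1% of the base lr,
# updating the lr once per epoch (lr_epoch_stair) instead of at every step.
mpirun -n 8 python train.py --config configs/densenet/densenet_121_gpu.yaml \
    --scheduler=step_decay --warmup_epochs=3 --warmup_factor=0.01 \
    --lr_epoch_stair=True
```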
2 changes: 1 addition & 1 deletion configs/convit/README.md
@@ -26,7 +26,7 @@ ConViT combines the strengths of convolutional architectures and Vision Transf
| GPU | convit_tiny_plus | | | | | | | | |
| Ascend | convit_tiny_plus | 77.00 | 93.60 | | | 247 | | | |
| GPU | convit_small | | | | | | | | |
| Ascend | convit_small | | | | | | | | |
| Ascend | convit_small | 81.63 | 95.59 | | | 490 | | | |
| GPU | convit_small_plus | | | | | | | | |
| Ascend | convit_small_plus | | | | | | | | |
| GPU | convit_base | | | | | | | | |
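The newly reported convit_small numbers above could be checked with `validate.py` and the Ascend config added in this commit (a sketch; the dataset and checkpoint paths are placeholders):

```shell
# Sketch: evaluate a trained convit_small checkpoint on the ImageNet validation set.
python validate.py -c configs/convit/convit_small_ascend.yaml \
    --data_dir /path/to/imagenet --ckpt_path /path/to/convit_small.ckpt
```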
70 changes: 70 additions & 0 deletions configs/convit/convit_small_ascend.yaml
@@ -0,0 +1,70 @@
# Copyright 2022 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================

# system config
mode: 0
distribute: True
num_parallel_workers: 8

# dataset config
dataset: 'imagenet'
data_dir: ''
shuffle: True
dataset_download: False
batch_size: 192
drop_remainder: True

# Augmentation config
image_resize: 224
scale: [0.08, 1.0]
ratio: [0.75, 1.333]
hflip: 0.5
interpolation: 'bicubic'
auto_augment: 'autoaug-mstd0.5'
re_prob: 0.1
mixup: 0.2
cutmix: 1.0
cutmix_prob: 1.0
crop_pct: 0.915
color_jitter: 0.4

# model config
model: 'convit_small'
num_classes: 1000
pretrained: False
ckpt_path: ''
keep_checkpoint_max: 10
ckpt_save_dir: './ckpt'
epoch_size: 300
dataset_sink_mode: True
amp_level: 'O2'

# loss config
loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
lr: 0.0007
min_lr: 0.000001
warmup_epochs: 40
decay_epochs: 260

# optimizer config
opt: 'adamw'
weight_decay: 0.05
loss_scale: 1024
filter_bias_and_bn: True
use_nesterov: False
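Following the launch pattern used in the other READMEs touched by this commit, this config would typically be run as follows (a sketch; the dataset path is a placeholder):

```shell
# Sketch: distributed training of convit_small on 8 Ascend devices with the config above.
mpirun -n 8 python train.py -c configs/convit/convit_small_ascend.yaml --data_dir /path/to/imagenet
```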
2 changes: 1 addition & 1 deletion configs/convit/convit_tiny_ascend.yaml
@@ -55,7 +55,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
lr: 0.00072
min_lr: 0.000001
warmup_epochs: 5
2 changes: 1 addition & 1 deletion configs/convit/convit_tiny_gpu.yaml
@@ -53,7 +53,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
lr: 0.0005
min_lr: 0.00001
warmup_epochs: 10
2 changes: 1 addition & 1 deletion configs/convit/convit_tiny_plus_ascend.yaml
@@ -55,7 +55,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
lr: 0.00072
min_lr: 0.000001
warmup_epochs: 40
20 changes: 3 additions & 17 deletions configs/densenet/README.md
@@ -46,34 +46,20 @@ Please download the [ImageNet-1K](https://www.image-net.org/download.php) datase

```shell
# train densenet121 on 8 GPUs
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
mpirun -n 8 python train.py -c configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet
mpirun -n 8 python train.py --config configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet
```

Note that the number of GPUs/Ascends and batch size will influence the training results. To reproduce the training result at most, it is recommended to use the **same number of GPUs/Ascneds** with the same batch size.

- **Finetuning.** Here is an example for finetuning a pretrained densenet121 on CIFAR10 dataset using Momentum optimizer.

```shell
python train.py --model=densenet121 --pretrained --opt=momentum --lr=0.001 dataset=cifar10 --num_classes=10 --dataset_download
```
Note that the number of GPUs/Ascends and batch size will influence the training results. To closely reproduce the reported results, it is recommended to use the **same number of GPUs/Ascends** with the same batch size.

Detailed adjustable parameters and their default values can be seen in [config.py](../../config.py).

### Validation

- To validate the trained model, you can use `validate.py`. Here is an example for densenet121 to verify the accuracy of
pretrained weights.

```shell
python validate.py --model=densenet121 --dataset=imagenet --val_split=val --pretrained
```

- To validate the model, you can use `validate.py`. Here is an example for densenet121 to verify the accuracy of your
training.

```shell
python validate.py --model=densenet121 --dataset=imagenet --val_split=val --ckpt_path='./ckpt/densenet121-best.ckpt'
python validate.py --config configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/densenet121.ckpt
```

### Deployment (optional)
17 changes: 2 additions & 15 deletions configs/densenet/README_CN.md
@@ -41,28 +41,15 @@
> The [configs folder](../../configs) lists the yaml configuration files for all variants of the models included in the mindcv suite (configurations for training and validation on ImageNet).
```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
mpirun -n 8 python train.py -c configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet
```

- Below is an example of finetuning a densenet121 model pretrained on ImageNet on the CIFAR10 dataset with the Momentum optimizer.

```shell
python train.py --model=densenet121 --pretrained --opt=momentum --lr=0.001 dataset=cifar10 --num_classes=10 --dataset_download
mpirun -n 8 python train.py --config configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet
```

Detailed adjustable parameters and their default values can be viewed in [config.py](../../config.py).

### Validation

- Below is an example of using `validate.py` to verify the accuracy of the pretrained densenet121 model.

```shell
python validate.py --model=densenet121 --dataset=imagenet --val_split=val --pretrained
```

- Below is an example of using `validate.py` to verify the accuracy of a custom densenet121 checkpoint.

```shell
python validate.py --model=densenet121 --dataset=imagenet --val_split=val --ckpt_path='./ckpt/densenet121-best.ckpt'
python validate.py --config configs/densenet/densenet_121_gpu.yaml --data_dir /path/to/imagenet --ckpt_path /path/to/densenet121.ckpt
```
2 changes: 1 addition & 1 deletion configs/densenet/densenet_121_ascend.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_121_gpu.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_161_ascend.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_161_gpu.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_169_ascend.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_169_gpu.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_201_ascend.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
2 changes: 1 addition & 1 deletion configs/densenet/densenet_201_gpu.yaml
@@ -51,7 +51,7 @@ loss: 'CE'
label_smoothing: 0.1

# lr scheduler config
scheduler: 'warmup_cosine_decay'
scheduler: 'cosine_decay'
min_lr: 0.0
lr: 0.1
warmup_epochs: 0
72 changes: 72 additions & 0 deletions configs/mnasnet/README.md
@@ -0,0 +1,72 @@
# MnasNet
> [MnasNet: Platform-Aware Neural Architecture Search for Mobile](https://arxiv.org/abs/1807.11626)
## Introduction
***

Designing convolutional neural networks (CNN) for mobile devices is challenging because mobile models need to be small and fast, yet still accurate. Although significant efforts have been dedicated to design and improve mobile CNNs on all dimensions, it is very difficult to manually balance these trade-offs when there are so many architectural possibilities to consider. In this paper, we propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. Unlike previous work, where latency is considered via another, often inaccurate proxy (e.g., FLOPS), our approach directly measures real-world inference latency by executing the model on mobile phones. To further strike the right balance between flexibility and search space size, we propose a novel factorized hierarchical search space that encourages layer diversity throughout the network.

![](mnasnet.png)

## Results
***

| Model | Context | Top-1 (%) | Top-5 (%) | Params (M) | Train T. | Infer T. | Download | Config | Log |
|-----------------|-----------|-------|-------|:----------:|-------|--------|---|--------|--------------|
| MnasNet-B1-0_75 | D910x8-G | 71.81 | 90.53 | 3.20 | 96s/epoch | | [model]() | [cfg]() | [log]() |
| MnasNet-B1-1_0 | D910x8-G | 74.28 | 91.70 | 4.42 | 96s/epoch | | [model]() | [cfg]() | [log]() |
| MnasNet-B1-1_4 | D910x8-G | 76.01 | 92.83 | 7.16 | 121s/epoch | | [model]() | [cfg]() | [log]() |

#### Notes
- All models are trained on the ImageNet-1K training set and the top-1 accuracy is reported on the validation set.
- Context: GPU_TYPE x pieces - G/F, G - graph mode, F - pynative mode with ms function.

## Quick Start
***
### Preparation

#### Installation
Please refer to the [installation instruction](https://github.com/mindspore-ecosystem/mindcv#installation) in MindCV.

#### Dataset Preparation
Please download the [ImageNet-1K](https://www.image-net.org/download.php) dataset for model training and validation.

### Training

- **Hyper-parameters.** The hyper-parameter configurations for producing the reported results are stored in the yaml files in the `mindcv/configs/mnasnet` folder. For example, to train with one of these configurations, you can run:

```shell
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
mpirun -n 8 python train.py -c configs/mnasnet/mnasnet0.75_gpu.yaml --data_dir /path/to/imagenet
```

Note that the number of GPUs/Ascends and batch size will influence the training results. To closely reproduce the reported results, it is recommended to use the **same number of GPUs/Ascends** with the same batch size.

Detailed adjustable parameters and their default values can be seen in [config.py](../../config.py).

### Validation

- To validate the trained model, you can use `validate.py`. Here is an example for mnasnet0_75 to verify the accuracy of pretrained weights.

```shell
python validate.py \
    -c configs/mnasnet/mnasnet0.75_ascend.yaml \
    --data_dir=/path/to/imagenet \
    --ckpt_path=/path/to/ckpt
```

- To validate the model, you can use `validate.py`. Here is an example for mnasnet0_75 to verify the accuracy of your training.

```shell
python validate.py \
    -c configs/mnasnet/mnasnet0.75_ascend.yaml \
    --data_dir=/path/to/imagenet \
    --ckpt_path=/path/to/ckpt
```

### Deployment (optional)

Please refer to the deployment tutorial in MindCV.


