[Improve] Update config and README of CAIN (open-mmlab#906)

* [Improve] Update config and README of CAIN
* Update
* Update
9vivian88 · May 31, 2022 · commit d07b04b
1 parent 7f4fea6

Showing 5 changed files with 52 additions and 31 deletions.
diff --git a/configs/video_interpolators/cain/README.md b/configs/video_interpolators/cain/README.md
@@ -18,13 +18,13 @@ Prevailing video frame interpolation techniques rely heavily on optical flow est
 
 ## Results and models
 
-Evaluated on Y channels.
+Evaluated on RGB channels.
 The metrics are `PSNR / SSIM` .
 The learning rate adjustment strategy is `Step LR scheduler with min_lr clipping`.
 
-|                                            Method                                            | vimeo-90k-triple |                                                                                                                         Download                                                                                                                         |
-| :------------------------------------------------------------------------------------------: | :--------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| [cain_b5_320k_vimeo-triple](/configs/video_interpolators/cain/cain_b5_320k_vimeo-triplet.py) |   34.49/0.9565   | [model](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_320k_vimeo-triple_20220117-647f3de2.pth)/[log](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_320k_vimeo-triple_20220117-647f3de2.log.json) |
+|                                                Method                                                 | vimeo-90k-triplet |                                                                                                                              Download                                                                                                                              |
+| :---------------------------------------------------------------------------------------------------: | :---------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [cain_b5_g1b32_vimeo90k_triplet](/configs/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet.py) | 34.6010 / 0.9578  | [model](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth)/[log](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.log.json) |
 
 ## Citation
 

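Note on the metric change above: the README now reports PSNR / SSIM on RGB channels instead of the Y channel. The sketch below illustrates how the two conventions differ for the same image pair. It is not MMEditing's evaluation code: `rgb_to_y` and `psnr` are hypothetical helpers, the BT.601 luma weights are an assumption about what "Y channel" means here, and the images are synthetic random data.

```python
import numpy as np


def rgb_to_y(img):
    """Map an RGB uint8 image (H, W, 3) to BT.601 luma in [16, 235].

    Assumption: "Y channel" refers to the Y of BT.601 YCbCr.
    """
    img = img.astype(np.float64)
    return (65.481 * img[..., 0] + 128.553 * img[..., 1]
            + 24.966 * img[..., 2]) / 255.0 + 16.0


def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally shaped arrays."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(max_val ** 2 / mse)


# Synthetic "ground truth" frame and a noisy "prediction" (made-up data).
rng = np.random.default_rng(0)
target = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noise = rng.integers(-5, 6, size=target.shape)
output = np.clip(target.astype(np.int64) + noise, 0, 255).astype(np.uint8)

psnr_rgb = psnr(output, target)                      # all three channels
psnr_y = psnr(rgb_to_y(output), rgb_to_y(target))    # luma only
print(f'PSNR (RGB): {psnr_rgb:.2f} dB, PSNR (Y): {psnr_y:.2f} dB')
```

The two conventions give different values for the same pair of images, so the old Y-channel row and the new RGB row should not be compared as if they used the same metric.
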
diff --git a/configs/video_interpolators/cain/README_zh-CN.md b/configs/video_interpolators/cain/README_zh-CN.md
@@ -21,10 +21,10 @@
 
 <br/>
 
-在 Y 通道上进行评估。
+在 RGB 通道上进行评估。
 我们使用 `PSNR` 和 `SSIM` 作为指标。
 学习率调整策略是等间隔调整策略。
 
-|                                              算法                                               | vimeo-90k-triple |                                                                                                                          下载                                                                                                                          |
-| :-------------------------------------------------------------------------------------------: | :--------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| [cain_b5_320k_vimeo-triplet](/configs/video_interpolators/cain/cain_b5_320k_vimeo-triplet.py) |   34.49/0.9565   | [模型](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_320k_vimeo-triple_20220117-647f3de2.pth)/[日志](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_320k_vimeo-triple_20220117-647f3de2.log.json) |
+|                                                  算法                                                   | vimeo-90k-triplet |                                                                                                                               下载                                                                                                                               |
+| :---------------------------------------------------------------------------------------------------: | :---------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| [cain_b5_g1b32_vimeo90k_triplet](/configs/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet.py) | 34.6010 / 0.9578  | [模型](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth)/[日志](https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.log.json) |
diff --git a/...lators/cain/cain_b5_320k_vimeo-triplet.py b/...rs/cain/cain_b5_g1b32_vimeo90k_triplet.py
@@ -1,4 +1,4 @@
-exp_name = 'cain_b5_320k_vimeo-triplet'
+exp_name = 'cain_b5_g1b32_vimeo90k_triplet'
 
 # model settings
 model = dict(
@@ -7,7 +7,7 @@
     pixel_loss=dict(type='L1Loss', loss_weight=1.0, reduction='mean'))
 # model training and testing settings
 train_cfg = None
-test_cfg = dict(metrics=['PSNR', 'SSIM'], crop_border=0, convert_to='y')
+test_cfg = dict(metrics=['PSNR', 'SSIM'], crop_border=0)
 
 # dataset settings
 train_dataset_type = 'VFIVimeo90KDataset'
@@ -26,7 +26,6 @@
         key='target',
         channel_order='rgb',
         backend='pillow'),
-    dict(type='RescaleToZeroOne', keys=['inputs', 'target']),
     dict(type='FixedCrop', keys=['inputs', 'target'], crop_size=(256, 256)),
     dict(
         type='Flip',
@@ -38,7 +37,16 @@
         keys=['inputs', 'target'],
         flip_ratio=0.5,
         direction='vertical'),
+    dict(
+        type='ColorJitter',
+        keys=['inputs', 'target'],
+        channel_order='rgb',
+        brightness=0.05,
+        contrast=0.05,
+        saturation=0.05,
+        hue=0.05),
     dict(type='TemporalReverse', keys=['inputs'], reverse_ratio=0.5),
+    dict(type='RescaleToZeroOne', keys=['inputs', 'target']),
     dict(type='FramesToTensor', keys=['inputs']),
     dict(type='ImageToTensor', keys=['target']),
     dict(
@@ -81,9 +89,9 @@
     dict(type='Collect', keys=['inputs'], meta_keys=['inputs_path', 'key'])
 ]
 
-root_dir = 'data/vimeo_triple'
+root_dir = 'data/vimeo_triplet'
 data = dict(
-    workers_per_gpu=4,
+    workers_per_gpu=32,
     train_dataloader=dict(samples_per_gpu=32, drop_last=True),
     val_dataloader=dict(samples_per_gpu=1),
     test_dataloader=dict(samples_per_gpu=1),
@@ -114,20 +122,33 @@
         test_mode=True),
 )
 
-# optimizer
-optimizers = dict(generator=dict(type='Adam', lr=1e-4, betas=(0.9, 0.99)))
-
 # learning policy
-total_iters = 320000
+# 1604 iters == 1 epoch
+total_iters = 288700
 lr_config = dict(
-    policy='Step', by_epoch=False, step=[80000, 160000, 240000], gamma=0.5)
+    policy='Reduce',
+    by_epoch=False,
+    mode='max',
+    val_metric='PSNR',
+    epoch_base_valid=True,  # Support epoch base valid in iter base runner.
+    factor=0.5,
+    patience=5,
+    cooldown=0,
+    verbose=True)
 
-checkpoint_config = dict(interval=5000, save_optimizer=True, by_epoch=False)
-# remove gpu_collect=True in non distributed training
-evaluation = dict(interval=500, save_image=True, gpu_collect=True)
+checkpoint_config = dict(interval=1604, save_optimizer=True, by_epoch=False)
+evaluation = dict(interval=1604, save_image=False)
 log_config = dict(
-    interval=100, hooks=[
+    interval=100,
+    hooks=[
         dict(type='TextLoggerHook', by_epoch=False),
+        dict(
+            type='TensorboardLoggerHook',
+            log_dir=f'work_dirs/{exp_name}/tb_log/',
+            interval=100,
+            ignore_last=False,
+            reset_flag=False,
+            by_epoch=False),
     ])
 visual_config = None
 

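Note on the schedule change above: the fixed `Step` policy is replaced by a `Reduce` policy that tracks validation PSNR (`mode='max'`, `factor=0.5`, `patience=5`, `cooldown=0`), and checkpointing and evaluation now run every 1604 iterations, which the config comment equates to one epoch. The sketch below shows the general reduce-on-plateau idea under those settings. It is a simplified stand-in, not MMEditing's hook: the `ReduceOnPlateau` class is hypothetical, it omits any improvement threshold or minimum learning rate, the starting rate of `1e-4` is borrowed from the removed optimizer line, and the PSNR history is made up.

```python
class ReduceOnPlateau:
    """Multiply lr by `factor` when a larger-is-better metric stops improving."""

    def __init__(self, lr, factor=0.5, patience=5, cooldown=0):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.cooldown = cooldown
        self.best = float('-inf')
        self.bad_epochs = 0
        self.cooldown_left = 0

    def step(self, metric):
        """Record one validation result and return the (possibly reduced) lr."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.cooldown_left > 0:
            # Ignore stagnation while cooling down after a reduction.
            self.cooldown_left -= 1
            self.bad_epochs = 0
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.cooldown_left = self.cooldown
            self.bad_epochs = 0
        return self.lr


# Values taken from the config in this commit.
iters_per_epoch = 1604
total_iters = 288700
print(f'about {total_iters / iters_per_epoch:.2f} epochs of training')

# Made-up validation PSNR: three improvements, then six flat epochs.
sched = ReduceOnPlateau(lr=1e-4, factor=0.5, patience=5, cooldown=0)
psnr_history = [30.0, 31.0, 32.0] + [32.0] * 6
lrs = [sched.step(p) for p in psnr_history]
print(lrs)
```

With `patience=5`, the rate in this sketch is only halved on the sixth consecutive validation without improvement, so one validation per epoch gives the model several epochs to recover before each reduction.
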
diff --git a/configs/video_interpolators/cain/metafile.yml b/configs/video_interpolators/cain/metafile.yml
@@ -7,16 +7,16 @@ Collections:
   - https://aaai.org/ojs/index.php/AAAI/article/view/6693/6547
   README: configs/video_interpolators/cain/README.md
 Models:
-- Config: configs/video_interpolators/cain/cain_b5_320k_vimeo-triplet.py
+- Config: configs/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet.py
   In Collection: CAIN
   Metadata:
-    Training Data: Others
-  Name: cain_b5_320k_vimeo-triplet
+    Training Data: VIMEO90K
+  Name: cain_b5_g1b32_vimeo90k_triplet
   Results:
-  - Dataset: Others
+  - Dataset: VIMEO90K
     Metrics:
-      vimeo-90k-triple:
-        PSNR: 34.49
-        SSIM: 0.9565
+      vimeo-90k-triplet:
+        PSNR: 34.601
+        SSIM: 0.9578
     Task: Video_interpolators
-  Weights: https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_320k_vimeo-triple_20220117-647f3de2.pth
+  Weights: https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth
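Note on the renamed metafile entry above: the new `Name`, the `Config` file stem, and the prefix of the `Weights` filename all read `cain_b5_g1b32_vimeo90k_triplet`. A small check like the following could catch a mismatch after a rename. It is only a sketch: `split_checkpoint_name` is a hypothetical helper, and the `<name>_<YYYYMMDD>-<8 hex chars>.pth` pattern is inferred from the URLs in this commit rather than taken from a documented specification.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Values copied from the updated metafile entry.
CONFIG = 'configs/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet.py'
WEIGHTS = ('https://download.openmmlab.com/mmediting/video_interpolators/'
           'cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth')


def split_checkpoint_name(url):
    """Split a checkpoint URL into (name, date, short_hash).

    Assumes the filename looks like <name>_<YYYYMMDD>-<8 hex chars>.pth.
    """
    stem = PurePosixPath(urlparse(url).path).stem
    match = re.fullmatch(
        r'(?P<name>.+)_(?P<date>\d{8})-(?P<hash>[0-9a-f]{8})', stem)
    if match is None:
        raise ValueError(f'unexpected checkpoint filename: {stem}')
    return match.group('name'), match.group('date'), match.group('hash')


name, date, short_hash = split_checkpoint_name(WEIGHTS)
config_name = PurePosixPath(CONFIG).stem
print(name, date, short_hash)
print('config and weights agree:', name == config_name)
```
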
diff --git a/tests/test_inference.py b/tests/test_inference.py
@@ -72,7 +72,7 @@ def test_restoration_video_inference():
 
 def test_video_interpolation_inference():
     model = init_model(
-        './configs/video_interpolators/cain/cain_b5_320k_vimeo-triplet.py',
+        './configs/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet.py',
         None,
         device='cpu')
     model.cfg['demo_pipeline'] = [