Nankai University
Overview: To unleash the potential of ConvNets in super-resolution, we propose a multi-scale attention network (MAN) that couples a classical multi-scale mechanism with emerging large kernel attention. In particular, we propose multi-scale large kernel attention (MLKA) and a gated spatial attention unit (GSAU). Experimental results show that MAN performs on par with SwinIR and offers varied trade-offs between state-of-the-art performance and computational cost.
This repository contains the PyTorch implementation of MAN (CVPRW 2024).
Testing: Set5, Set14, BSD100, Urban100, Manga109 (Google Drive/Baidu Netdisk).
Preparing: Please refer to the Dataset Preparation of BasicSR.
Network architecture: Group number (n_resgroups): 1 for simplicity, MAB number (n_resblocks): 5/24/36, channel width (n_feats): 48/60/180 for tiny/light/base MAN.
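The three variants listed above can be summarized as a small lookup table. This is a minimal sketch, not the repository's configuration code: the `MANConfig` class and `get_config` helper are hypothetical names introduced for illustration, and only the field names `n_resgroups`, `n_resblocks`, and `n_feats` and their values come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MANConfig:
    """Hypothetical container for the hyperparameters named in this README."""
    n_resgroups: int  # group number (1 for simplicity)
    n_resblocks: int  # number of multi-scale attention blocks (MAB)
    n_feats: int      # channel width


# Values taken from the README: 5/24/36 blocks and 48/60/180 channels
# for the tiny/light/base variants, each with a single group.
MAN_VARIANTS = {
    "tiny": MANConfig(n_resgroups=1, n_resblocks=5, n_feats=48),
    "light": MANConfig(n_resgroups=1, n_resblocks=24, n_feats=60),
    "base": MANConfig(n_resgroups=1, n_resblocks=36, n_feats=180),
}


def get_config(variant: str) -> MANConfig:
    """Return the hyperparameters for a variant name, or raise ValueError."""
    try:
        return MAN_VARIANTS[variant]
    except KeyError:
        valid = ", ".join(sorted(MAN_VARIANTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {valid}") from None
```

For example, `get_config("light")` returns a configuration with 24 blocks and 60 channels. The actual option files used for training may define further settings that are not covered here.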
Overview of the proposed MAN, which consists of three components: the shallow feature extraction module (SF), the deep feature extraction module (DF) based on multiple multi-scale attention blocks (MAB), and the high-quality image reconstruction module.
Component details: Three multi-scale decomposition modes are utilized in MLKA. The 7×7 depth-wise convolution is used in the GSAU.
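To make the role of the 7×7 depth-wise convolution more concrete, here is a rough PyTorch sketch of a gated unit built around it. It is an illustration under stated assumptions, not the repository's GSAU: the class name `GSAUSketch`, the 1×1 projections, the channel split into a gate and a value branch, and the residual connection are all assumptions, and any normalization or learnable scaling the real module may use is omitted.

```python
import torch
import torch.nn as nn


class GSAUSketch(nn.Module):
    """Illustrative gated unit with a 7x7 depth-wise conv (assumed structure)."""

    def __init__(self, n_feats: int = 48):
        super().__init__()
        # Assumption: a 1x1 conv doubles the channels so they can be split in two.
        self.proj_in = nn.Conv2d(n_feats, 2 * n_feats, kernel_size=1)
        # 7x7 depth-wise conv: groups == channels; padding=3 keeps the spatial size.
        self.dwconv = nn.Conv2d(
            n_feats, n_feats, kernel_size=7, padding=3, groups=n_feats
        )
        # Assumption: a 1x1 conv mixes channels again after gating.
        self.proj_out = nn.Conv2d(n_feats, n_feats, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate, value = torch.chunk(self.proj_in(x), 2, dim=1)
        # The depth-wise branch produces a spatial gate that modulates the value branch.
        out = self.proj_out(value * self.dwconv(gate))
        # Assumption: residual connection around the unit.
        return x + out
```

With `n_feats=48` (the tiny variant's channel width), a `(1, 48, 16, 16)` input yields an output of the same shape, since every layer preserves spatial size and the output projection restores the channel count.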
Details of Multi-scale Large Kernel Attention (MLKA), Gated Spatial Attention Unit (GSAU), and Large Kernel Attention Tail (LKAT).
The BasicSR framework is used to train and test MAN.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 train.py -opt options/trian_MAN.yml --launcher pytorch
python test.py -opt options/test_MAN.yml
The training and testing results will be saved in the ./experiments and ./results folders, respectively.
Pretrained models available at Google Drive and Baidu Netdisk (pwd: mans for all links).
HR (x4) | MAN-tiny | EDSR-base+ | MAN-light | EDSR+ | MAN |
---|---|---|---|---|---|
Params/FLOPs | 150K/8G | 1518K/114G | 840K/47G | 43090K/2895G | 8712K/495G |
Results of our MAN-tiny/light/base models. The Set5 validation set is used below to show general performance. Visual results on the five test sets are provided in the last column.
Methods | Params | FLOPs | PSNR/SSIM (x2) | PSNR/SSIM (x3) | PSNR/SSIM (x4) | Results |
---|---|---|---|---|---|---|
MAN-tiny | 150K | 8.4G | 37.91/0.9603 | 34.23/0.9258 | 32.07/0.8930 | x2/x3/x4 |
MAN-light | 840K | 47.1G | 38.18/0.9612 | 34.65/0.9292 | 32.50/0.8988 | x2/x3/x4 |
MAN+ | 8712K | 495G | 38.44/0.9623 | 34.97/0.9315 | 32.87/0.9030 | x2/x3/x4 |
We thank VAN and BasicSR for their enlightening work!
@inproceedings{wang2024multi,
title={Multi-scale Attention Network for Single Image Super-Resolution},
author={Wang, Yan and Li, Yusen and Wang, Gang and Liu, Xiaoguang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
year={2024}
}
or
@article{wang2022multi,
title={Multi-scale Attention Network for Single Image Super-Resolution},
author={Wang, Yan and Li, Yusen and Wang, Gang and Liu, Xiaoguang},
journal={arXiv preprint arXiv:2209.14145},
year={2022}
}