PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".
Please see INSTALL.md
Models pretrained on ImageNet can be loaded through torch.hub:
import torch
model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)
Because the models were trained in FP16 and the provided checkpoints are converted to FP32, the evaluation results differ slightly from the reported ones (at most -0.06%/+0.05%).
Model | Reported | Evaluation Results | Link |
---|---|---|---|
Fca34 | 75.07 | 75.02 | GoogleDrive/BaiduDrive(code:m7v8) |
Fca50 | 78.52 | 78.57 | GoogleDrive/BaiduDrive(code:mgkk) |
Fca101 | 79.64 | 79.63 | GoogleDrive/BaiduDrive(code:8t0j) |
Fca152 | 80.08 | 80.02 | GoogleDrive/BaiduDrive(code:5yeq) |
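The quoted gap of -0.06%/+0.05% can be recomputed from the table. The snippet below is only a quick arithmetic check on the numbers copied from the table; the dictionary names are illustrative and not part of the repository.

```python
# Accuracies (%) copied from the table above.
reported = {"Fca34": 75.07, "Fca50": 78.52, "Fca101": 79.64, "Fca152": 80.08}
evaluated = {"Fca34": 75.02, "Fca50": 78.57, "Fca101": 79.63, "Fca152": 80.02}

# Difference in percentage points, rounded to the table's two decimals.
deltas = {name: round(evaluated[name] - reported[name], 2) for name in reported}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")

# Largest drop and largest gain across the four models.
print(f"range: {min(deltas.values()):+.2f} / {max(deltas.values()):+.2f}")  # range: -0.06 / +0.05
```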
To evaluate, run the following (`fcanet34` can be replaced by the 50, 101, or 152 variant):
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS main.py \
-e \
--b 128 \
--dali_cpu \
-a fcanet34 \
--evaluate_model /path/to/your/downloaded/model \
/path/to/your/ImageNet
Alternatively, see launch_eval.sh
Since the paper was uploaded to arXiv, many academic peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn the tensor directly? Why use DCT instead of a learnable tensor? Wouldn't a learnable tensor be better than DCT?
Our concrete answer is: the proposed fixed DCT basis performs better than the learnable alternatives, counter-intuitive as that may be.
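To make "the DCT basis can be viewed as a simple tensor" concrete, here is a minimal pure-Python sketch. It assumes the standard unnormalized 2-D DCT-II basis; the function names `dct_basis_1d`, `dct_filter_2d`, and `spectral_pool` are hypothetical and are not taken from model/layer.py, and normalization constants are left out.

```python
import math


def dct_basis_1d(freq, length):
    """Unnormalized DCT-II basis vector: cos(pi * freq * (pos + 0.5) / length)."""
    return [math.cos(math.pi * freq * (pos + 0.5) / length) for pos in range(length)]


def dct_filter_2d(u, v, height, width):
    """Outer product of two 1-D bases, giving one fixed (height x width) weight map."""
    rows = dct_basis_1d(u, height)
    cols = dct_basis_1d(v, width)
    return [[r * c for c in cols] for r in rows]


def spectral_pool(feature_map, dct_filter):
    """Weighted sum of one channel's feature map with one DCT filter."""
    return sum(
        x * w
        for feat_row, filt_row in zip(feature_map, dct_filter)
        for x, w in zip(feat_row, filt_row)
    )


feature_map = [[1.0, 2.0], [3.0, 4.0]]

# Frequency (0, 0): every entry is cos(0) = 1, so pooling is a plain sum.
lowest = dct_filter_2d(0, 0, 2, 2)
print(spectral_pool(feature_map, lowest))  # 10.0

# A higher horizontal frequency weights the two columns with opposite signs.
horizontal = dct_filter_2d(0, 1, 2, 2)
print(spectral_pool(feature_map, horizontal))
```

Nothing in this tensor is trained: it is fully determined by the frequency indices and the spatial size, which is what distinguishes it from the learnable alternatives compared below.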
Method | ImageNet Top-1 Acc | Link |
---|---|---|
Learnable tensor, random initialization | 77.914 | GoogleDrive/BaiduDrive(code:p2hl) |
Learnable tensor, DCT initialization | 78.352 | GoogleDrive/BaiduDrive(code:txje) |
Fixed tensor, random initialization | 77.742 | GoogleDrive/BaiduDrive(code:g5t9) |
Fixed tensor, DCT initialization (Ours) | 78.574 | GoogleDrive/BaiduDrive(code:mgkk) |
To verify these results, select the corresponding type of tensor in L73-L83 of model/layer.py, uncomment it, and train the whole network.
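The four rows of the table differ only in how the pooling weight is initialized and whether it is trained. The sketch below illustrates that distinction in PyTorch. It is not the code in model/layer.py: the class `SpectralPool`, the helper `dct_filter`, and their arguments are hypothetical, and for brevity a single frequency is shared by all channels.

```python
import math

import torch
import torch.nn as nn


def dct_filter(u, v, height, width):
    """Unnormalized 2-D DCT-II basis for frequency (u, v), shape (height, width)."""
    ys = torch.cos(math.pi * u * (torch.arange(height) + 0.5) / height)
    xs = torch.cos(math.pi * v * (torch.arange(width) + 0.5) / width)
    return torch.outer(ys, xs)


class SpectralPool(nn.Module):
    """Hypothetical stand-in for the pooling weight; not the repository's layer."""

    def __init__(self, channels, height, width, learnable, dct_init, freq=(0, 0)):
        super().__init__()
        if dct_init:
            weight = dct_filter(freq[0], freq[1], height, width).repeat(channels, 1, 1)
        else:
            weight = torch.rand(channels, height, width)
        if learnable:
            self.weight = nn.Parameter(weight)      # updated by the optimizer
        else:
            self.register_buffer("weight", weight)  # saved with the model, never trained

    def forward(self, x):
        # x: (batch, channels, height, width) -> (batch, channels)
        return (x * self.weight).sum(dim=(2, 3))


variants = {
    "Learnable tensor, random initialization": SpectralPool(4, 7, 7, True, False),
    "Learnable tensor, DCT initialization": SpectralPool(4, 7, 7, True, True),
    "Fixed tensor, random initialization": SpectralPool(4, 7, 7, False, False),
    "Fixed tensor, DCT initialization": SpectralPool(4, 7, 7, False, True),
}

x = torch.randn(2, 4, 7, 7)
for name, module in variants.items():
    trainable = sum(p.numel() for p in module.parameters())
    print(f"{name}: output {tuple(module(x).shape)}, trainable weights {trainable}")
```

Registering the weight as a buffer keeps it in the checkpoint and moves it with `.to(device)`, but excludes it from `parameters()`, so an optimizer never touches it.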
Please see launch_training.sh
- Object detection models
- Instance segmentation models
- Make switching between configs easier