
The code of segmentation #251

Open
ydhongHIT opened this issue Jul 3, 2024 · 3 comments

ydhongHIT commented Jul 3, 2024

Hi, do you have a plan to release the segmentation code? I reproduced MambaOut-Tiny on ImageNet-1K, which achieves 82.6% top-1 accuracy. I think the result is reasonable given the randomness and the different batch size (1024 rather than 4096). However, when I transfer the pre-trained model to semantic segmentation, it only achieves 46.1 mIoU on ADE20K, much lower than the 47.4 reported in the paper. I use the Swin config based on the MMSegmentation codebase, with a drop path rate of 0.2 and no layer-wise decaying learning rate.
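For concreteness, the setup described above corresponds roughly to the following MMSegmentation (0.x style) config sketch, inheriting the Swin UperNet bases. The `MambaOut` backbone type and its arguments are a custom registration assumed for illustration; they are not part of MMSegmentation:

```python
# Hypothetical config sketch: UperNet + MambaOut-Tiny on ADE20K,
# adapted from the Swin config as described above.
_base_ = [
    '../_base_/models/upernet_swin.py',
    '../_base_/datasets/ade20k.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py',
]
model = dict(
    backbone=dict(
        _delete_=True,
        type='MambaOut',              # assumed custom backbone registration
        depths=(3, 3, 9, 3),          # MambaOut-Tiny depths
        dims=(96, 192, 384, 576),     # MambaOut-Tiny channel dims
        drop_path_rate=0.2,           # value used in the experiment above
        init_cfg=dict(type='Pretrained',
                      checkpoint='path/to/mambaout_tiny.pth'),  # placeholder
    ),
    decode_head=dict(in_channels=[96, 192, 384, 576], num_classes=150),
    auxiliary_head=dict(in_channels=384, num_classes=150),
)
```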

yuweihao (Owner) commented Jul 4, 2024

Hi @ydhongHIT, many thanks for your attention. I am currently busy and will organize the segmentation code as soon as possible. MambaOut is also based on the Swin config. Note that an LN (LayerNorm) should be added to the backbone's output at each stage. For all model sizes, the learning rate is 1e-4. The drop path rate is 0.3 for Tiny, 0.3 or 0.4 for Small, and 0.6 for Base.
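For anyone reproducing this before the official code is released, here is a minimal PyTorch sketch of one reading of the LN instruction: a separate LayerNorm per stage, applied channels-last to each stage's feature map before it goes to the decode head. The channel dims below assume MambaOut-Tiny and should be adjusted per model size:

```python
import torch.nn as nn

class StageNorms(nn.Module):
    """One LayerNorm per backbone stage output, as is common when
    adapting classification backbones (e.g. Swin) for dense prediction.
    dims=(96, 192, 384, 576) assumes MambaOut-Tiny."""
    def __init__(self, dims=(96, 192, 384, 576)):
        super().__init__()
        self.norms = nn.ModuleList(nn.LayerNorm(d) for d in dims)

    def forward(self, feats):
        # feats: list of (B, C, H, W) feature maps, one per stage
        outs = []
        for f, norm in zip(feats, self.norms):
            f = norm(f.permute(0, 2, 3, 1))          # LN over channels-last
            outs.append(f.permute(0, 3, 1, 2).contiguous())
        return outs
```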

ydhongHIT (Author) commented

Thank you for the reply. I will train the model again following your instructions. By the way, do you employ a layer-wise decaying learning rate like ConvNeXt?

yuweihao (Owner) commented Jul 4, 2024

Hi @ydhongHIT, a layer-wise decaying learning rate was not used for MambaOut.
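In MMSegmentation terms, that means a plain AdamW optimizer at the learning rate given above, without the `LearningRateDecayOptimizerConstructor`/`paramwise_cfg` machinery the ConvNeXt configs use to enable layer-wise decay. A minimal sketch; the weight decay value is an assumed placeholder, not from this thread:

```python
# No layer-wise decay: one learning rate (1e-4, per the comment above)
# for the whole network. weight_decay=0.05 is an assumed placeholder.
optimizer = dict(
    type='AdamW',
    lr=1e-4,
    betas=(0.9, 0.999),
    weight_decay=0.05)

# For contrast, ConvNeXt-style layer-wise decay would instead look like:
# optimizer = dict(
#     constructor='LearningRateDecayOptimizerConstructor',
#     type='AdamW', lr=1e-4, betas=(0.9, 0.999), weight_decay=0.05,
#     paramwise_cfg=dict(decay_rate=0.9, decay_type='stage_wise',
#                        num_layers=12))
```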
