
The code of segmentation #251

Open
ydhongHIT opened this issue Jul 3, 2024 · 3 comments

ydhongHIT commented Jul 3, 2024

Hi, do you have a plan to release the segmentation code? I reproduced MambaOut-Tiny on ImageNet-1K, which achieves 82.6% top-1 accuracy. I think the result is reasonable given the randomness and the different batch size (1024 rather than 4096). However, when I transfer the pre-trained model to semantic segmentation, it only achieves 46.1 mIoU on ADE20K, much lower than the 47.4 reported in the paper. I use the Swin config based on the MMSegmentation codebase, with a drop path rate of 0.2 and no layer-wise decaying learning rate.
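For concreteness, the setup described above corresponds roughly to the following MMSegmentation (0.x style) config sketch, inheriting the Swin UperNet bases. The `MambaOut` backbone type and its arguments are a custom registration assumed for illustration; they are not part of MMSegmentation:

```python
# Hypothetical config sketch: UperNet + MambaOut-Tiny on ADE20K,
# adapted from the Swin config as described above.
_base_ = [
    '../_base_/models/upernet_swin.py',
    '../_base_/datasets/ade20k.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py',
]
model = dict(
    backbone=dict(
        _delete_=True,
        type='MambaOut',              # assumed custom backbone registration
        depths=(3, 3, 9, 3),          # MambaOut-Tiny depths
        dims=(96, 192, 384, 576),     # MambaOut-Tiny channel dims
        drop_path_rate=0.2,           # value used in the experiment above
        init_cfg=dict(type='Pretrained',
                      checkpoint='path/to/mambaout_tiny.pth'),  # placeholder
    ),
    decode_head=dict(in_channels=[96, 192, 384, 576], num_classes=150),
    auxiliary_head=dict(in_channels=384, num_classes=150),
)
```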

yuweihao (Owner) commented Jul 4, 2024

Hi @ydhongHIT, many thanks for your attention. I am currently busy and will organize the segmentation code as soon as possible. MambaOut is also based on the Swin config. Note that an LN (LayerNorm) should be added to the backbone's output at each stage. For all model sizes, the learning rate is 1e-4. The drop path rate is 0.3 for Tiny, 0.3 or 0.4 for Small, and 0.6 for Base.
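For anyone reproducing this before the official code is released, here is a minimal PyTorch sketch of one reading of the LN instruction: a separate LayerNorm per stage, applied channels-last to each stage's feature map before it goes to the decode head. The channel dims below assume MambaOut-Tiny and should be adjusted per model size:

```python
import torch.nn as nn

class StageNorms(nn.Module):
    """One LayerNorm per backbone stage output, as is common when
    adapting classification backbones (e.g. Swin) for dense prediction.
    dims=(96, 192, 384, 576) assumes MambaOut-Tiny."""
    def __init__(self, dims=(96, 192, 384, 576)):
        super().__init__()
        self.norms = nn.ModuleList(nn.LayerNorm(d) for d in dims)

    def forward(self, feats):
        # feats: list of (B, C, H, W) feature maps, one per stage
        outs = []
        for f, norm in zip(feats, self.norms):
            f = norm(f.permute(0, 2, 3, 1))          # LN over channels-last
            outs.append(f.permute(0, 3, 1, 2).contiguous())
        return outs
```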

ydhongHIT (Author) commented

Thank you for the reply. I will train the model again following your instructions. By the way, do you employ a layer-wise decaying learning rate like ConvNeXt?

yuweihao (Owner) commented Jul 4, 2024

Hi @ydhongHIT, a layer-wise decaying learning rate was not used for MambaOut.
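In MMSegmentation terms, that means a plain AdamW optimizer at the learning rate given above, without the `LearningRateDecayOptimizerConstructor`/`paramwise_cfg` machinery the ConvNeXt configs use to enable layer-wise decay. A minimal sketch; the weight decay value is an assumed placeholder, not from this thread:

```python
# No layer-wise decay: one learning rate (1e-4, per the comment above)
# for the whole network. weight_decay=0.05 is an assumed placeholder.
optimizer = dict(
    type='AdamW',
    lr=1e-4,
    betas=(0.9, 0.999),
    weight_decay=0.05)

# For contrast, ConvNeXt-style layer-wise decay would instead look like:
# optimizer = dict(
#     constructor='LearningRateDecayOptimizerConstructor',
#     type='AdamW', lr=1e-4, betas=(0.9, 0.999), weight_decay=0.05,
#     paramwise_cfg=dict(decay_rate=0.9, decay_type='stage_wise',
#                        num_layers=12))
```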
