This repository provides the official implementation of Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining.
- The first attempt to investigate the impact of ImageNet-pretrained Mamba-based networks on medical image segmentation.
- Provides two Mamba-based networks with different computational requirements for medical image segmentation.
- Swin-UMamba outperforms previous segmentation models, including CNNs, ViTs, and the latest Mamba-based models, by a notable margin.
Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, existing methods struggle to model long-range global information: convolutional neural networks are constrained by their local receptive fields, and vision transformers suffer from the quadratic complexity of their attention mechanism. Recently, Mamba-based models have attracted considerable attention for their impressive ability in long-sequence modeling. Several studies have demonstrated that these models can outperform popular vision models across various tasks, offering higher accuracy, lower memory consumption, and a lighter computational burden. However, existing Mamba-based models are mostly trained from scratch and do not exploit the power of pretraining, which has proven to be quite effective for data-efficient medical image analysis. This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation and leveraging the advantages of ImageNet-based pretraining. Our experimental results reveal the vital role of ImageNet-based training in enhancing the performance of Mamba-based models. Swin-UMamba outperforms CNNs, ViTs, and the latest Mamba-based models by a large margin. Notably, on the AbdomenMRI, Endoscopy, and Microscopy datasets, Swin-UMamba outperforms its closest counterpart U-Mamba by an average score of 2.72%.
Main Results
- AbdomenMRI
- Endoscopy
- Microscopy
All three datasets can be downloaded from U-Mamba.
Main Requirements
torch==2.0.1
torchvision==0.15.2
causal-conv1d==1.1.1
mamba-ssm
torchinfo
timm
numba
Installation
# create a new conda env
conda create -n swin_umamba python=3.10
conda activate swin_umamba
# install requirements
pip install torch==2.0.1 torchvision==0.15.2
pip install causal-conv1d==1.1.1
pip install mamba-ssm
pip install torchinfo timm numba
# install swin_umamba
git clone https://github.com/JiarunLiu/Swin-UMamba
cd Swin-UMamba/swin_umamba
pip install -e .
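Before moving on, you can optionally verify that the Mamba CUDA kernels import and run. This is a minimal sanity check, assuming a CUDA-capable GPU; the block size (d_model=96) and sequence length (196) are arbitrary illustration values, not values used by this repository.

```bash
# Optional sanity check: the selective-scan kernels in mamba-ssm require a GPU
python -c "
import torch
from mamba_ssm import Mamba

block = Mamba(d_model=96).cuda()            # arbitrary illustration size
x = torch.randn(1, 196, 96, device='cuda')  # (batch, sequence length, channels)
print(block(x).shape)                       # expected: torch.Size([1, 196, 96])
"
```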
Download Model
We use the ImageNet-pretrained VMamba-Tiny model from VMamba. Download the checkpoint and place it at data/pretrained/vmamba/vmamba_tiny_e292.pth:
wget https://github.com/MzeroMiko/VMamba/releases/download/%2320240218/vssmtiny_dp01_ckpt_epoch_292.pth
mkdir -p data/pretrained/vmamba
mv vssmtiny_dp01_ckpt_epoch_292.pth data/pretrained/vmamba/vmamba_tiny_e292.pth
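Optionally, confirm the checkpoint loads before training. This is a sketch under one assumption: that the weights sit under a top-level "model" key (the usual Swin-style checkpoint layout); the code falls back to treating the file as a bare state dict otherwise.

```bash
# Optional: inspect the downloaded checkpoint. The "model" key is an assumption
# (common Swin-style layout); fall back to the bare state dict if absent.
python - <<'PY'
import torch

ckpt = torch.load("data/pretrained/vmamba/vmamba_tiny_e292.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)
print(len(state_dict), "tensors; first key:", next(iter(state_dict)))
PY
```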
Preprocess
We use the same data and preprocessing strategy as U-Mamba. Download the datasets from U-Mamba, put them into the data folder, and then preprocess each dataset with the following command:
nnUNetv2_plan_and_preprocess -d DATASET_ID --verify_dataset_integrity
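For example, assuming the AbdomenMR data were registered under dataset ID 702 (the ID here is hypothetical; use whatever number appears in your own nnU-Net dataset folder name):

```bash
# 702 is illustrative only; match the ID of your DatasetXXX_* folder
nnUNetv2_plan_and_preprocess -d 702 --verify_dataset_integrity
```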
Training & Testing
Use the following commands to train and test Swin-UMamba:
# AbdomenMR dataset
bash scripts/train_AbdomenMR.sh MODEL_NAME
# Endoscopy dataset
bash scripts/train_Endoscopy.sh MODEL_NAME
# Microscopy dataset
bash scripts/train_Microscopy.sh MODEL_NAME
Here MODEL_NAME can be:
- nnUNetTrainerSwinUMamba: Swin-UMamba model with ImageNet pretraining
- nnUNetTrainerSwinUMambaD: Swin-UMamba$\dagger$ model with ImageNet pretraining
- nnUNetTrainerSwinUMambaScratch: Swin-UMamba model without ImageNet pretraining
- nnUNetTrainerSwinUMambaDScratch: Swin-UMamba$\dagger$ model without ImageNet pretraining
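For example, to train and test the ImageNet-pretrained Swin-UMamba on the AbdomenMR dataset:

```bash
bash scripts/train_AbdomenMR.sh nnUNetTrainerSwinUMamba
```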
You can download our model checkpoints here.
For further questions, please feel free to contact Jiarun Liu.
This project is released under the Apache License 2.0. See LICENSE for details.
Our code is based on nnU-Net, Mamba, UMamba, VMamba, and Swin-Unet. We thank the authors for making their valuable code & data publicly available.
If you find this repository useful, please consider citing this paper:
@article{Swin-UMamba,
title={Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining},
author={Jiarun Liu and Hao Yang and Hong-Yu Zhou and Yan Xi and Lequan Yu and Yizhou Yu and Yong Liang and Guangming Shi and Shaoting Zhang and Hairong Zheng and Shanshan Wang},
journal={arXiv preprint arXiv:2402.03302},
year={2024}
}