Siqi Zhang*, Qizhe Zhang*, Shanghang Zhang*†, Xiaohong Liu*, Jingkun Yue*, Ming Lu, Huihuan Xu, Jiaxin Yao, Xiaobao Wei, Jiajun Cao, Xiang Zhang, Ming Gao, Jun Shen, Yichang Hao, Yinkui Wang, Xingcai Zhang, Song Wu, Ping Zhang, Shuguang Cui & Guangyu Wang† (*Equal Contribution, †Corresponding Author)
Nature Biomedical Engineering (2025): https://www.nature.com/articles/s41551-025-01497-3
Lead Contacts: Prof. Guangyu Wang and Prof. Shanghang Zhang.
Vision foundation models have demonstrated vast potential in achieving generalist medical segmentation capability, providing a versatile, task-agnostic solution through a single model. However, current generalist models involve simple pre-training on various medical data containing irrelevant information, often resulting in the negative transfer phenomenon and degenerated performance. Furthermore, the practical applicability of foundation models across diverse open-world scenarios, especially in out-of-distribution (OOD) settings, has not been extensively evaluated. Here we construct a publicly accessible database, MedSegDB, based on a tree-structured hierarchy and annotated from 129 public medical segmentation repositories and 5 in-house datasets. We further propose a Generalist Medical Segmentation model (MedSegX), a vision foundation model trained with a model-agnostic Contextual Mixture of Adapter Experts (ConMoAE) for open-world segmentation. We conduct a comprehensive evaluation of MedSegX across a range of medical segmentation tasks. Experimental results indicate that MedSegX achieves state-of-the-art performance across various modalities and organ systems in in-distribution (ID) settings. In OOD and real-world clinical settings, MedSegX consistently maintains its performance in both zero-shot and data-efficient generalization, outperforming other foundation models.
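The core module named above, ConMoAE, is a contextual mixture of adapter experts attached to a vision foundation model. As a rough illustration of the general mixture-of-adapter-experts idea only (a generic sketch in PyTorch, not the authors' implementation; all class names, shapes, and defaults below are assumptions), one can picture a gating network that mixes several bottleneck adapters conditioned on a context embedding:

```python
import torch
import torch.nn as nn

class MixtureOfAdapterExperts(nn.Module):
    """Generic mixture-of-adapter-experts layer (illustrative only; not the
    ConMoAE implementation from the paper). A gating network, conditioned on
    a context embedding, mixes several bottleneck adapters and adds the
    result back to the frozen backbone features."""

    def __init__(self, dim: int, num_experts: int = 4, bottleneck: int = 16):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, bottleneck),
                          nn.GELU(),
                          nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)  # context-conditioned routing

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) token features; context: (B, dim) context embedding
        weights = torch.softmax(self.gate(context), dim=-1)             # (B, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, N, dim, E)
        mixed = (expert_out * weights[:, None, None, :]).sum(dim=-1)    # (B, N, dim)
        return x + mixed  # residual keeps the pre-trained features intact

# Toy usage, with ViT-B-like token shapes:
layer = MixtureOfAdapterExperts(dim=768)
x = torch.randn(2, 196, 768)   # patch tokens
ctx = x.mean(dim=1)            # a crude context vector, for illustration only
out = layer(x, ctx)            # (2, 196, 768)
```

Please refer to the paper for the actual ConMoAE design and how it is conditioned on context in MedSegX.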
🔥 [2025/09/05] Our paper is available online!
🔥 [2025/08/05] Our paper is accepted by Nature Biomedical Engineering!
- Clone this repository.
```bash
git clone https://github.com/MedSegX/MedSegX-code.git MedSegX
cd MedSegX
```
- Create a conda environment.
```bash
conda create -n medsegx python=3.10 -y
conda activate medsegx
```
- Install necessary packages.
```bash
conda install pytorch==2.0.0 torchvision==0.15.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
```
- Download base model checkpoints from SAM:
| Model | Backbone | Checkpoint |
|---|---|---|
| SAM | ViT-B | sam_vit_b_01ec64.pth |
| SAM | ViT-L | sam_vit_l_0b3195.pth |
| SAM | ViT-H | sam_vit_h_4b8939.pth |
- Download our MedSegX checkpoint from Google Drive.
The checkpoints should be placed in the ./playground directory, with the following structure:
```
playground/
├── SAM/
│   ├── sam_vit_b_01ec64.pth
│   ├── sam_vit_l_0b3195.pth
│   └── sam_vit_h_4b8939.pth
└── MedSegX/
    └── medsegx_vit_b.pth
```
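As a convenience, here is a minimal Python sketch that fetches the SAM checkpoints into the expected layout (the URLs are Meta AI's public release links; the MedSegX checkpoint still needs to be downloaded manually from Google Drive):

```python
import os
import urllib.request

# Official public release URLs for the SAM checkpoints.
BASE_URL = "https://dl.fbaipublicfiles.com/segment_anything/"
CKPTS = ["sam_vit_b_01ec64.pth", "sam_vit_l_0b3195.pth", "sam_vit_h_4b8939.pth"]

os.makedirs("./playground/SAM", exist_ok=True)
for name in CKPTS:
    dst = os.path.join("./playground/SAM", name)
    if not os.path.exists(dst):  # skip checkpoints that are already present
        print(f"Downloading {name} ...")
        urllib.request.urlretrieve(BASE_URL + name, dst)
```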
We provide an example dataset to help users get started with MedSegX quickly. It is already organized according to the complete MedSegDB structure; you only need to unzip it into the ./playground directory. Then you can go directly to the Quick Start section to run a fast inference.
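A minimal sketch of the unpacking step (the archive name MedSegDB-example.zip is an assumption; substitute the actual file you downloaded):

```python
import zipfile

# "MedSegDB-example.zip" is a hypothetical name; use the actual archive file.
with zipfile.ZipFile("MedSegDB-example.zip") as zf:
    zf.extractall("./playground")  # produces ./playground/MedSegDB-example/
```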
Our MedSegDB is curated from open-source medical segmentation datasets, which can be accessed via the web links provided in Supplementary Table 1. Among them, the datasets that permit redistribution are available on HuggingFace, where the data are fully preprocessed and can be used out of the box without any further processing.
The whole MedSegDB should be placed in the ./playground directory and organized in the following structure:
```
playground/
└── MedSegDB/
    ├── train/
    └── eval/
        ├── ID/
        ├── OOD/
        │   ├── cross_site/
        │   └── cross_task/
        └── RealWorld/
            ├── cross_site/
            └── cross_task/
```
For more details about the data structure of MedSegDB, please see DATA.md.
After preparing the model weights and the example dataset, you can simply run quick-start.sh to perform inference with MedSegX. For example, you can run the following command to conduct an ID evaluation:
```bash
python evaluate_internal.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --model_weight ./playground/MedSegX/medsegx_vit_b.pth \
    --data_path ./playground/MedSegDB-example/eval/ID \
    --metric dsc hd \
    --device_ids 0 \
    --batch_size 32
```
After running the above script, the results will be saved in the ./playground/MedSegX directory. We also provide example results for double-checking.
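The --metric flags select the Dice similarity coefficient (dsc) and Hausdorff distance (hd). For reference, here is a minimal sketch of how these metrics are typically computed for binary masks (an illustration, not the repository's implementation):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def hausdorff_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Hausdorff distance between the foreground pixel sets.
    Assumes both masks contain at least one foreground pixel."""
    p, g = np.argwhere(pred), np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```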
After preparing MedSegDB according to DATA.md, you can use the following command to train MedSegX:
```bash
python pretrain.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --data_path ./playground/MedSegDB \
    --device_ids 0 1 2 3 4 5 6 7 \
    --num_epochs 30 \
    --batch_size 1024 \
    --lr 1e-3 \
    --use_amp
```
This command trains MedSegX on 8 GPUs by default with a batch size of 1024, requiring at least 40 GB of memory per GPU. You can adjust the device_ids and batch_size arguments to match your machine. For example, to train MedSegX on 4 GPUs, halving both the batch size and the learning rate accordingly, you can use the following command:
```bash
python pretrain.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --data_path ./playground/MedSegDB \
    --device_ids 0 1 2 3 \
    --num_epochs 30 \
    --batch_size 512 \
    --lr 5e-4 \
    --use_amp
```
You can further fine-tune the provided MedSegX pre-trained weights on OOD tasks to achieve better performance. For example, to fine-tune under cross-site shift, you can use the following command:
```bash
python finetune.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --data_path ./playground/MedSegDB/eval/OOD \
    --shift_type cross_site \
    --device_ids 0 1 2 3 4 5 6 7 \
    --num_epochs 30 \
    --batch_size 64 \
    --validation val \
    --resume ./playground/MedSegX/medsegx_vit_b.pth \
    --lr 5e-5 \
    --use_amp
```
Similarly, if you want to fine-tune under cross-task shift using a single GPU, you can use the following command:
```bash
python finetune.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --data_path ./playground/MedSegDB/eval/OOD \
    --shift_type cross_task \
    --device_ids 0 \
    --num_epochs 30 \
    --batch_size 64 \
    --validation val \
    --resume ./playground/MedSegX/medsegx_vit_b.pth \
    --lr 5e-5 \
    --use_amp
```
For internal evaluation, you can run the following command:
```bash
python evaluate_internal.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --model_weight ./playground/MedSegX/medsegx_vit_b.pth \
    --data_path ./playground/MedSegDB/eval/ID \
    --metric dsc hd \
    --device_ids 0 1 2 3 4 5 6 7 \
    --batch_size 32
```
For external evaluation with cross-site shift, you can run the following command:
```bash
python evaluate_external.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --model_weight ./playground/MedSegX/medsegx_vit_b.pth \
    --data_path ./playground/MedSegDB/eval/OOD \
    --shift_type cross_site \
    --metric dsc hd \
    --device_ids 0 1 2 3 4 5 6 7 \
    --batch_size 32
```
For external evaluation with cross-task shift, you can run the following command:
```bash
python evaluate_external.py \
    --checkpoint ./playground/SAM \
    --model_type vit_b \
    --model_weight ./playground/MedSegX/medsegx_vit_b.pth \
    --data_path ./playground/MedSegDB/eval/OOD \
    --shift_type cross_task \
    --metric dsc hd \
    --device_ids 0 1 2 3 4 5 6 7 \
    --batch_size 32
```
If you find MedSegX useful for your research and applications, please cite using this BibTeX:
```bibtex
@article{zhang2025generalist,
  title={A generalist foundation model and database for open-world medical image segmentation},
  author={Zhang, Siqi and Zhang, Qizhe and Zhang, Shanghang and Liu, Xiaohong and Yue, Jingkun and Lu, Ming and Xu, Huihuan and Yao, Jiaxin and Wei, Xiaobao and Cao, Jiajun and others},
  journal={Nature Biomedical Engineering},
  pages={1--16},
  year={2025},
  publisher={Nature Publishing Group UK London}
}
```
Or the following plain-text version from Nature:
Zhang, S., Zhang, Q., Zhang, S. et al. A generalist foundation model and database for open-world medical image segmentation. Nat. Biomed. Eng. (2025). https://doi.org/10.1038/s41551-025-01497-3
This project is released under the Apache 2.0 license.
We appreciate the open-source efforts of the SAM and MedSAM teams.
