AI-SAM: Automatic and Interactive Segment Anything Model
Yimu Pan, Sitao Zhang, Alison D. Gernand, Jeffery A. Goldstein, James Z. Wang
The Automatic and Interactive Segment Anything Model (AI-SAM) is designed to generate segmentation masks for multiple classes automatically while supporting interactive user input. During training, AI-SAM learns to produce both the point prompts and the segmentation masks for each class, using only the segmentation masks themselves as learning targets.
At inference, AI-SAM automatically generates a set of point prompts along with the segmentation masks for each class. Users can then modify the point prompts directly to adjust the segmentation masks as needed. Below is an overview of the entire AI-SAM pipeline:
The detailed analysis is in the paper. We present the main results table below:
The code requires python>=3.8, pytorch>=1.7, and torchvision>=0.8.
You will also need the following packages.
scipy
scikit-learn
scikit-image
opencv-python
matplotlib
ipywidgets
notebook
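To confirm that an environment meets these requirements before running the evaluation scripts, a check along the following lines may help. This is a minimal sketch and not part of the AI-SAM repository: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes the packages were installed under the PyPI distribution names listed above, with PyTorch installed as `torch`. A variant install such as a headless OpenCV build would be reported as missing.

```python
import re
import sys
from importlib import metadata  # available in the standard library from Python 3.8

# Minimum versions stated above; None means "any installed version".
# Assumption: distribution names match PyPI (PyTorch is distributed as "torch").
REQUIREMENTS = {
    "torch": (1, 7),
    "torchvision": (0, 8),
    "scipy": None,
    "scikit-learn": None,
    "scikit-image": None,
    "opencv-python": None,
    "matplotlib": None,
    "ipywidgets": None,
    "notebook": None,
}


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Local suffixes such as "+cu117" are ignored; an unparseable string
    yields an empty tuple, which compares lower than any minimum.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def check_environment(requirements=REQUIREMENTS, min_python=(3, 8)):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("python is older than %d.%d" % min_python)
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if minimum is not None and parse_version(installed) < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


if __name__ == "__main__":
    # Print one line per problem; no output means nothing was flagged.
    for problem in check_environment():
        print(problem)
```

The check reads installed package metadata rather than importing each package, so it stays fast and does not load PyTorch.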
Prepare the dataset following MT-UNet. Then, download the pretrained weight. Finally, run the following command to reproduce the scores reported in the paper:
python eval_one_gpu.py --dataset acdc --use_amp -checkpoint [path-to-the-downloaded-weight] -model_type vit_h --tr_path [path-to-the-dataset-dir] --use_classification_head --use_lora --use_hard_point
Prepare the dataset following TransUNet. Then, download the pretrained weight. Finally, run the following command to reproduce the scores reported in the paper:
python eval_one_gpu.py --dataset synapse --use_amp -checkpoint [path-to-the-downloaded-weight] -model_type vit_h --tr_path [path-to-the-dataset-dir] --use_classification_head --use_lora --use_hard_point
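The two evaluation commands above differ only in the dataset name, the checkpoint, and the dataset directory, so they can be assembled from one helper when both need to be run. The following is a minimal sketch under that assumption. `build_eval_command` and `run_eval` are hypothetical names that do not exist in the repository, and the flags are copied verbatim from the commands above.

```python
import subprocess
import sys


def build_eval_command(dataset, checkpoint, data_dir):
    """Assemble the evaluation command as an argument list.

    Only the three values that differ between the ACDC and Synapse
    commands are parameters; every flag is taken unchanged from the
    commands shown above.
    """
    return [
        sys.executable, "eval_one_gpu.py",
        "--dataset", dataset,
        "--use_amp",
        "-checkpoint", checkpoint,
        "-model_type", "vit_h",
        "--tr_path", data_dir,
        "--use_classification_head",
        "--use_lora",
        "--use_hard_point",
    ]


def run_eval(dataset, checkpoint, data_dir):
    """Run one evaluation from the repository root; raises if it fails."""
    subprocess.run(build_eval_command(dataset, checkpoint, data_dir), check=True)
```

Calling `run_eval("acdc", ...)` and then `run_eval("synapse", ...)` with the corresponding checkpoint and dataset paths would run the two evaluations in sequence.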
Refer to this notebook for details. AI-SAM first generates a set of foreground and background points based on the chosen class, and the user can then modify the points based on the segmentation result.
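The editing step can be pictured as operations on a small list of labeled points. The sketch below is illustrative only: `PointPrompts` is a hypothetical container, not a class from the AI-SAM code or the notebook. It assumes the SAM convention of (x, y) pixel coordinates with label 1 for foreground and 0 for background.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

FOREGROUND = 1  # assumed label convention, following SAM point prompts
BACKGROUND = 0


@dataclass
class PointPrompts:
    """Editable point prompts for one class (hypothetical container)."""
    points: List[Tuple[int, int]] = field(default_factory=list)
    labels: List[int] = field(default_factory=list)

    def add(self, x, y, label):
        """Append a foreground or background point at pixel (x, y)."""
        if label not in (FOREGROUND, BACKGROUND):
            raise ValueError("label must be FOREGROUND or BACKGROUND")
        self.points.append((x, y))
        self.labels.append(label)

    def remove_nearest(self, x, y):
        """Remove and return the point closest to (x, y) with its label."""
        if not self.points:
            raise ValueError("no points to remove")
        index = min(
            range(len(self.points)),
            key=lambda i: (self.points[i][0] - x) ** 2 + (self.points[i][1] - y) ** 2,
        )
        return self.points.pop(index), self.labels.pop(index)

    def as_lists(self):
        """Return coordinates (N x 2) and labels (N) as plain lists."""
        return [list(point) for point in self.points], list(self.labels)


# Example: start from points as the model might propose them, then edit.
prompts = PointPrompts()
prompts.add(120, 80, FOREGROUND)
prompts.add(30, 200, BACKGROUND)
prompts.remove_nearest(32, 198)    # drop a misplaced background point
prompts.add(140, 95, FOREGROUND)   # add a foreground point on a missed region
coords, labels = prompts.as_lists()
```

After each edit, the updated coordinates and labels would be passed back to the model to recompute the mask for that class.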
- Add code for natural images.
This work is licensed under the Apache 2.0 license.
If you find this work useful, please cite:
@article{pan2023ai,
title={AI-SAM: Automatic and Interactive Segment Anything Model},
author={Pan, Yimu and Zhang, Sitao and Gernand, Alison D and Goldstein, Jeffery A and Wang, James Z},
journal={arXiv preprint arXiv:2312.03119},
year={2023}
}
The code is modified from MedSAM and SAM. We also used the LoRA implementation from SAMed.