SAM-RSP: A New Few-Shot Segmentation Method Based on Segment Anything Model and Rough Segmentation Prompts

Abstract: Few-shot segmentation (FSS) aims to segment novel classes with only a few labeled images. The backbones used in existing methods are pre-trained on ImageNet classification. Although these backbones effectively perceive the semantic categories of images, they cannot accurately perceive the regional boundaries within an image, which limits model performance. Recently, the Segment Anything Model (SAM) has achieved precise image segmentation from point or box prompts, thanks to its excellent perception of region boundaries within an image. However, it cannot effectively provide the semantic information of images. This paper proposes a new few-shot segmentation method that effectively perceives both semantic categories and regional boundaries. The method first uses the SAM encoder to perceive regions and obtain the query embedding. Then the support and query images are fed into an ImageNet-pretrained backbone to perceive semantics and generate a rough segmentation prompt (RSP). The query embedding is combined with the prompt to generate a pixel-level query prototype, which better matches the query embedding. Finally, the query embedding, prompt, and prototype are fed into the designed multi-layer prompt transformer decoder, which is efficient, lightweight, and produces more accurate segmentation results. In addition, other methods can easily be combined with our framework to improve their performance. Extensive experiments on PASCAL-5i and COCO-20i under 1-shot and 5-shot settings demonstrate the effectiveness of our method, which achieves new state-of-the-art results.
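The pipeline described above can be sketched roughly as follows. This is a hypothetical PyTorch sketch for orientation only: module names, tensor shapes, and the internals of the prompt generator and decoder are assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAMRSPSketch(nn.Module):
    """Hypothetical sketch of the SAM-RSP pipeline (not the official code)."""

    def __init__(self, sam_encoder, semantic_backbone, decoder):
        super().__init__()
        self.sam_encoder = sam_encoder              # SAM image encoder (region boundaries), typically frozen
        self.semantic_backbone = semantic_backbone  # ImageNet-pretrained backbone (semantics)
        self.decoder = decoder                      # lightweight multi-layer prompt transformer decoder

    def forward(self, query_img, support_img, support_mask):
        # 1) Region-aware query embedding from the SAM encoder: (B, C, H, W)
        query_emb = self.sam_encoder(query_img)

        # 2) Rough segmentation prompt (RSP) from the semantic backbone, e.g. by
        #    comparing query features with masked support features (placeholder below).
        rsp = self.rough_segmentation_prompt(query_img, support_img, support_mask)  # (B, 1, h, w)
        rsp = F.interpolate(rsp, size=query_emb.shape[-2:], mode="bilinear", align_corners=False)

        # 3) Query prototype: pool the query embedding over the rough foreground
        #    region, so the prototype lives in the query's own feature space.
        weights = rsp / (rsp.sum(dim=(-2, -1), keepdim=True) + 1e-6)
        prototype = (query_emb * weights).sum(dim=(-2, -1))  # (B, C)

        # 4) Decode the final mask from embedding + prompt + prototype.
        return self.decoder(query_emb, rsp, prototype)       # (B, 2, H, W) logits

    def rough_segmentation_prompt(self, query_img, support_img, support_mask):
        # Placeholder: cosine similarity between query features and the
        # masked-average support prototype from the semantic backbone.
        q_feat = self.semantic_backbone(query_img)    # (B, C, h, w)
        s_feat = self.semantic_backbone(support_img)  # (B, C, h, w)
        m = F.interpolate(support_mask.float(), size=s_feat.shape[-2:], mode="nearest")
        s_proto = (s_feat * m).sum(dim=(-2, -1)) / (m.sum(dim=(-2, -1)) + 1e-6)  # (B, C)
        sim = F.cosine_similarity(q_feat, s_proto[..., None, None], dim=1)
        return sim.unsqueeze(1).clamp(min=0)          # (B, 1, h, w)
```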

Dependencies

  • RTX 3090
  • Python 3.8
  • PyTorch 1.12.0
  • CUDA 11.6
  • torchvision 0.13.0
  • tensorboardX 2.2
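The following minimal snippet (not part of the repo) can be used to verify that the installed versions match this list:

```python
import torch
import torchvision

# Sanity check of the environment against the versions listed above.
print("PyTorch:", torch.__version__)             # expected 1.12.0
print("torchvision:", torchvision.__version__)   # expected 0.13.0
print("CUDA runtime:", torch.version.cuda)       # expected 11.6
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))  # e.g. an RTX 3090
```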

Datasets

  • PASCAL-5i: VOC2012 + SBD

  • COCO-20i: COCO2014

  • Put the datasets into the data/ directory.

  • Run util/get_mulway_base_data.py to generate base annotations and put them into the data/base_annotation/ directory. (Only used when the coarse segmentation prompt generator is BAM.)
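For reference, the standard PASCAL-5i protocol splits the 20 VOC classes into 4 folds of 5 novel classes each; a minimal sketch is given below (the repo's dataloader may index classes differently, and COCO-20i uses its own 4-fold split of 80 classes):

```python
# Standard PASCAL-5i split: 20 classes, 4 folds, 5 novel classes per fold.
# Fold i holds out classes 5*i+1 .. 5*i+5 for testing; the rest are base classes.
def pascal5i_split(fold: int):
    assert 0 <= fold <= 3
    novel = list(range(5 * fold + 1, 5 * fold + 6))       # e.g. fold 0 -> [1, 2, 3, 4, 5]
    base = [c for c in range(1, 21) if c not in novel]
    return base, novel

base_classes, novel_classes = pascal5i_split(fold=0)
print(novel_classes)  # [1, 2, 3, 4, 5]
```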

Models

  • Download the pre-trained VGG16 and ResNet50 encoders and put them into the initmodel/ directory.
  • Download the pre-trained SAM encoder and put it into the initmodel/SAM_encoder/ directory.
  • Download the pre-trained BAM models and put them into the initmodel/BAM_models/ directory. (Only used when the coarse segmentation prompt generator is BAM.)
  • Download the pre-trained base learners from BAM and put them under initmodel/PSPNet/. (Only used when the coarse segmentation prompt generator is BAM.)
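A hypothetical loading sketch is shown below; the filenames and loading calls are assumptions (the repo's own model-building code and any modified backbone definitions should be preferred):

```python
import torch
import torchvision

# Hypothetical example of loading the ImageNet-pretrained encoders from initmodel/.
# File names are assumptions; the repo may use modified backbone definitions.
resnet50 = torchvision.models.resnet50()
resnet50.load_state_dict(torch.load("initmodel/resnet50.pth", map_location="cpu"), strict=False)

vgg16 = torchvision.models.vgg16_bn()
vgg16.load_state_dict(torch.load("initmodel/vgg16_bn.pth", map_location="cpu"), strict=False)

# SAM image encoder weights (path is an assumption):
sam_state = torch.load("initmodel/SAM_encoder/sam_encoder.pth", map_location="cpu")
```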

Scripts

  • Change the configuration and add the weight path in the .yaml files in the config/ directory, then run train.sh for training or test.sh for testing.
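As an illustration, a minimal sketch of editing a config file programmatically; the file name and field keys below are assumptions, not the repo's actual configuration schema:

```python
import yaml

# Load a config, point it at a trained checkpoint, and write it back.
# "weight" and the file path are hypothetical; check the actual .yaml keys in config/.
cfg_path = "config/pascal/pascal_split0_resnet50.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["weight"] = "exp/pascal/split0/best.pth"  # hypothetical checkpoint path
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
```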

Performance

Performance comparison with state-of-the-art approaches in terms of average mIoU across all folds.

  1. PASCAL-5i

     | Backbone | Method         | 1-shot        | 5-shot        |
     |----------|----------------|---------------|---------------|
     | VGG16    | MIANet         | 67.10         | 71.99         |
     | VGG16    | SAM-RSP (ours) | 69.29 (+2.19) | 73.86 (+1.87) |
     | ResNet50 | HDMNet         | 69.40         | 71.80         |
     | ResNet50 | SAM-RSP (ours) | 70.76 (+1.36) | 74.15 (+2.35) |

  2. COCO-20i

     | Backbone | Method         | 1-shot        | 5-shot        |
     |----------|----------------|---------------|---------------|
     | VGG16    | HDMNet         | 45.90         | 52.40         |
     | VGG16    | SAM-RSP (ours) | 48.79 (+2.89) | 54.15 (+1.75) |
     | ResNet50 | MIANet         | 47.66         | 51.65         |
     | ResNet50 | SAM-RSP (ours) | 49.84 (+2.18) | 55.38 (+3.73) |

References

This repo is mainly built on SAM and BAM. Thanks for their great work!

This paper has been accepted by the Image and Vision Computing journal.
