stanley-313/ImageSegFM-Survey

Image Segmentation in Foundation Model Era: A Survey

Tianfei Zhou , Wang Xia , Fei Zhang , Boyu Chang , Wenguan Wang , Ye Yuan , Ender Konukoglu , Daniel Cremers

arXiv PDF Project Page


This repository compiles a collection of resources on image segmentation in the foundation model era and will be continuously updated to track developments in the field. Please feel free to submit a pull request if you find any work missing.

1. Introduction

Image segmentation is a long-standing challenge in computer vision, studied continuously over several decades, as evidenced by seminal algorithms such as N-Cut, FCN, and MaskFormer. With the advent of foundation models (FMs), contemporary segmentation methods have entered a new epoch by either adapting FMs (e.g., CLIP, Stable Diffusion, DINO) for image segmentation or developing dedicated segmentation foundation models (e.g., SAM, SAM2). These approaches not only deliver superior segmentation performance, but also bring segmentation capabilities previously unseen in the deep learning context. However, current research in image segmentation lacks a detailed analysis of the distinct characteristics, challenges, and solutions associated with these advancements. This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation. We investigate two basic lines of research (as shown in the following figure): generic image segmentation (i.e., semantic segmentation, instance segmentation, panoptic segmentation) and promptable image segmentation (i.e., interactive segmentation, referring segmentation, few-shot segmentation), delineating their respective task settings, background concepts, and key challenges. Furthermore, we provide insights into the emergence of segmentation knowledge from FMs like CLIP, Stable Diffusion, and DINO. An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts. Finally, we discuss open issues and potential avenues for future research.


2. Segmentation Knowledge Emerges From FMs

Given the emergent capabilities of LLMs, a natural question arises: do segmentation properties emerge from FMs? The answer is yes, even for FMs not explicitly designed for segmentation, such as CLIP, DINO, and diffusion models. This also unlocks a new frontier in image segmentation, i.e., acquiring segmentation without any training. The following figure illustrates how to approach this and shows some examples:
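To make the idea of training-free segmentation concrete, here is a minimal sketch of one common recipe: label each image patch with the class whose embedding is most cosine-similar to the patch feature. It is an illustration under stated assumptions, not a method prescribed by the survey. The function name `training_free_segment` and the arrays `patch_feats` and `class_embeds` are hypothetical, and synthetic NumPy arrays stand in for the dense features and text embeddings a real FM (e.g., CLIP) would provide.

```python
import numpy as np

def training_free_segment(patch_feats, class_embeds, grid_hw):
    """Label each patch with its most cosine-similar class (no training).

    patch_feats:  (h*w, d) array, one feature vector per image patch.
    class_embeds: (k, d) array, one embedding per candidate class.
    grid_hw:      (h, w) shape of the patch grid.
    """
    h, w = grid_hw
    # L2-normalise both sides so the dot product equals cosine similarity.
    p = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    c = class_embeds / np.linalg.norm(class_embeds, axis=1, keepdims=True)
    sim = p @ c.T  # (h*w, k) similarity of every patch to every class
    return sim.argmax(axis=1).reshape(h, w)

# Synthetic stand-ins (assumption): two classes in a 16-dim feature space.
rng = np.random.default_rng(0)
class_embeds = np.eye(2, 16)                 # class 0 and class 1 directions
labels_true = np.zeros((4, 4), dtype=int)
labels_true[:, 2:] = 1                       # right half of the grid is class 1
# Each patch feature is its class embedding plus small noise.
patch_feats = class_embeds[labels_true.ravel()] + 0.05 * rng.normal(size=(16, 16))

mask = training_free_segment(patch_feats, class_embeds, (4, 4))
print(mask)
```

With real models, `patch_feats` would come from a frozen image encoder and `class_embeds` from encoded class-name prompts. The coarse patch-level mask is then typically upsampled to the image resolution.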


3. Foundation Model based GIS (Generic Image Segmentation)


4. Foundation Model based PIS (Promptable Image Segmentation)

Citation

If you find our survey and repository useful for your research, please consider citing our paper:

@article{zhou2024SegFMSurvey,
    title={Image Segmentation in Foundation Model Era: A Survey},
    author={Zhou, Tianfei and Xia, Wang and Zhang, Fei and Chang, Boyu and Wang, Wenguan and Yuan, Ye and Konukoglu, Ender and Cremers, Daniel},
    journal={arXiv preprint arXiv:2408.12957},
    year={2024},
}
