by Bo Dong, Wenhai Wang, Jinpeng Li, Deng-Ping Fan.
This repo is the official implementation of "Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers".
Polyp-PVT was initially described on arXiv.
Most polyp segmentation methods use CNNs as their backbone, leading to two key issues when exchanging information between the encoder and decoder: 1) taking into account the differences in contribution between different-level features; and 2) designing an effective mechanism for fusing these features. Different from existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the image acquisition influence and elusive properties of polyps, we introduce three novel modules: a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). Among these, the CFM is used to collect the semantic and location information of polyps from high-level features, while the CIM is applied to capture polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noise in the features and significantly improves their expressive capabilities.
Polyp-PVT achieves strong performance on image-level polyp segmentation (0.808 mean Dice and 0.727 mean IoU on ColonDB) and video polyp segmentation (0.880 mean Dice and 0.802 mean IoU on CVC-300-TV), surpassing previous models by a large margin.
We also provide results of baseline methods. You can download them from Google Drive/Baidu Drive [code:qw9i]; the archive includes our results and those of the compared models.
We also provide results of baseline methods. You can download them from Google Drive/Baidu Drive [code:rtvt]; the archive includes our results and those of the compared models.
Python 3.8
PyTorch 1.7.1
torchvision 0.8.2
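The installed package versions can be compared against the list above with a short script. This is a minimal sketch and not part of the repository: the `REQUIRED` table and the `check_versions` helper are hypothetical names, and accepting local build suffixes such as `1.7.1+cu110` is an assumption about how strict the check should be.

```python
import sys
from importlib import metadata

# Versions listed in this README; the table itself is illustrative.
REQUIRED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def check_versions(required=REQUIRED):
    """Return {package: (installed_or_None, wanted, matches)} without importing the packages."""
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Accept an exact match or a local build suffix, e.g. "1.7.1+cu110".
        ok = installed is not None and (
            installed == wanted or installed.startswith(wanted + "+")
        )
        report[name] = (installed, wanted, ok)
    return report


if __name__ == "__main__":
    print("python", "%d.%d" % sys.version_info[:2], "(README lists 3.8)")
    for name, (installed, wanted, ok) in check_versions().items():
        print(name, installed, "expected", wanted, "OK" if ok else "MISMATCH")
```

Other versions may also work; the script only reports differences from the versions listed here.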
Download the training and testing datasets from Google Drive/Baidu Drive [code:sydz] and move them into ./dataset/.
Download the pretrained model from Google Drive/Baidu Drive [code:w4vk] and put it in the './pretrained_pth' folder for initialization.
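Before training, it can help to confirm that both folders are populated. The following is a minimal sketch and not part of the repository: the helper `missing_dirs` is a hypothetical name, it assumes the script is run from the repository root, and it only checks that each folder exists and is non-empty, not that the contents are the right files.

```python
from pathlib import Path

# Folder names taken from the download instructions above.
EXPECTED_DIRS = ("dataset", "pretrained_pth")


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected folders that are missing or empty under root."""
    root = Path(root)
    missing = []
    for name in expected:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Dataset and pretrained-weights folders are in place.")
```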
Clone the repository:
git clone https://github.com/DengPingFan/Polyp-PVT.git
cd Polyp-PVT
bash train.sh
cd Polyp-PVT
bash test.sh
Matlab: Please refer to the work of MICCAI2020 (link).
Python: Please refer to the work of ACMMM2021 (link).
Please note that the results in our paper were evaluated with the Matlab version.
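For a quick sanity check of a single prediction, Dice and IoU on a pair of binary masks can be computed as below. This is a minimal sketch, not the Matlab or Python toolbox referenced above: the function name `dice_iou`, the nested-list mask format, and the choice to return 1.0 when both masks are empty are assumptions. The toolboxes may aggregate scores differently (for example over thresholds or images), so the numbers need not match those reported in the paper.

```python
def dice_iou(pred, gt):
    """Dice and IoU for two binary masks given as equally sized nested lists of 0/1."""
    inter = pred_sum = gt_sum = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            inter += p and g      # pixel is foreground in both masks
            pred_sum += p         # foreground pixels in the prediction
            gt_sum += g           # foreground pixels in the ground truth
    union = pred_sum + gt_sum - inter
    if union == 0:
        # Both masks are empty; treating this as a perfect match is an assumption.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + gt_sum)
    iou = inter / union
    return dice, iou


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    gt = [[1, 0, 0],
          [0, 1, 1]]
    print(dice_iou(pred, gt))
```

In this toy example the masks share 2 foreground pixels out of 3 each, giving a Dice of 2/3 and an IoU of 0.5.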
You can download the trained model from Google Drive/Baidu Drive [code:9rpy] and put it in the './model_pth' directory.
Google Drive/Baidu Drive [code:x3jc]
@article{dong2023PolypPVT,
title={Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers},
author={Dong, Bo and Wang, Wenhai and Fan, Deng-Ping and Li, Jinpeng and Fu, Huazhu and Shao, Ling},
journal={CAAI AIR},
year={2023}
}
We are very grateful for these excellent works: PraNet, EAGRNet, and MSEG, which have provided the basis for our framework.
If you would like to help improve usability or have any advice, please feel free to contact me directly (bodong.cv@gmail.com).
The source code is free for research and education use only. Any commercial use requires formal permission first.