Visual Place Recognition (VPR) is a fundamental task in robotics and computer vision, enabling systems to identify previously visited locations from visual information. Previous state-of-the-art approaches focus on encoding and retrieving semantically meaningful supersegment representations of images, which significantly improves recognition recall. However, we find that they struggle with large variations in viewpoint and scale, as well as with scenes containing sparse or limited information. Furthermore, these semantic-driven supersegment representations often discard semantically meaningless yet valuable pixel information. In this paper, we propose Sel-V, which aggregates local descriptors over dilated superpixels. This visually compact and complete representation substantially improves the robustness of segment-based methods and enhances recognition of images with large variations. To further improve robustness, we introduce MuSSel-V, an adaptive multi-scale superpixel method designed to accommodate a wide range of tasks across different domains. Extensive experiments on benchmark datasets demonstrate that our method significantly outperforms existing approaches in recall across diverse and complex environments characterised by dynamic changes or minimal scene information. Moreover, compared to existing supersegment representations, our approach achieves a notable advantage in processing speed.
Note: Currently, we only provide the standard Sel-V and MuSSel-V with pre-trained DINOv2 for feature extraction and SEEDS or SLIC for segmentation. The complete code will be released in the future.
For quick testing, we recommend downloading 17Places, VPAir, Laurel, and Hawkins from AnyLoc.
After downloading, place the datasets into the workspace. If you encounter path errors, please refer to the structure below and config.py.
```
workspace/
├── 17places/
│   ├── query/
│   ├── ref/
│   └── ...
├── laurel/
│   ├── db_images/
│   ├── q_images/
│   └── ...
├── your_custom_data/
│   ├── query/
│   ├── ref/
│   └── ...
├── features/
├── segments/
├── cache/
├── pca/
└── results/
```

We implement our experiments using Python 3.10 and PyTorch 2.4.1+cu121.
To set up the environment, run the following commands:
```
conda env create -f mussel.yaml
conda activate mussel
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```

To perform feature extraction, using laurel as an example, run:
```
python feature_extraction.py laurel --dino_extract
```

Or for custom data:

```
python feature_extraction.py <dataset> [--dino_extract]
```

Parameters:

- `<dataset>`: Name of the dataset (e.g., `laurel` or `your_custom_data`).
- `[--dino_extract]`: Use pre-trained DINOv2 for feature extraction.
Note: Code for feature extraction using other backbones, such as CLIP and fine-tuned DINO, will be released soon.
Choose between `--seeds_extract` or `--slic_extract` for segmentation:

```
python image_segmentation.py laurel --seeds_extract
```

Or for custom data:

```
python image_segmentation.py <dataset> [--seeds_extract | --slic_extract]
```

- `<dataset>`: Dataset name.
- `[--seeds_extract]`: Use the SEEDS algorithm for segmentation.
- `[--slic_extract]`: Use the SLIC algorithm for segmentation.
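Both SEEDS and SLIC produce a per-pixel integer label map assigning each pixel to a superpixel. The toy sketch below is not either algorithm; it only illustrates the label-map format the pipeline consumes, using a uniform grid (`grid_superpixels` is a hypothetical name):

```python
import numpy as np

def grid_superpixels(h, w, cell=16):
    """Toy stand-in for SEEDS/SLIC: label each pixel with the index of its
    grid cell, yielding an (h, w) integer label map in the same format a
    real superpixel algorithm would return."""
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    n_cols = -(-w // cell)  # ceiling division: number of cells per row
    return rows[:, None] * n_cols + cols[None, :]
```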
Run clustering on the extracted features:
```
python cluster_centre.py laurel
```

Or for custom data:

```
python cluster_centre.py <dataset>
```

Options:

- Use `Sp64_ao3_pca`, `Sp128_ao3_pca`, or `Sp256_ao3_pca` for Sel-V with scales of 64, 128, or 256.
- Use `SpMixed_ao3_pca` for MuSSel-V.
- More dilation functions will be released soon.
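The clustering step builds a vocabulary of cluster centres over the extracted local descriptors. A minimal Lloyd's k-means sketch in NumPy is shown below; `cluster_centre.py` may use a different implementation and initialisation, so treat this purely as an illustration:

```python
import numpy as np

def kmeans_centres(feats, k, iters=20):
    """Minimal Lloyd's k-means over row-wise descriptors.

    Returns a (k, d) array of cluster centres. Initialisation simply takes
    the first k descriptors; a real vocabulary would use a better init
    (e.g. k-means++) over many more descriptors.
    """
    feats = np.asarray(feats, dtype=float)
    centres = feats[:k].copy()
    for _ in range(iters):
        # Squared distance of every descriptor to every centre.
        d2 = ((feats[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for j in range(k):
            if (assign == j).any():
                centres[j] = feats[assign == j].mean(0)
    return centres
```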
Run PCA reduction:
```
python pca.py laurel Sp128_ao3_pca seeds tri 3
```

Or for custom data:

```
python pca.py <dataset> <experiment> <segmentation_method> <dilation_methods> <hop/order>
```

Parameters:

- `<dataset>`: Dataset name.
- `<experiment>`: Experiment setting (e.g., `Sp128_ao3_pca`).
- `<segmentation_method>`: `seeds` or `slic` (support for `sam` and `fastsam` coming soon).
- `<dilation_methods>`: Dilation function (`tri` for Delaunay Triangulation; more coming soon).
- `<hop/order>`: Neighborhood matrix order.
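PCA reduction projects descriptors onto their top principal components. A minimal SVD-based sketch is below; the names and return layout are illustrative, not the actual `pca.py` interface:

```python
import numpy as np

def pca_reduce(feats, n_components):
    """Project row-wise descriptors onto their top principal components.

    Returns (reduced, components, mean) so the same projection can be
    reused on query descriptors later.
    """
    feats = np.asarray(feats, dtype=float)
    mean = feats.mean(0)
    # SVD of the centred data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(feats - mean, full_matrices=False)
    components = vt[:n_components]
    return (feats - mean) @ components.T, components, mean
```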
Run the following command to build VLAD descriptors and perform evaluation:
```
python vpr.py laurel Sp128_ao3_pca seeds tri 3 test_name
```

Or for custom data:

```
python vpr.py <dataset> <experiment> <segmentation_method> <dilation_methods> <hop/order> <save_name>
```

Parameters:

- `<save_name>`: Name for the result file.
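For reference, a VLAD descriptor sums the residuals of each local descriptor to its nearest cluster centre and L2-normalises the result. A minimal NumPy sketch (the released `vpr.py` adds further normalisation and aggregation details not shown here):

```python
import numpy as np

def vlad(descs, centres):
    """Aggregate local descriptors into a flat VLAD vector: per-cluster
    sums of residuals to the nearest centre, then L2-normalised."""
    descs = np.asarray(descs, dtype=float)
    centres = np.asarray(centres, dtype=float)
    k, d = centres.shape
    # Hard-assign each descriptor to its nearest centre.
    d2 = ((descs[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(1)
    v = np.zeros((k, d))
    for j in range(k):
        if (assign == j).any():
            v[j] = (descs[assign == j] - centres[j]).sum(0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```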
After running the command, you can:
- View the results directly in the terminal, or
- Find them saved at `./workspace/results/<save_name>.txt`
Issue: `ModuleNotFoundError: No module named 'cv2.ximgproc'` when running `image_segmentation.py`

If you encounter the following error:

```
ModuleNotFoundError: No module named 'cv2.ximgproc'
```

Solution:

- Uninstall existing OpenCV packages:

```
pip uninstall opencv-python
pip uninstall opencv-contrib-python
```

- Reinstall `opencv-contrib-python` (this package includes `cv2.ximgproc` and other extra modules):

```
pip install opencv-contrib-python
```

After reinstalling, try running `image_segmentation.py` again. If the error persists, check that you are in the correct Python environment and that the right package version is installed.
Parts of our code are borrowed from AnyLoc and SegVLAD; we thank their authors for making the code publicly available.
Note: The code is still under development. More alternative methods will be released soon. Stay tuned!