SRNet

This repository contains the source code for our paper:

SRNet: Self-supervised Structure Regularization for Stereo Matching

Citation

If you find our work useful in your research, please consider citing our paper:

```
@article{ChengJ2024SRnet,
  title={SRNet: Self-supervised Structure Regularization for Stereo Matching},
  author={Cheng, Jun and Gu, Zaiwang and Liu, Weide and Fan, Jiayuan and Li, Zhengguo and Foo, Chuan-Sheng},
  journal={Neurocomputing},
  year={2025}
}
```

📢 News

2025-10-23: This paper has been accepted by Neurocomputing.

Demos

Pretrained models can be downloaded from Google Drive.

We assume the downloaded pretrained weights are placed under the `pretrained_models` directory.

You can demo a trained model on pairs of images. For example, to predict disparity for Middlebury, run

```shell
python demo_imgs.py \
    --restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
    -l=path/to/your/left_imgs \
    -r=path/to/your/right_imgs
```

Or, to demo a trained model on image pairs extracted from a video, run:

```shell
python demo_video.py \
    --restore_ckpt pretrained_models/sceneflow/sceneflow.pth \
    -l=path/to/your/left_imgs \
    -r=path/to/your/right_imgs
```
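The `-l`/`-r` directories presumably contain left and right frames matched by filename. A minimal sketch of such pairing (the helper name is ours, not the repo's):

```python
# Pair left/right stereo images by sorted filename order, as demo scripts
# for stereo matching typically do. Hypothetical helper, not from this repo.
from pathlib import Path

def pair_stereo_images(left_dir, right_dir, exts=(".png", ".jpg")):
    """Return (left, right) path pairs matched by sorted filename order."""
    left = sorted(p for p in Path(left_dir).iterdir() if p.suffix.lower() in exts)
    right = sorted(p for p in Path(right_dir).iterdir() if p.suffix.lower() in exts)
    if len(left) != len(right):
        raise ValueError(f"{len(left)} left vs {len(right)} right images")
    return list(zip(left, right))
```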

Environment

  • NVIDIA RTX 3090
  • Python 3.8

Create a virtual environment and activate it.

```shell
conda create -n IGEV python=3.8
conda activate IGEV
```

Dependencies

```shell
bash env.sh
```

Alternatively, you can install a higher version of PyTorch that supports bfloat16 training.

```shell
bash env_bfloat16.sh
```
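Note that bfloat16 training also needs GPU support: Ampere-class or newer cards (CUDA compute capability 8.0+, which includes the RTX 3090 at 8.6). A small check, where `supports_bf16` is our own helper, not part of the repo:

```python
# Check whether a GPU's compute capability supports bfloat16 (Ampere+, i.e. >= 8.0).
def supports_bf16(capability):
    """capability: a (major, minor) tuple, as returned by
    torch.cuda.get_device_capability() in PyTorch."""
    major, _minor = capability
    return major >= 8

# With PyTorch installed, you would check your device with either:
#   torch.cuda.is_bf16_supported()
#   supports_bf16(torch.cuda.get_device_capability())
```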

Required Data

To evaluate/train SRNet, you will need to download the required datasets.

By default, `core/stereo_datasets.py` searches for the datasets in the following locations.

```
├── /data
    ├── sceneflow
        ├── frames_finalpass
            ├── TRAIN
                ├── A
                ├── ...
                ├── 15mm_focallength
                ├── ...
                ├── funnyworld_augmented0_x2
                ├── ...
            ├── TEST
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test
```

You should replace the default path with your own.
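Before training, it can help to verify the layout above actually exists on disk. A small stdlib sketch (helper and directory list are ours, derived from the tree above, not from the repo's code):

```python
# Sanity-check a dataset root against the expected directory layout.
from pathlib import Path

EXPECTED_DIRS = [
    "sceneflow/frames_finalpass/TRAIN",
    "sceneflow/frames_finalpass/TEST",
    "sceneflow/disparity",
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2015/training",
    "Middlebury/trainingH",
    "ETH3D/two_view_training",
]

def missing_dataset_dirs(root="/data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that are missing under root."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]
```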

DTU

- Download the pre-processed DTU training set (provided by PatchmatchNet). The dataset is already organized as follows:

  ```
  root_directory
  ├── Cameras_1
  ├── Rectified
  └── Depths_raw
  ```

- Download our processed camera parameters from here. Unzip all the camera folders into `root_directory/Cameras_1`.

Evaluation

To evaluate on Scene Flow, Middlebury, or ETH3D, run

```shell
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow
```

or

```shell
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H
```

or

```shell
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d
```
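Stereo evaluation conventionally reports EPE (mean absolute disparity error) and D1 (the percentage of "bad" pixels). The sketch below shows these standard metrics in plain Python for illustration; it is not the repo's implementation, which may define thresholds differently:

```python
# Standard stereo-matching error metrics over flattened disparity values.
def epe(pred, gt):
    """End-point error: mean absolute disparity error in pixels."""
    errs = [abs(p - g) for p, g in zip(pred, gt)]
    return sum(errs) / len(errs)

def d1(pred, gt):
    """D1: percent of pixels whose error exceeds 3 px AND 5% of ground truth."""
    bad = sum(1 for p, g in zip(pred, gt)
              if abs(p - g) > 3 and abs(p - g) > 0.05 * g)
    return 100.0 * bad / len(gt)
```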

Training

To train on Scene Flow, run

```shell
python train_stereo.py --logdir ./checkpoints/sceneflow
```

To train on KITTI, run

```shell
python train_stereo.py --logdir ./checkpoints/kitti --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --train_datasets kitti
```

Bfloat16 Training

NaN values during training: If you encounter NaN values during training, this is likely due to overflow in float16, which can happen when large gradients or activation values exceed float16's representable range. To fix this:

- Try switching to bfloat16 with `--precision_dtype bfloat16`.
- Alternatively, use full float32 precision with `--precision_dtype float32`.

Training with bfloat16

1. Before you start training, make sure you have hardware that supports bfloat16 and the right environment set up for mixed-precision training:

   ```shell
   bash env_bfloat16.sh
   ```

2. Then train the model with bfloat16 precision:

   ```shell
   python train_stereo.py --mixed_precision --precision_dtype bfloat16
   ```
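Under the hood, bfloat16 mixed precision in PyTorch is an `autocast` region around the forward pass; unlike float16, bfloat16 keeps float32's exponent range, so no `GradScaler` is needed. A minimal sketch of one training step (a stand-in illustration, not `train_stereo.py`'s actual loop):

```python
# One mixed-precision training step with bfloat16 autocast.
# Gradients and optimizer state remain in float32; only the forward pass
# runs in bfloat16 inside the autocast region.
import torch
import torch.nn as nn

def train_step(model, optimizer, left, right, target, device_type="cuda"):
    optimizer.zero_grad()
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        # Toy "stereo" forward: concatenate left/right along channels.
        pred = model(torch.cat([left, right], dim=1))
        loss = nn.functional.l1_loss(pred, target)
    loss.backward()  # no GradScaler needed for bfloat16
    optimizer.step()
    return loss.item()
```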

Submission

For submission to the KITTI benchmark, run

```shell
python save_disp.py
```

MVS training and evaluation

To train on DTU, run

```shell
python train_mvs.py
```

To evaluate on DTU, run

```shell
python evaluate_mvs.py
```

Acknowledgements

This project is based on IGEV, PSMNet, RTNet, and GWCNet. We thank the original authors for their excellent work.
