This is the code repository for DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch.
## Requirements
- PyTorch (0.4.1+)
- Python (2.7)
- scikit-image
- tensorboardX
- torchvision (0.2.0+)
## License

- The source code for DeepPruner and Differentiable PatchMatch is © Uber, 2018-2019, and is released under the Uber Non-Commercial License.
- The trained model weights for DeepPruner are released under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
## Model Weights

DeepPruner was first trained on the Sceneflow dataset and then fine-tuned on KITTI (the combined 394 images of KITTI 2012 and KITTI 2015).
- DeepPruner-fast (KITTI)
- DeepPruner-best (KITTI)
- DeepPruner-fast (Sceneflow)
- DeepPruner-best (Sceneflow)
NOTE: Users may modify a number of model parameters and training settings for their own purposes; if you change the model parameters, you may need to retrain the model. Check 'models/config.py' for more details.
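The released weights are standard PyTorch checkpoints; below is a minimal loading sketch, assuming (as in PSMNet-style training scripts) that the checkpoint is a dict holding the weights under a 'state_dict' key. The import path and checkpoint filename are assumptions; check the loading code in submission_kitti.py for the exact format.

```python
import torch

# Assumed import path; adjust to match the repository layout.
from models.deeppruner import DeepPruner

model = DeepPruner()
# Hypothetical filename for a downloaded checkpoint.
checkpoint = torch.load('DeepPruner-best-kitti.tar', map_location='cpu')
# Fall back to the raw dict if there is no 'state_dict' key.
state_dict = checkpoint.get('state_dict', checkpoint)
# Strip a possible 'module.' prefix left over from nn.DataParallel training.
state_dict = {k.replace('module.', '', 1): v for k, v in state_dict.items()}
model.load_state_dict(state_dict)
model.eval()
```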
## Training and Evaluation

### KITTI Stereo Dataset

KITTI 2015 provides 200 stereo pairs with ground-truth disparities. We used 160 of these 200 for training and the remaining 40 for validation. The training set was further augmented with 194 stereo pairs from the KITTI 2012 dataset.
- Download the KITTI 2012 and KITTI 2015 datasets.
- Split the KITTI Stereo 2015 training dataset into "training" (160 pairs) and "validation" (40 pairs), following the same directory structure as the original dataset (see the split sketch after this list). Make sure to have the following structure:
```
training_directory_stereo_2015
|----- image_2
|----- image_3
|----- disp_occ_0
val_directory_stereo_2015
|----- image_2
|----- image_3
|----- disp_occ_0
train_directory_stereo_2012
|----- colored_0
|----- colored_1
|----- disp_occ
test_directory
|----- image_2
|----- image_3
```
- Note that any other dataset can be used for training, validation, and testing, provided the directory structure is the same.
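The 160/40 split itself is not scripted in this repository; the following is a minimal sketch, assuming the standard KITTI 2015 layout and an arbitrary "last 40 scenes for validation" split (any other split works as long as the directory structure above is preserved):

```python
import os
import shutil

SRC = 'kitti2015/training'               # original KITTI 2015 training dir
TRAIN = 'training_directory_stereo_2015'
VAL = 'val_directory_stereo_2015'

for sub in ('image_2', 'image_3', 'disp_occ_0'):
    for root in (TRAIN, VAL):
        if not os.path.isdir(os.path.join(root, sub)):
            os.makedirs(os.path.join(root, sub))
    for fname in sorted(os.listdir(os.path.join(SRC, sub))):
        # KITTI 2015 frames are named '000000_10.png' ... '000199_11.png';
        # send scene indices 160-199 to validation, the rest to training.
        dst = VAL if int(fname.split('_')[0]) >= 160 else TRAIN
        shutil.copy(os.path.join(SRC, sub, fname),
                    os.path.join(dst, sub, fname))
```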
As in previous works, we fine-tuned the pre-trained Sceneflow model on the KITTI dataset.
Training Command:

```bash
python finetune_kitti.py \
    --loadmodel <path_to_sceneflow_model> \
    --savemodel <path_to_save_directory_for_trained_models> \
    --train_datapath_2015 <training_directory_stereo_2015> \
    --val_datapath_2015 <val_directory_stereo_2015> \
    --datapath_2012 <directory_stereo_2012>
```
Training command arguments:
- --loadmodel (default: None): If not set, the model trains from scratch.
- --savemodel (default: './'): If not set, the script saves the trained model after every epoch in the current directory.
- --train_datapath_2015 (default: None): If not set, the KITTI stereo 2015 dataset is not used for training.
- --val_datapath_2015 (default: None): If not set, the script will fail; the validation dataset must contain at least one image pair.
- --datapath_2012 (default: None): If not set, the KITTI stereo 2012 dataset is not used for training.
- Training and validation TensorBoard runs are saved in the './runs/' directory.
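The logged TensorBoard runs can be inspected with, for example, `tensorboard --logdir ./runs`.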
- We used the KITTI 2015 Stereo testing set for evaluation. (Note that any other dataset could be used.)
- To evaluate DeepPruner on any dataset, just create a base directory like:
```
test_directory
|----- image_2
|----- image_3
```
The image_2 folder holds the left images, while the image_3 folder holds the right images.
For evaluation, update the "mode" parameter in "models/config.py" to "evaluation".
Evaluation command:

```bash
python submission_kitti.py \
    --loadmodel <path_to_trained_model> \
    --save_dir <directory_to_store_disparity_output> \
    --datapath <test_directory>
```
- The metrics used for evaluation are the same as those used by the KITTI Stereo benchmark (see the sketch below).
- The quantitative results obtained by DeepPruner are as follows. (Note: the standings in the tables below are as of March 2019.)
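For reference, KITTI's headline D1 metric counts a pixel as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. A minimal sketch, assuming dense numpy arrays with invalid ground truth encoded as 0:

```python
import numpy as np

def d1_error(disp_est, disp_gt):
    """KITTI D1 outlier rate: fraction of valid pixels whose disparity
    error exceeds both 3 px and 5% of the ground-truth disparity."""
    valid = disp_gt > 0                      # KITTI encodes missing GT as 0
    err = np.abs(disp_est[valid] - disp_gt[valid])
    return np.mean((err > 3.0) & (err > 0.05 * disp_gt[valid]))
```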
Alongside the disparity (or depth) maps, DeepPruner efficiently predicts the uncertain regions (occluded regions, bushy regions, object edges). Since the uncertainty in the prediction correlates well with the error in the disparity maps (Figure 7), such uncertainty can be used in other downstream tasks (see the sketch below).
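As a hedged illustration of that correlation (not the paper's exact measurement), one could compute the agreement between a per-pixel uncertainty map, e.g. the width of the pruned disparity search range, and the actual disparity error; the map names below are hypothetical:

```python
import numpy as np

def uncertainty_error_correlation(uncertainty, disp_est, disp_gt):
    """Pearson correlation between a per-pixel uncertainty map and the
    absolute disparity error, over pixels with valid ground truth."""
    valid = disp_gt > 0
    err = np.abs(disp_est[valid] - disp_gt[valid])
    return np.corrcoef(uncertainty[valid], err)[0, 1]
```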
Qualitative results are as follows:
### Sceneflow Dataset

Download the Sceneflow dataset, which consists of FlyingThings3D, Driving, and Monkaa (RGB images (cleanpass) and disparity).
We followed the same directory structure as the downloaded data; check dataloader/sceneflow_collector.py for details.
Training command:

```bash
python train_sceneflow.py \
    --loadmodel <path_to_trained_model> \
    --save_dir <directory_to_store_disparity_output> \
    --savemodel <directory_to_store_trained_models_every_epoch> \
    --datapath_monkaa <monkaa_dataset> \
    --datapath_flying <flying_things> \
    --datapath_driving <driving_dataset>
```
- We used EPE (end-point error) as one of the evaluation metrics (see the sketch below).
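A minimal sketch of EPE, the mean absolute difference between estimated and ground-truth disparities over valid pixels (the max-disparity cutoff of 192 is a common convention in stereo work, not taken from this repo):

```python
import numpy as np

def end_point_error(disp_est, disp_gt, max_disp=192):
    """Mean absolute disparity error over valid ground-truth pixels."""
    valid = (disp_gt > 0) & (disp_gt < max_disp)
    return np.abs(disp_est[valid] - disp_gt[valid]).mean()
```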
### Robust Vision Challenge

The goal of the Robust Vision Challenge is to foster the development of vision systems that are robust and consequently perform well on a variety of datasets with different characteristics. Please refer to the Robust Vision Challenge website for more details.
We used the pre-trained Sceneflow model and then jointly fine-tuned it on the KITTI, ETH3D, and Middlebury datasets.
- Dataloader and setup details coming soon.
- Check DeepPruner_ROB on the KITTI benchmark.
- Check DeepPruner_ROB on the ETH3D benchmark.
- Check DeepPruner_ROB on the MiddleburyV3 benchmark.