MVSTER: Epipolar Transformer for Efficient Multi-View Stereo
This repository contains the official implementation of the paper: "MVSTER: Epipolar Transformer for Efficient Multi-View Stereo".
MVSTER is a learning-based MVS method that achieves competitive reconstruction performance with significantly higher efficiency. MVSTER leverages the proposed epipolar Transformer to learn both 2D semantics and 3D spatial associations efficiently. Specifically, the epipolar Transformer utilizes a detachable monocular depth estimator to enhance 2D semantics and uses cross-attention to construct data-dependent 3D associations along the epipolar line. Additionally, MVSTER is built in a cascade structure, where entropy-regularized optimal transport is leveraged to propagate finer depth estimations in each stage.
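To make the idea of cross-attention along the epipolar line more concrete, here is a minimal sketch using plain Python lists. It is not the implementation in this repository: the function name epipolar_cross_attention, the scaled dot-product form, and the use of the sampled source features as both keys and values are assumptions made for illustration only. The query is one reference-view feature, and the keys are source-view features sampled at the depth hypotheses, which lie on the epipolar line.

```python
import math


def epipolar_cross_attention(query, keys, values=None):
    """Hypothetical sketch: attend from one reference feature to source
    features sampled at D depth hypotheses along the epipolar line.

    query:  list of C floats (reference-view feature)
    keys:   list of D lists of C floats (source-view features)
    values: optional list of D feature lists; defaults to keys
    """
    values = keys if values is None else values
    scale = math.sqrt(len(query))

    # Scaled dot-product similarity between the query and each key.
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Numerically stable softmax over the D depth hypotheses.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the value features.
    channels = len(values[0])
    aggregated = [
        sum(w * v[c] for w, v in zip(weights, values)) for c in range(channels)
    ]
    return weights, aggregated
```

The returned weights form a distribution over depth hypotheses, so the hypothesis whose source feature best matches the reference feature receives the largest weight.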
MVSTER is tested on:
- python 3.7
- CUDA 11.1
pip install -r requirements.txt
- Download the DTU dataset. For convenience, you can download the preprocessed DTU training data and Depths_raw (both from Original MVSNet), and unzip them as the $DTU_TRAINING folder. For training and testing with raw image size, you can also download Rectified_raw and unzip it.
├── Cameras
├── Depths
├── Depths_raw
├── Rectified
├── Rectified_raw (Optional)
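Before launching training, it can help to confirm that the folder matches the layout above. The following is a small sketch, not part of this repository; the function name check_dtu_training_layout is hypothetical, and the folder names are taken from the tree shown above.

```python
from pathlib import Path

# Folder names copied from the layout listed above.
REQUIRED_DIRS = ("Cameras", "Depths", "Depths_raw", "Rectified")
OPTIONAL_DIRS = ("Rectified_raw",)


def check_dtu_training_layout(root):
    """Return (missing required folders, optional folders that are present)."""
    root = Path(root)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    optional_present = [name for name in OPTIONAL_DIRS if (root / name).is_dir()]
    return missing, optional_present
```

For example, calling check_dtu_training_layout with the path of your $DTU_TRAINING folder should return an empty first list when all required folders are in place.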
In scripts/train_dtu.sh, set DTU_TRAINING as $DTU_TRAINING.
Train MVSTER (Multi-GPU training):
- Train with middle size (512x640):
bash ./scripts/train_dtu.sh mid exp_name
- Train with raw size (1024x1280):
bash ./scripts/train_dtu.sh raw exp_name
After training, you will get model checkpoints in ./checkpoints/dtu/exp_name.
- Download the preprocessed DTU testing data (from Original MVSNet) and unzip it as the $DTU_TESTPATH folder, which should contain one cams folder, one images folder and one pair.txt file.
- In scripts/test_dtu.sh, set DTU_TESTPATH as $DTU_TESTPATH.
- The DTU_CKPT_FILE is automatically set to your pretrained checkpoint file; you can also download my pretrained model.
- Test with middle size:
bash ./scripts/test_dtu.sh mid exp_name
- Test with raw size:
bash ./scripts/test_dtu.sh raw exp_name
- Test with provided pretrained model:
bash scripts/test_dtu.sh mid benchmark --loadckpt PATH_TO_CKPT_FILE
After testing, you will get the reconstructed point clouds of the DTU test set in ./outputs/dtu/exp_name.
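The pair.txt file mentioned in the testing steps lists, for each reference view, the source views used for matching. The sketch below parses it under the assumption that the file follows the MVSNet convention: the first line gives the number of views, followed by two lines per view (the reference view index, then the number of source views followed by alternating view index and score). This layout and the function name parse_pair_text are assumptions, not taken from this repository, so check them against your data.

```python
def parse_pair_text(text):
    """Parse pair.txt content, assuming the MVSNet-style layout.

    Returns a list of (ref_view, [src_view, ...]) tuples.
    """
    # Drop blank lines and split every remaining line into tokens.
    lines = [line.split() for line in text.splitlines() if line.strip()]
    num_views = int(lines[0][0])

    pairs = []
    for i in range(num_views):
        ref_view = int(lines[1 + 2 * i][0])
        src_tokens = lines[2 + 2 * i]
        num_src = int(src_tokens[0])
        # Tokens alternate: view index, score, view index, score, ...
        src_views = [int(src_tokens[1 + 2 * j]) for j in range(num_src)]
        pairs.append((ref_view, src_views))
    return pairs
```

A typical use would be parse_pair_text(open(path).read()), where path points to the pair.txt inside your $DTU_TESTPATH folder.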
- For quantitative evaluation, download SampleSet and Points from DTU's website. Unzip them and place the Points folder in SampleSet/MVS Data/. The structure looks like:
SampleSet
├──MVS Data
└──Points
- For convenient evaluation, install MATLAB (tested on Ubuntu 18.04) and uncomment the mrun_rst function at the end of ./test_mvs4.py. You also need to change the path to the MATLAB executable (for me, it is /mnt/cfs/algorithm/xiaofeng.wang/jeff/code/MVS/misc/matlab/bin/matlab). The point cloud reconstruction results are then evaluated once testing is finished.
- You can also evaluate the metrics with the traditional steps: in evaluations/dtu/BaseEvalMain_web.m, set dataPath as the path to SampleSet/MVS Data/, plyPath as the directory that stores the reconstructed point clouds, and resultsPath as the directory to store the evaluation results. Then run evaluations/dtu/BaseEvalMain_web.m in MATLAB.
Method | Acc. (mm) | Comp. (mm) | Overall (mm) | Inf. Time |
---|---|---|---|---|
MVSTER (mid size) | 0.350 | 0.276 | 0.313 | 0.09s |
MVSTER (raw size) | 0.325 | 0.385 | 0.355 | 0.17s |
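On DTU, the Overall score is commonly reported as the mean of accuracy (Acc.) and completeness (Comp.), and the numbers in the table are consistent with that. The short sketch below recomputes it from the table values; the function name overall_score is hypothetical.

```python
def overall_score(acc, comp):
    """Mean of accuracy and completeness (lower is better)."""
    return (acc + comp) / 2.0


# (Acc., Comp.) values copied from the table above.
results = {
    "MVSTER (mid size)": (0.350, 0.276),
    "MVSTER (raw size)": (0.325, 0.385),
}

for name, (acc, comp) in results.items():
    print(f"{name}: overall = {overall_score(acc, comp):.3f}")
```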
Point cloud results on DTU, Tanks and Temples, ETH3D
If you find this project useful for your research, please cite:
@misc{wang2022mvster,
title={MVSTER: Epipolar Transformer for Efficient Multi-View Stereo},
author={Xiaofeng Wang and Zheng Zhu and Fangbo Qin and Yun Ye and Guan Huang and Xu Chi and Yijia He and Xingang Wang},
journal={arXiv preprint arXiv:TODO},
year={2022}
}
Our work is partially based on these open-source projects: MVSNet, MVSNet-pytorch, cascade-stereo, PatchmatchNet.
We appreciate their contributions to the MVS community.