Commit

release evaluation code
Junyi42 committed Oct 20, 2024
1 parent 649195a commit 6117b33
Showing 14 changed files with 754 additions and 349 deletions.
37 changes: 31 additions & 6 deletions README.md
@@ -18,12 +18,12 @@ Arxiv, 2024. [**[Project Page]**](https://monst3r-project.github.io/) [**[Paper]
[![Watch the video](assets/fig1_teaser.png)](https://monst3r-project.github.io/files/teaser_vid_v2_lowres.mp4)

## TODO
- [x] Release model weights on [Google Drive](https://drive.google.com/file/d/1Z1jO_JmfZj0z3bgMvCwqfUhyZ1bIbc9E/view?usp=sharing) and [Hugging Face](https://huggingface.co/Junyi42/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt)
- [x] Release model weights on [Google Drive](https://drive.google.com/file/d/1Z1jO_JmfZj0z3bgMvCwqfUhyZ1bIbc9E/view?usp=sharing) and [Hugging Face](https://huggingface.co/Junyi42/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt) (10/07)
- [x] Release inference code for global optimization (10/18)
- [x] Release 4D visualization code (10/18)
- [x] Release training code & dataset preparation (10/19)
- [ ] Release evaluation code (est. time: 10/21)
- [ ] Gradio Demo (est. time: 10/28)
- [x] Release evaluation code (10/20)
- [ ] Gradio Demo

## Getting Started

@@ -102,9 +102,34 @@ python viser/visualizer_monst3r.py --data demo_tmp/lady-running
# to remove the floaters of foreground: --init_conf --fg_conf_thre 1.0 (thre can be adjusted)
```

### Training
## Evaluation

First, please refer to the [prepare_training.md](data/prepare_training.md) for preparing the pretrained models and training/evaluation datasets.
We provide here an example of joint dense reconstruction and camera pose estimation on the **DAVIS** dataset.

First, download the dataset:
```bash
cd data; python download_davis.py; cd ..
```

Then, run the evaluation script:
```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=davis --output_dir="results/davis_joint"
# To use the ground truth dynamic mask, add: --use_gt_mask
```

You can then use `viser` to visualize the results:
```bash
python viser/visualizer_monst3r.py --data results/davis_joint/bear
```

#### For the complete scripts to evaluate camera pose, video depth, and single-frame depth estimation on the **Sintel**, **Bonn**, **KITTI**, **NYU-v2**, **TUM-dynamics**, **ScanNet**, and **DAVIS** datasets, please refer to [evaluation_script.md](data/evaluation_script.md).


## Training

Please refer to [prepare_training.md](data/prepare_training.md) to prepare the pretrained models and the training/testing datasets.

Then, you can train the model using the following command:
```bash
@@ -133,4 +158,4 @@ If you find our work useful, please cite:
```

## Acknowledgements
Our code is based on [DUSt3R](https://github.com/naver/dust3r) and [CasualSAM](https://github.com/ztzhang/casualSAM), our camera pose estimation evaluation script is based on [LEAP-VO](https://github.com/chiaki530/leapvo), and our visualization code is based on [Viser](https://github.com/nerfstudio-project/viser). We thank the authors for their excellent work!
1 change: 0 additions & 1 deletion data/download_sintel.sh
@@ -17,4 +17,3 @@ cd ..
# conda activate monst3r
# cd ..
# python datasets_preprocess/sintel_get_dynamics.py --threshold 0.1 --save_dir dynamic_label_perfect
# python datasets_preprocess/sintel_get_dynamics.py --continuous --save_dir dynamic_label_continuous
171 changes: 171 additions & 0 deletions data/evaluation_script.md
@@ -0,0 +1,171 @@
# Dataset Preparation for Evaluation

We provide scripts to download and prepare the datasets for evaluation. The datasets include: **Sintel**, **Bonn**, **KITTI**, **NYU-v2**, **TUM-dynamics**, **ScanNetv2**, and **DAVIS**.

> [!NOTE]
> The scripts provided here are for reference only. Please ensure you have obtained the necessary licenses from the original dataset providers before proceeding.

## Download Datasets

### Sintel
To download and prepare the **Sintel** dataset, execute:
```bash
cd data
bash download_sintel.sh
cd ..

# (optional) generate the GT dynamic mask (run from the repository root)
python datasets_preprocess/sintel_get_dynamics.py --threshold 0.1 --save_dir dynamic_label_perfect
```

### Bonn
To download and prepare the **Bonn** dataset, execute:
```bash
cd data
bash download_bonn.sh
cd ..

# create the subset for video depth evaluation, following DepthCrafter
cd datasets_preprocess
python prepare_bonn.py
cd ..
```

### KITTI
To download and prepare the **KITTI** dataset, execute:
```bash
cd data
bash download_kitti.sh
cd ..

# create the subset for video depth evaluation, following DepthCrafter
cd datasets_preprocess
python prepare_kitti.py
cd ..
```

### NYU-v2
To download and prepare the **NYU-v2** dataset, execute:
```bash
cd data
bash download_nyuv2.sh
cd ..

# prepare the dataset for depth evaluation
cd datasets_preprocess
python prepare_nyuv2.py
cd ..
```

### TUM-dynamics
To download and prepare the **TUM-dynamics** dataset, execute:
```bash
cd data
bash download_tum.sh
cd ..

# prepare the dataset for pose evaluation
cd datasets_preprocess
python prepare_tum.py
cd ..
```

### ScanNet
To download and prepare the **ScanNet** dataset, execute:
```bash
cd data
bash download_scannetv2.sh
cd ..

# prepare the dataset for pose evaluation
cd datasets_preprocess
python prepare_scannet.py
cd ..
```

### DAVIS
To download and prepare the **DAVIS** dataset, execute:
```bash
cd data
python download_davis.py
cd ..
```
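
The per-dataset steps above follow one pattern: run a download script inside `data/`, then, for most datasets, a preparation script inside `datasets_preprocess/`. As a minimal sketch, that pattern can be written as one table in Python so the commands for any dataset are generated from a single place. The script names are the ones listed above; the `DATASET_STEPS` table and the `preparation_commands` helper are hypothetical names introduced for illustration and are not part of the repository. The optional Sintel dynamic-mask step is left out.

```python
# Hypothetical helper: script names are copied from the sections above,
# the table layout and function name are illustrative assumptions.
DATASET_STEPS = {
    # dataset: (download command run in data/, prepare script in datasets_preprocess/ or None)
    "sintel": (["bash", "download_sintel.sh"], None),
    "bonn": (["bash", "download_bonn.sh"], "prepare_bonn.py"),
    "kitti": (["bash", "download_kitti.sh"], "prepare_kitti.py"),
    "nyu-v2": (["bash", "download_nyuv2.sh"], "prepare_nyuv2.py"),
    "tum-dynamics": (["bash", "download_tum.sh"], "prepare_tum.py"),
    "scannet": (["bash", "download_scannetv2.sh"], "prepare_scannet.py"),
    "davis": (["python", "download_davis.py"], None),
}


def preparation_commands(dataset):
    """Return a list of (working_directory, argv) pairs for one dataset."""
    download, prepare = DATASET_STEPS[dataset]
    commands = [("data", download)]
    if prepare is not None:
        commands.append(("datasets_preprocess", ["python", prepare]))
    return commands


if __name__ == "__main__":
    # Print the Bonn steps in a copy-pasteable form.
    for cwd, argv in preparation_commands("bonn"):
        print(f"(cd {cwd} && {' '.join(argv)})")
```

To execute the steps rather than print them, each pair could be passed to `subprocess.run(argv, cwd=cwd, check=True)` from the repository root.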

## Evaluation Script (Video Depth)

### Sintel

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=sintel --output_dir="results/sintel_video_depth" --full_seq
```

The results will be saved in the `results/sintel_video_depth` folder. You can then run the corresponding code block in [depth_metric.ipynb](../depth_metric.ipynb) to evaluate them.

### Bonn

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=bonn --output_dir="results/bonn_video_depth"
```

The results will be saved in the `results/bonn_video_depth` folder. You can then run the corresponding code block in [depth_metric.ipynb](../depth_metric.ipynb) to evaluate them.

### KITTI

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=kitti --output_dir="results/kitti_video_depth"
```

The results will be saved in the `results/kitti_video_depth` folder. You can then run the corresponding code block in [depth_metric.ipynb](../depth_metric.ipynb) to evaluate them.
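
The notebook itself is not shown here, so its exact procedure (for example, any scale or shift alignment between prediction and ground truth) is unknown from this document. As a rough illustration only, two metrics commonly reported for depth evaluation are the absolute relative error and the δ < 1.25 inlier ratio. The sketch below computes both in plain Python. The function name `depth_metrics` and the choice of these two metrics are assumptions, not taken from `depth_metric.ipynb`.

```python
def depth_metrics(pred, gt):
    """Compute Abs Rel and the delta < 1.25 ratio over paired depth values.

    pred, gt: equal-length sequences of depths. Pairs whose ground truth
    is not positive are treated as invalid and skipped. This is an
    illustrative assumption, not the notebook's implementation.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)
    # A non-positive prediction can never be an inlier.
    inliers = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < 1.25)
    return {"abs_rel": abs_rel, "delta1": inliers / len(pairs)}


if __name__ == "__main__":
    print(depth_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Real evaluations usually operate on full depth maps with validity masks; the flat sequences here only keep the arithmetic visible.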

## Evaluation Script (Camera Pose)

### Sintel

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=sintel --output_dir="results/sintel_pose"
# To use the ground truth dynamic mask, add: --use_gt_mask
```

The evaluation results will be saved in `results/sintel_pose/_error_log.txt`.

### TUM-dynamics

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=tum --output_dir="results/tum_pose"
```

The evaluation results will be saved in `results/tum_pose/_error_log.txt`.

### ScanNet

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=scannet --output_dir="results/scannet_pose"
```

The evaluation results will be saved in `results/scannet_pose/_error_log.txt`.
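
Each pose run above writes its summary to `_error_log.txt` inside its output directory. The format of that file is not described here, so the following sketch makes no attempt to parse it. It only gathers the raw text of every log found under `results/` so that runs can be compared side by side. The helper name `collect_error_logs` is hypothetical.

```python
from pathlib import Path


def collect_error_logs(results_root="results"):
    """Map each run directory name to the raw text of its _error_log.txt."""
    logs = {}
    for log_path in sorted(Path(results_root).glob("*/_error_log.txt")):
        logs[log_path.parent.name] = log_path.read_text()
    return logs


if __name__ == "__main__":
    # Print every log found under results/, one run after another.
    for run_name, text in collect_error_logs().items():
        print(f"== {run_name} ==")
        print(text.strip())
```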

## Evaluation Script (Single-Frame Depth)

### NYU-v2

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_depth \
--pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
--eval_dataset=nyu --output_dir="results/nyuv2_depth"
```

The results will be saved in the `results/nyuv2_depth` folder. You can then run the corresponding code block in [depth_metric.ipynb](../depth_metric.ipynb) to evaluate them.
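
All of the commands in this file share the same shape and differ only in `--mode`, `--eval_dataset`, `--output_dir`, and the optional `--full_seq` and `--use_gt_mask` flags. As a sketch, the helper below assembles the argument list from those pieces, using only the flags and values that appear above. The function name `build_eval_command` and its keyword arguments are hypothetical.

```python
import shlex

# Checkpoint path as used in every command above.
CHECKPOINT = "checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth"


def build_eval_command(dataset, output_dir, mode="eval_pose",
                       full_seq=False, use_gt_mask=False, master_port=29604):
    """Assemble the torchrun argument list used throughout this file."""
    argv = [
        "torchrun", "--nproc_per_node=1", f"--master_port={master_port}",
        "launch.py", f"--mode={mode}",
        f"--pretrained={CHECKPOINT}",
        f"--eval_dataset={dataset}",
        f"--output_dir={output_dir}",
    ]
    if full_seq:
        argv.append("--full_seq")
    if use_gt_mask:
        argv.append("--use_gt_mask")
    return argv


if __name__ == "__main__":
    # Reproduce the Sintel video-depth command as a single shell line.
    print(shlex.join(build_eval_command(
        "sintel", "results/sintel_video_depth", full_seq=True)))
```

To launch a run, the list could be passed to `subprocess.run(argv, check=True)` with `CUDA_VISIBLE_DEVICES` set in the environment, as in the commands above.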
5 changes: 3 additions & 2 deletions data/prepare_training.md
@@ -3,7 +3,8 @@

We provide scripts to prepare datasets for training, including **PointOdyssey**, **TartanAir**, **Spring**, and **Waymo**. For evaluation, we also provide a script for preparing the **Sintel** dataset.

*Please ensure you have obtained the necessary licenses from the original dataset providers before proceeding.*
> [!NOTE]
> The scripts provided here are for reference only. Please ensure you have obtained the necessary licenses from the original dataset providers before proceeding.
## Download Pre-Trained Models
To download the pre-trained models, run the following commands:
@@ -69,4 +70,4 @@ To download and prepare the **Sintel** dataset for evaluation, execute:
cd data
bash download_sintel.sh
cd ..
```
69 changes: 0 additions & 69 deletions datasets_preprocess/bonn.ipynb

This file was deleted.
