
Custom dataset training #11

Closed
VillardX opened this issue Mar 29, 2024 · 1 comment

@VillardX

Hi, thanks for the great work. I have some questions about custom data training.

In the paper, RE10K training takes only 2 context-view RGB images, with their corresponding intrinsics and extrinsics, as input, and outputs a novel-view RGB image.

  1. About znear and zfar: in "dataset_re10k.py", they are set to 1 and 100. Should znear and zfar be modified if I train on my custom dataset? What do 1 and 100 mean? Meters?
  2. About the extrinsics and intrinsics: according to pixelSplat, "Our extrinsics are OpenCV-style camera-to-world matrices. This means that +Z is the camera look vector, +X is the camera right vector, and -Y is the camera up vector. Our intrinsics are normalized, meaning that the first row is divided by image width, and the second row is divided by image height."
    I don't know what the unit of the translation (T) vector of the extrinsic is; is it in meters? Also, according to your "dataset_re10k.py", the extrinsic in the raw data is "w2c", and you return "w2c.inverse()" as c2w in the function "convert_poses()". Is my understanding correct?
  3. The number of context views is 3 in my custom dataset, while in the paper the model is trained with 2 context views. Where can I modify this?
    By the way, the paper uses an MVS cost volume, but the model is mainly trained in a 2-input-view setting. Did you try training with a multiple-input-view setting?
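As a sanity check on the conventions quoted in question 2, here is a minimal NumPy sketch of the w2c-to-c2w inversion and intrinsics normalization. The function name `convert_pose` is hypothetical; the repo's own `convert_poses()` works on batched, flattened pose tensors, but the underlying math is the same:

```python
import numpy as np

def convert_pose(w2c: np.ndarray, k: np.ndarray,
                 width: int, height: int) -> tuple[np.ndarray, np.ndarray]:
    """Invert a world-to-camera extrinsic into an OpenCV-style
    camera-to-world matrix, and normalize pixel-space intrinsics.

    w2c: (4, 4) world-to-camera matrix (as stored in the raw RE10K data).
    k:   (3, 3) intrinsics in pixel units.
    """
    c2w = np.linalg.inv(w2c)   # mirrors the w2c.inverse() call in dataset_re10k.py
    k_norm = k.astype(float).copy()
    k_norm[0] /= width         # first row divided by image width
    k_norm[1] /= height        # second row divided by image height
    return c2w, k_norm
```

With an identity rotation and a translation of +5 along Z, the resulting c2w translation is -5, and a focal length of 500 px on a 640-px-wide image normalizes to 500/640.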
@donydchen donydchen self-assigned this Mar 29, 2024
@donydchen
Copy link
Owner

Hi @VillardX, thanks for your interest in our work.

We empirically set (near, far) to (1, 100), following our previous work MuRF (see the implementation HERE). If I remember correctly, these two values have no strict physical meaning; we just warped the images and found that they fit. Indeed, they need to be set to other values if you work on other datasets. For example, you can set them according to the COLMAP data if you have it; more references can be found HERE. Or, if you do not have COLMAP data, you can follow our approach of warping the input images to decide, see #4 (comment).
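If COLMAP data is available, one way to derive (near, far) is from the camera-space depths of the sparse points, as mentioned above. The helper below is a sketch under that assumption (the name `near_far_from_points` and the percentile/margin values are illustrative choices, not from the repo):

```python
import numpy as np

def near_far_from_points(points_world: np.ndarray, w2c: np.ndarray,
                         lo_pct: float = 0.1, hi_pct: float = 99.9,
                         margin: float = 1.2) -> tuple[float, float]:
    """Estimate (near, far) for one view from sparse 3D points.

    points_world: (N, 3) reconstructed points, e.g. from COLMAP.
    w2c:          (4, 4) world-to-camera matrix.
    Depth is taken along +Z (OpenCV convention); percentiles guard
    against outliers, and `margin` pads the range on both ends.
    """
    pts_h = np.concatenate([points_world,
                            np.ones((len(points_world), 1))], axis=1)
    z = (pts_h @ w2c.T)[:, 2]   # camera-space depth of each point
    z = z[z > 0]                # keep only points in front of the camera
    near = np.percentile(z, lo_pct) / margin
    far = np.percentile(z, hi_pct) * margin
    return float(near), float(far)
```

In practice you would take the min/max of these per-view ranges over all context views of a scene.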


I am not sure whether T is in meters (I guess it is a relative value, since the poses are reconstructed, not real ground truth). You may refer to the RE10K homepage for more details. Your understanding is correct: the raw data is 'w2c'.
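Since the translations are reconstructed up to an unknown global scale, a common workaround (not something the repo necessarily does) is to rescale each scene so that the baseline between the first two context cameras is 1; then a single (near, far) pair is at least comparable across scenes. A minimal sketch, assuming (V, 4, 4) camera-to-world matrices:

```python
import numpy as np

def normalize_baseline(c2ws: np.ndarray) -> np.ndarray:
    """Rescale camera-to-world poses so the first two cameras sit a
    unit distance apart. Rotations are unchanged; only the scene's
    global scale (translations) is affected.

    c2ws: (V, 4, 4) camera-to-world matrices. Returns a rescaled copy.
    """
    baseline = np.linalg.norm(c2ws[0, :3, 3] - c2ws[1, :3, 3])
    out = c2ws.copy()
    out[:, :3, 3] /= baseline   # divide all translations by the baseline
    return out
```

Any depths or 3D points used alongside these poses would need to be divided by the same baseline to stay consistent.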


For more information about how to train and test with more views, kindly refer to #4.
