Conversation

@SBCV
Contributor

@SBCV SBCV commented Sep 7, 2020

Fix Colmap importer intrinsic conversion and remove redundant depth map conversion code

  inline void
  depthmap_convert_conventions (typename Image<T>::Ptr dm,
-     math::Matrix3f const& invproj, bool to_mve)
+     math::Matrix3f const& invproj, bool to_mve, bool add_pixel_offset)
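For context, a minimal sketch of what the added flag presumably toggles inside the conversion loop (placeholder code based on the snippet quoted later in this thread, not the actual patch):

  float const offset = add_pixel_offset ? 0.5f : 0.0f;
  math::Vec3f px((float)x + offset, (float)y + offset, 1.0f);
  px = invproj * px;
  double const len = px.norm();
  dm->at(pos) *= (to_mve ? len : 1.0 / len);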
Owner

I'm curious, why did you add this? This pixel offset is always necessary when converting from image coordinates (where the center of the top left pixel is (0,0)) to 3D image plane coordinates (where the center of the top left pixel is (0.5,0.5)).
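(Concretely: pixel index (x, y) maps to the image-plane point (x + 0.5, y + 0.5), so the center of the top-left pixel (0, 0) lands at (0.5, 0.5).)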

Contributor Author

From the following comment in colmap/blob/dev/src/base/reconstruction.cc:

  // Bundler output assumes image coordinate system origin
  // in the lower left corner of the image with the center of
  // the lower left pixel being (0, 0). Our coordinate system
  // starts in the upper left corner with the center of the
  // upper left pixel being (0.5, 0.5).

I understand that Colmap's image coordinate system also starts at (0.5, 0.5). I ran some experiments a few weeks ago, which "confirmed" that.

Contributor Author

Quick question: Does MVE offer some way to associate something like view_0006.mve with the original image name? Since the images are also reordered, it is not easy to match the results with the original images.

Owner

@simonfuhrmann simonfuhrmann Sep 9, 2020

Why is it relevant what convention Colmap uses internally for images? You're inputting images into MVE, and the pixels of that depth image correspond to the pixels of the color image. Now you're treating these images with MVE's convention to convert to a depth map in MVE's convention.

Unless you're telling me that Colmap's depth map is shifted by 0.5 pixels compared to the color image, no change should be necessary here.

Each MVE view has a meta.ini file that has a name key. Typically you put the original image name (such as IMG_123.JPG) into that key. Does that help?
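For illustration, a view's meta.ini might look roughly like this (a sketch; only the name key is taken from the comment above, the remaining fields are from memory and may differ):

  [view]
  id = 6
  name = IMG_123.JPG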

Contributor Author

The Colmap fork I'm working with provides some matrices that transform the reparameterized depth maps back into world coordinates. In the following, I'll use the corresponding point clouds as reference (shown in red).

Here is an overlay of the reference point cloud and the point cloud generated with offset=0.0 (green). The reason it does not match completely is that the computations in the reparameterized space are numerically more stable. But one can see that the differences (the noise) appear to be random and non-systematic.

[screenshot: no_offset]

Here is an overlay of the reference point cloud and the point cloud generated with offset=0.5 (green):

[screenshot: with_offset]

One can clearly see that there is a systematic difference between the point clouds.

I think that makes sense, or am I missing something here?

Contributor Author

But hold on a second. The offset we're talking about here is solely used to compute the scaling factor for converting depth values. I would assume that this offset actually has a negligible effect on the depth.

Why? With this offset the position of the 3D point changes. And MVE uses the distance to the point as depth, so the value will be different. No?

This referred to the scaling value.

Owner

In this code

  math::Vec3f px((float)x + 0.5f, (float)y + 0.5f, 1.0f);  // pixel center on the image plane
  px = invproj * px;                                       // back-project with the inverse intrinsics
  double len = px.norm();                                  // distance from camera center to that plane point
  dm->at(pos) *= (to_mve ? len : 1.0 / len);               // convert depth-along-ray <-> distance

the position of px does change with the offset, but it's really only used to scale the depth value (px.norm()), i.e., it changes the final point cloud position in the viewing direction, not orthogonal to it. Have you tried adding/subtracting 0.5 from the principal point before normalizing it (as per my previous comment)? Maybe this fixes the issue you're seeing?

Owner

This is to say, if the 0.5 offset were the problem, there would be more places you'd have to fix.
Most notably this one.

Contributor Author

> Have you tried adding/subtracting 0.5 from the principal point before normalizing it (as per my previous comment)? Maybe this fixes the issue you're seeing?

No, but this could / should have the same effect.

> This is to say, if the 0.5 offset were the problem, there would be more places you'd have to fix.

I'm afraid that's the case.

Oh boy this importer is killing me ... :D

Owner

I feel you man. These two things are among the hardest in human history: Converting between conventions, and rotations. Literally the hardest thing in the world is converting rotation conventions.

@SBCV
Contributor Author

SBCV commented Sep 9, 2020

Offset reverted and commit squashed.

@simonfuhrmann simonfuhrmann merged commit 4aca267 into simonfuhrmann:master Sep 9, 2020
@SBCV SBCV deleted the fix_importer_intrinsics branch September 9, 2020 20:46
@SBCV
Contributor Author

SBCV commented Sep 10, 2020

I've conducted some more experiments, and I think you are right with

> I'm not sure that's the problem we're seeing here.

The previous results were created with a Python script with which I already had to convert the depth maps to world points.
When using MVE's scene2pset with a single id to create the world points corresponding to the depth map of that id, I cannot recreate this effect. (Adjusting the principal point (with -0.5f) actually produced worse results.)

In addition, I ran some texturing experiments with the Sceaux dataset that showed that adjusting the principal point (with -0.5f) leads to a mesh with untextured faces.

Thus, I think the master branch is already in good shape.

Sorry for the noise and thank you for merging.

@simonfuhrmann
Owner

No worries, I'm glad it works. Just to reiterate on the principal point, though: when you subtract 0.5 from it, it has to happen before normalization. For example, this

camera_info.ppoint[0] = params[2] / width;

will become

camera_info.ppoint[0] = (params[2] - 0.5) / width;
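By the same logic, the y component (its unmodified form, camera_info.ppoint[1] = params[3] / height;, appears further down in this thread) would presumably become

camera_info.ppoint[1] = (params[3] - 0.5) / height;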

@SBCV
Contributor Author

SBCV commented Sep 11, 2020

> The previous results were created with a Python script with which I already had to convert the depth maps to world points.
> When using MVE's scene2pset with a single id to create the world points corresponding to the depth map of that id, I cannot recreate this effect. (Adjusting the principal point (with -0.5f) actually produced worse results.)

I left out a lot of information in the paragraph above to keep it short, but reading it again makes me realize that it is probably misleading. Here is some more background information so you can fully understand the previous/following images:
The Colmap fork for the reconstruction of satellite images

  • contains camera models with skew parameters
  • produces depth maps in reparameterized space (the values are way smaller than the actual depth map values)

Let us denote such a depth map as depthmap_{sr}, i.e., the depth maps are skewed and contain reparameterized depth values. In order to use the depth maps, we need to convert depthmap_{sr} to depth maps with real-world depth values, in the following denoted as depthmap_{s} (the depth map is still skewed). The following result (already shown in a post 2 days ago)

[screenshot]

is the point cloud in world space computed from depthmap_{s}. Reminder: the red point cloud is a reference point cloud directly computed with the Colmap fork.

In order to use the depth maps in MVE, we further need to remove the skew factor from the depth maps depthmap_{s} (and of course we also need to remove the skew from the corresponding camera models and images). Let us denote such a skew-free depth map as depthmap_{MVE}.
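To summarize this conversion chain in pseudocode (a sketch only; the type and all function names below are hypothetical placeholders, not real APIs):

  // Sketch; Depthmap and the functions are placeholders.
  Depthmap dm_sr = load_fork_depthmap(path);                  // depthmap_{sr}: skewed, reparameterized
  Depthmap dm_s = reparameterized_to_depth(dm_sr, matrices);  // depthmap_{s}: skewed, real-world depths
  Depthmap dm_mve = remove_skew(dm_s, camera);                // depthmap_{MVE}: skew-free, usable by MVE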

Now, regarding

> For example, this
> camera_info.ppoint[0] = params[2] / width;
> will become
> camera_info.ppoint[0] = (params[2] - 0.5) / width;

Here are some results with the point cloud generated by scene2pset (using such a skew-free depth map depthmap_{MVE}) and the "reference point cloud". Please ignore the pattern in the following results; I think that is some artifact of the skew removal - the maps must be warped at this step. I need to investigate this further.

Result without principal point adjustment (camera_info.ppoint[0] = params[2] / width;). As you can see, sometimes the reference is in front and sometimes behind.

[screenshot: no_pp_offset]

Result with subtraction (camera_info.ppoint[0] = (params[2] - 0.5) / width;). As you can see, the reference is largely in the background.

[screenshot: negative_pp_offset]

Result with addition (camera_info.ppoint[0] = (params[2] + 0.5) / width;). As you can see, the reference is largely in the foreground.

[screenshot: positive_pp_offset]

Thus, camera_info.ppoint[0] = (params[2] - 0.5) / width; as well as camera_info.ppoint[0] = (params[2] + 0.5) / width; introduce an additional shift, which leads to worse results.

> In addition, I ran some texturing experiments with the Sceaux dataset that showed that adjusting the principal point (with -0.5f) leads to a mesh with untextured faces.

Here is a result with
camera_info.ppoint[0] = params[2] / width;

[screenshot: no_pp_offset]

and a result with
camera_info.ppoint[0] = (params[2] - 0.5) / width;

[screenshot: with_pp_offset]

As you can see, the result with the offset introduces untextured (white/grey) faces, which further supports that camera_info.ppoint[0] = params[2] / width; (and camera_info.ppoint[1] = params[3] / height;) is correct.

Looking at these results, I think we should not touch the principal point.

@SBCV
Contributor Author

SBCV commented Sep 11, 2020

So, by using scene2pset with depthmap_{MVE}, I could not reproduce the effect shown in the comparison using depthmap_{s}. (But maybe the reason for this is that the warping artifacts have a stronger effect than the offset in math::Vec3f px((float)x + 0.5f, (float)y + 0.5f, 1.0f).) So potentially it could still be there. But like you pointed out earlier,

> Unless you're telling me that Colmap's depth map is shifted by 0.5 pixels compared to the color image, no change should be necessary here.

that would mean that there is a shift between the image and the depth map, which sounds really unreasonable to me.
Therefore, I thought we should keep the master branch as it is.

However, looking again at the results with the additional offset in math::Vec3f px((float)x + 0.5f, (float)y + 0.5f, 1.0f) just makes me realize how perfectly (!) the point clouds match with this additional shift.

(see results below - copied from a previous post)

[screenshot]

and

[screenshot]

But then again, that would mean that there is this (non-intuitive) shift ...
