Viser's perspective projection #254

Open
smart4654154 opened this issue Jul 28, 2024 · 10 comments

@smart4654154

Viser's perspective projection
I guess Viser uses perspective projection, then converts to orthographic projection and performs rendering. If an object is inside the frustum, it is rendered; if it is outside the frustum, it is not rendered.
image

Based on this guess, I used CameraHandle.position, CameraHandle.fov, and CameraHandle.aspect.
Then I set up a near plane and a far plane. I used these five parameters to create a frustum (the red part).
image

Now I am using Nerfstudio's gsplat algorithm for training and have obtained many ckpt files. I generate a bounding box for each ckpt. If the frustum intersects the bounding box, I read the ckpt. If the frustum does not intersect the bounding box, I do not read the ckpt.
Now I have discovered a bug:
I use nerfview (https://github.com/hangg7/nerfview).
When I rotate with the left or right arrow key on the keyboard, some data may be missing from the scene, because the bounding boxes of some data do not intersect the frustum and so that data is not rendered.

image

I drew the four corner points A, B, C, and D of the far plane in the browser.

image

Theoretically, points ABCD are located at the four corners of the screen.
image

But when I use the left arrow or right arrow on the keyboard, there are only 2 dots on the screen. Points B and D are outside.

image

My guess is that when I use the left or right arrow key on the keyboard, the frustum rotates but something else does not rotate with it, causing misalignment, so some data is not displayed.
image

I tried my best to solve the bug, but I failed. Perhaps because I didn't know some key information about Viser.
Can you help me? Thank you very much.

@brentyi
Collaborator

brentyi commented Jul 30, 2024

Hello!

Yeah, unfortunately this just seems like a bug in the frustum math on your end. Maybe double-check that you're using the camera rotation correctly when you compute the frustum bounds? From what you've drawn it looks like the frustum you've computed is just not rotating correctly with the camera.

@smart4654154
Author

You are right, thank you.
When I use the left or right arrow key on the keyboard, the view rotates but the frustum does not rotate with it, causing misalignment, so some data is not displayed.
This means that during rotation the frustum keeps its original orientation and does not move.
image

So the fundamental reason is that the five parameters I used (CameraHandle.position, CameraHandle.fov, CameraHandle.aspect, a near plane, and a far plane) are not enough.
These five parameters are consistent with frustums pointing in many different directions.
image

I should consider rotation. Do you know how to obtain this parameter in Viser?
Thank you very much.

@brentyi
Collaborator

brentyi commented Jul 30, 2024

Makes sense! Yeah, the rotation is stored as a quaternion here:
https://viser.studio/latest/camera_handles/#viser.CameraHandle.wxyz
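
As a rough illustration of how the quaternion could be used, here is a minimal NumPy sketch that rotates camera-frame far-plane corners into the world frame. It assumes that wxyz is a unit quaternion in (w, x, y, z) order describing the camera's orientation in the world frame, that the camera frame uses +Z forward, +X right, +Y down, and that fov is the vertical field of view in radians. The names rotate_by_wxyz and far_plane_corners_world are hypothetical helpers, not part of Viser.

```python
import numpy as np


def rotate_by_wxyz(wxyz, vectors):
    """Rotate (N, 3) vectors by a quaternion given in (w, x, y, z) order."""
    q = np.asarray(wxyz, dtype=float)
    q = q / np.linalg.norm(q)  # guard against slightly non-unit input
    w, u = q[0], q[1:]
    v = np.asarray(vectors, dtype=float)
    # Standard identity: v' = v + w * t + u x t, with t = 2 * (u x v)
    t = 2.0 * np.cross(u, v)
    return v + w * t + np.cross(u, t)


def far_plane_corners_world(wxyz, position, fov_y, aspect, far):
    """Return the four far-plane corners in world coordinates.

    Assumes +Z forward, +X right, +Y down in the camera frame, and a
    vertical field of view `fov_y` in radians.
    """
    half_h = np.tan(fov_y / 2.0) * far
    half_w = half_h * aspect
    corners_cam = np.array([
        [-half_w, -half_h, far],
        [half_w, -half_h, far],
        [half_w, half_h, far],
        [-half_w, half_h, far],
    ])
    # Camera-to-world: rotate first, then translate by the camera position.
    return rotate_by_wxyz(wxyz, corners_cam) + np.asarray(position, dtype=float)
```

Because the quaternion encodes the full orientation, including roll about the viewing axis, no separate up vector or rotation angle is needed in this formulation.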

@smart4654154
Author

Thank you very much for your answer, but I am unable to use quaternions appropriately.
This is how I get the coordinates of the frustum:
1. To simplify the problem, I use a rectangular pyramid instead of a frustum.
2. Draw an upward-pointing rectangular pyramid with its apex located at camera.position.
3. Compute a look-at vector = camera.look_at - camera.position.
4. Calculate the rotation matrix that takes (0,0,1) to the look-at vector.
5. Apply the rotation matrix to the four base vertices of the rectangular pyramid. The look-at vector becomes the axis of the rectangular pyramid.
image
The problem is that countless rectangular pyramids share the same axis, and I need a rotation angle about that axis to determine a unique rectangular pyramid.
image

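My code:
import numpy as np

def rotation_matrix_from_vectors(vec1, vec2):
    """Find the rotation matrix that aligns vec1 to vec2.
    :param vec1: A 3d "source" vector
    :param vec2: A 3d "destination" vector
    :return mat: A transformation matrix that aligns vec1 with vec2
    """
    a, b = (vec1 / np.linalg.norm(vec1)).reshape(3), (vec2 / np.linalg.norm(vec2)).reshape(3)
    v = np.cross(a, b)
    c = np.dot(a, b)
    s = np.linalg.norm(v)
    I = np.identity(3)
    if np.isclose(s, 0.0):  # the vectors are parallel or anti-parallel
        if c > 0:
            return I
        # Anti-parallel: rotate 180 degrees about any axis perpendicular to a.
        # (-I has determinant -1, so it is not a rotation.)
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-8:  # a lies along x, so use y instead
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - I
    # Rodrigues' formula, using the skew-symmetric cross-product matrix of v
    Vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    R = I + Vx + np.dot(Vx, Vx) * ((1 - c) / (s ** 2))
    return R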

def draw_frustum(x_center, y_center, z_center, fov, aspect_ratio, far, direction, wxyz=None):
    # fov is treated as the vertical field of view in radians.
    # wxyz is accepted but not used yet.
    fov_rad = fov
    camera_position = np.array([x_center, y_center, z_center], dtype=float)

    far_height = 2 * np.tan(fov_rad / 2) * far
    far_width = far_height * aspect_ratio

    # Far plane vertices relative to the apex, with the pyramid pointing along +Z
    far_plane = np.array([
        [-far_width / 2, -far_height / 2, far],
        [far_width / 2, -far_height / 2, far],
        [far_width / 2, far_height / 2, far],
        [-far_width / 2, far_height / 2, far]
    ])

    # Rotation to align the pyramid with the direction vector
    direction = np.array(direction, dtype=float)
    forward = np.array([0.0, 0.0, 1.0])  # Initial forward direction
    R = rotation_matrix_from_vectors(forward, direction)

    # Rotate each vertex about the apex, then translate to the camera position.
    # This fixes only the axis; the roll about `direction` is still arbitrary.
    far_plane = far_plane @ R.T + camera_position

    # Apex followed by the four far-plane corners
    frustum = np.vstack((camera_position, far_plane))
    return frustum

How do I use quaternions correctly? Or should I choose a different way of drawing a Rectangular Pyramid? Or does Viser have a parameter that provides me with the rotation angle for the fifth step?
Thank you very much.

@brentyi
Collaborator

brentyi commented Aug 1, 2024

Okay interesting!

Instead of trying to rotate the frustum into the world frame, for me it's easier to visualize leaving the frustum at the camera frame origin but then transform the Gaussians so they're defined with respect to the camera. My high-level recommendation would be to:

  • Ignore the look_at + up_direction attributes of the camera. Using wxyz is probably easier.
  • Convert camera.wxyz and camera.position to a single rigid transform. T_cam_world = vtf.SE3.from_rotation_and_translation(vtf.SO3(wxyz), position).inverse() can give you the necessary transformation to put world-frame coordinates into the camera frame. 1
  • Use T_cam_world to compute camera-frame Gaussian centers from world-frame Gaussian centers. pseudocode: centers_wrt_cam = T_cam_world @ centers_wrt_world
  • Define frustum using fov parameters, with +Z forward, +X right, +Y down. Check whether the points are in the frustum.

Footnotes

  1. note that wxyz and position for the camera both correspond to T_world_cam, so we need to invert. https://viser.studio/latest/conventions/#poses is also relevant.
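
The steps above can be sketched in plain NumPy as follows. This is not the vtf.SE3 call quoted in the list but a stand-in that avoids any dependency. It assumes wxyz is a unit quaternion in (w, x, y, z) order, that fov is the vertical field of view in radians, and that aspect is width divided by height. The names wxyz_to_matrix, world_to_camera, and in_frustum are hypothetical helpers.

```python
import numpy as np


def wxyz_to_matrix(wxyz):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    w, x, y, z = np.asarray(wxyz, dtype=float) / np.linalg.norm(wxyz)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def world_to_camera(points_world, wxyz, position):
    """Apply the inverse of T_world_cam to (N, 3) world-frame points."""
    R_world_cam = wxyz_to_matrix(wxyz)
    offsets = np.asarray(points_world, dtype=float) - np.asarray(position, dtype=float)
    # p_cam = R^T (p_world - t); with row vectors this is (p_world - t) @ R.
    return offsets @ R_world_cam


def in_frustum(points_cam, fov_y, aspect, near, far):
    """Boolean mask of camera-frame points inside the frustum.

    Assumes +Z forward, +X right, +Y down in the camera frame.
    """
    x, y, z = np.asarray(points_cam, dtype=float).T
    half_h = np.tan(fov_y / 2.0) * z  # half-height of the frustum at depth z
    half_w = half_h * aspect
    return (z >= near) & (z <= far) & (np.abs(x) <= half_w) & (np.abs(y) <= half_h)
```

A typical call would be in_frustum(world_to_camera(centers, camera.wxyz, camera.position), camera.fov, camera.aspect, near, far), where near and far are values chosen by the caller.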

@smart4654154
Author

thank you
Are you saying that I transform the position of a Gaussian from the world coordinate system to the camera coordinate system? Then determine whether the Gaussian point of the camera coordinate system is inside or outside the frustum of the camera coordinate system?
So the question is how do I get a precise location of the frustum in the camera coordinate system, like in this picture. The Gaussian point is in the black frustum but not in the red frustum. I still need rotation.
image

Or is there no rotation in the frustum of the camera coordinate system? Like in this picture.
image
I can directly write the coordinates, like this:
far_plane = np.array([
    [-far_width / 2, -far_height / 2, far],
    [far_width / 2, -far_height / 2, far],
    [far_width / 2, far_height / 2, far],
    [-far_width / 2, far_height / 2, far]
])
Thank you.

@smart4654154
Author

I found that the photos added by add_camera_frustum in ns-viewer seem to overlap with the results rendered by splatfacto. Perhaps the way the frustum is drawn in add_camera_frustum would be helpful to me.
image
image

@brentyi
Collaborator

brentyi commented Aug 3, 2024

Are you saying that I transform the position of a Gaussian from the world coordinate system to the camera coordinate system? Then determine whether the Gaussian point of the camera coordinate system is inside or outside the frustum of the camera coordinate system?

Yes!

So the question is how do I get a precise location of the frustum in the camera coordinate system, like in this picture. The Gaussian point is in the black frustum but not in the red frustum. I still need rotation.

The camera frustum in the camera frame should have no rotation applied to it, since it's rigidly attached to the camera frame. So I think this should actually be easier than you'd expect.

Here's an approach that looks correct to me:
image

@brentyi
Collaborator

brentyi commented Aug 3, 2024

This also seems correct with a more common (in computer vision) intrinsics matrix:
image
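
For readers without the screenshot, one way an intrinsics-based visibility check might look is sketched below. It is not a transcription of the image. It assumes a pinhole camera with square pixels, the principal point at the image center, a vertical field of view in radians, and points already expressed in the camera frame (+Z forward). The names intrinsics_from_fov and visible_mask, and the default near and far values, are hypothetical.

```python
import numpy as np


def intrinsics_from_fov(fov_y, width, height):
    """Build a 3x3 pinhole intrinsics matrix from a vertical field of view."""
    fy = 0.5 * height / np.tan(0.5 * fov_y)
    fx = fy  # assumption: square pixels
    return np.array([
        [fx, 0.0, width / 2.0],
        [0.0, fy, height / 2.0],
        [0.0, 0.0, 1.0],
    ])


def visible_mask(points_cam, K, width, height, near=0.01, far=100.0):
    """Boolean mask of camera-frame points that project inside the image."""
    points_cam = np.asarray(points_cam, dtype=float)
    z = points_cam[:, 2]
    in_depth = (z > near) & (z < far)
    # Avoid dividing by zero or negative depth; those points are masked out anyway.
    safe_z = np.where(in_depth, z, 1.0)
    uvw = points_cam @ K.T
    u = uvw[:, 0] / safe_z
    v = uvw[:, 1] / safe_z
    return in_depth & (u >= 0) & (u < width) & (v >= 0) & (v < height)
```

With square pixels this gives the same answer as comparing |x| and |y| against z times the tangent of the half field of view, since the image bounds are the frustum side planes expressed in pixels.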

@smart4654154
Author

This also seems correct with a more common (in computer vision) intrinsics matrix: image

That's right, this method can determine whether a point is in the frustum, and it is the one used in the 3DGS code.
image

ChatGPT is very clever. I found the missing parameter (the up direction), which allows me to determine a unique frustum.
image
Code:
import numpy as np

def calculate_frustum_corners(eye, direction, up, fov, aspect_ratio, near, far):
    # Camera position
    eye = np.array(eye, dtype=float)
    # Normalized viewing direction
    direction = np.array(direction, dtype=float) / np.linalg.norm(direction)
    # Normalized approximate up vector
    up = np.array(up, dtype=float) / np.linalg.norm(up)
    # Right vector, perpendicular to both the viewing direction and up
    right = np.cross(direction, up)
    right = right / np.linalg.norm(right)
    # Recompute up so that right, up, and direction are mutually orthogonal
    up = np.cross(right, direction)

    # fov is expected in degrees here; a value already in radians
    # (the earlier draw_frustum treats fov as radians) must be converted first.
    tan_fov = np.tan(np.radians(fov / 2))
    near_height = 2 * tan_fov * near
    near_width = near_height * aspect_ratio
    far_height = 2 * tan_fov * far
    far_width = far_height * aspect_ratio

    # Centers of the near and far planes along the viewing direction
    near_center = eye + direction * near
    far_center = eye + direction * far

    # Offset each plane center by half its height and width
    corners = {}
    corners['Near Top Left'] = near_center + (up * (near_height / 2)) - (right * (near_width / 2))
    corners['Near Top Right'] = near_center + (up * (near_height / 2)) + (right * (near_width / 2))
    corners['Near Bottom Left'] = near_center - (up * (near_height / 2)) - (right * (near_width / 2))
    corners['Near Bottom Right'] = near_center - (up * (near_height / 2)) + (right * (near_width / 2))
    corners['Far Top Left'] = far_center + (up * (far_height / 2)) - (right * (far_width / 2))
    corners['Far Top Right'] = far_center + (up * (far_height / 2)) + (right * (far_width / 2))
    corners['Far Bottom Left'] = far_center - (up * (far_height / 2)) - (right * (far_width / 2))
    corners['Far Bottom Right'] = far_center - (up * (far_height / 2)) + (right * (far_width / 2))

    return corners
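
Since the original goal was to decide whether a checkpoint's bounding box intersects the frustum, here is a minimal sketch of a conservative box test. It assumes the eight box corners have already been transformed into the camera frame (+Z forward, +X right, +Y down) and that fov is the vertical field of view in radians. The names aabb_corners and box_may_intersect_frustum are hypothetical. The test rejects a box only when all eight corners lie outside the same frustum plane, so it can occasionally keep a box that does not truly intersect, but it should never reject one that does.

```python
import itertools

import numpy as np


def aabb_corners(box_min, box_max):
    """Return the 8 corners of an axis-aligned box as an (8, 3) array."""
    return np.array(list(itertools.product(*zip(box_min, box_max))), dtype=float)


def box_may_intersect_frustum(corners_cam, fov_y, aspect, near, far):
    """Conservative frustum test for box corners given in the camera frame."""
    x, y, z = np.asarray(corners_cam, dtype=float).T
    tan_h = np.tan(fov_y / 2.0)
    tan_w = tan_h * aspect
    # One boolean array per frustum plane: True where a corner is outside it.
    outside = [
        z < near,
        z > far,
        x > tan_w * z,
        x < -tan_w * z,
        y > tan_h * z,
        y < -tan_h * z,
    ]
    # Reject only if every corner is outside one and the same plane.
    return not any(mask.all() for mask in outside)
```

Erring on the side of loading a checkpoint that turns out to be off-screen is usually cheaper than dropping one that should have been visible, which is why the conservative form is used here.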
