-
-
Notifications
You must be signed in to change notification settings - Fork 4k
Meshlet BVH Culling #19318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Meshlet BVH Culling #19318
Conversation
Readd software raster
// let world_sphere_radius = lod_sphere.w * world_scale; | ||
// let norm_error = simplification_error / world_sphere_radius * 0.25; | ||
// return norm_error * view.viewport.w < 1.0; | ||
let world_error = simplification_error * world_scale; | ||
let proj = projection[1][1]; | ||
let height = 2.0 / proj; | ||
let norm_error = world_error / height; | ||
return norm_error * view.viewport.w < 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure how correct my change (from the commented code) is, though it seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok did my first pass of this.
A lot of it I just skimmed because there's way too much code for me to verify line by line, so I'm just going to trust a lot of the BVH traversal stuff and resource handling is correct.
Here's what I still need to verify:
- from_mesh() code
- Occlusion culling changes
- LOD selection changes
- Try it out and test perf (my current computer doesn't support 64bit atomics unfortunately)
write_slice(&asset.meshlet_simplification_errors, &mut writer)?; | ||
write_slice(&asset.meshlet_cull_data, &mut writer)?; | ||
writer.write_all(bytemuck::bytes_of(&asset.aabb))?; | ||
writer.write_all(bytemuck::bytes_of(&asset.bvh_depth))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move the instance abbb + bvh_depth to before the FrameEncoder?
"remap_1d_to_2d_dispatch.wgsl", | ||
Shader::from_wgsl | ||
); | ||
embedded_asset!(app, "clear_visibility_buffer.wgsl"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move clear_visibility_buffer to above cull_instances.
let base_bvh_node_index = (buffer_offset / size as u64) as u32; | ||
for (i, &node) in self.iter().enumerate() { | ||
let bytes = bytemuck::cast::<_, [u8; size_of::<BvhNode>()]>(BvhNode { | ||
aabbs: core::array::from_fn(|i| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pull some of this code out into a new function? It's hard to understand imo.
@@ -233,6 +233,7 @@ pub fn extract_meshlet_mesh_entities( | |||
} | |||
|
|||
// Iterate over every instance | |||
// TODO: Switch to change events to not upload every instance every frame. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah better instance handling on the CPU is a big todo. I know @tychedelia is also gonna look into caching material specialization for meshlet instances as well.
#import bevy_render::maths::affine3_to_square | ||
|
||
// https://github.com/zeux/meshoptimizer/blob/1e48e96c7e8059321de492865165e9ef071bffba/demo/nanite.cpp#L115 | ||
fn lod_error_is_imperceptible(lod_sphere: vec4<f32>, simplification_error: f32, instance_id: u32) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: I need to review this carefully
} | ||
}; | ||
match &mut resource_manager.cluster_meshlet_ids { | ||
instance_manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test that this is equally as fast as upload_storage_buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, upload_storage_buffer
did the same thing internally so I just got rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might've added that after writing upload_storage_buffer 😅
@@ -48,63 +74,128 @@ struct DrawIndirectArgs { | |||
first_instance: u32, | |||
} | |||
|
|||
struct InstancedOffset { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some comments here describing what InstancedOffset is, and show some example code of how you map it to cluster/meshlet/triangle/etc?
if any((max_texel >> vec2(mip)) > (min_texel >> vec2(mip)) + 3) { | ||
mip += 1u; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the code we can remove when we fix non-pot HiZ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that would be the + 1u
(and the clamping) above.
} | ||
|
||
@compute | ||
@workgroup_size(128, 1, 1) // 8 threads per node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 node is handled by 8 threads, so 16 nodes per WG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup.
@@ -19,11 +23,13 @@ use meshopt::{ | |||
use metis::{option::Opt, Graph}; | |||
use smallvec::SmallVec; | |||
use thiserror::Error; | |||
use tracing::debug_span; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: I need to review this whole file, haven't looked at it yet
What was wrong with each of these? |
Objective
Solution
Testing