-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[Merged by Bors] - Dynamic light clusters #3968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
With my Sponza 256 point light test scene, I used to not be able to use larger point light ranges, so I used 1m. With that configuration I observe no significant loss in performance just looking at FPS. If I used light ranges of 2m, I'd hit the cluster light limits. Now I can increase it to 2m and it will continue to work. The frame rate drops significantly but then more lights are affecting more fragments on the screen so that's not surprising. Here's a video with the cluster coherency debug enabled so you can see what is going on: Chatting on Discord, @robtfm suggested bringing the camera far plane closer. I noted that the camera far plane made sense to use as the far bound of the furthest depth slice, but it only has to be as far as the furthest light sphere. We could evaluate the near/far based on where the closest/furthest light sphere bounds are and then get better use of the available clusters which should improve performance a bit. Unless it hurts the performance due to causing more light sphere - cluster intersections. But that can probably be optimised some more. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Robert Swain <robert.swain@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a suggestion for top-down games.
Co-authored-by: Robert Swain <robert.swain@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The handling of and comments about view z for orthographic look wrong to me. It mostly looks good though. :)
crates/bevy_pbr/src/light.rs
Outdated
@@ -479,6 +590,98 @@ fn ndc_position_to_cluster( | |||
.clamp(UVec3::ZERO, cluster_dimensions - UVec3::ONE) | |||
} | |||
|
|||
// Calculate an AABB for the light in view space, returns a (Vec3, Vec3) containing min and max with | |||
// - x and y in view space with range [-1, 1] | |||
// - z in world space with view orientation, with range [-inf, -0.0001] for perspective, and [1.0000, inf] for orthographic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// - z in world space with view orientation, with range [-inf, -0.0001] for perspective, and [1.0000, inf] for orthographic | |
// z in view space with range [-inf, -f32::EPSILON] for perspective, and [1.0000, inf] for orthographic |
Also, the difference in ranges between perspective and orthographic is confusing to me as the coordinate space axis orientation is the same (i.e. view z is not flipped.) View z in front of the camera is always negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worded the comment slightly differently, and removed the distinction for orthographic in this function.
crates/bevy_pbr/src/light.rs
Outdated
// constraint z to be negative - i.e. in front of the camera | ||
light_aabb_view_min.z = light_aabb_view_min.z.min(-0.0001); | ||
light_aabb_view_max.z = light_aabb_view_max.z.min(-0.0001); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// constraint z to be negative - i.e. in front of the camera | |
light_aabb_view_min.z = light_aabb_view_min.z.min(-0.0001); | |
light_aabb_view_max.z = light_aabb_view_max.z.min(-0.0001); | |
// Constrain view z to be negative - i.e. in front of the camera | |
// When view z is >= 0.0 and we're using a perspective projection, bad things happen. | |
// At view z == 0.0, ndc x,y are mathematically undefined. At view z > 0.0, i.e. behind the camera, | |
// the perspective projection flips the directions of the axes. This breaks assumptions about | |
// use of min/max operations as something that was to the left in view space is now returning a | |
// coordinate that for view z in front of the camera would be on the right, but at view z behind the | |
// camera is on the left. So, we just constrain view z to be < 0.0 and necessarily in front of the camera. | |
light_aabb_view_min.z = light_aabb_view_min.z.min(-f32::EPSILON); | |
light_aabb_view_max.z = light_aabb_view_max.z.min(-f32::EPSILON); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This made sense after I'd understood (a tautology) but I wanted to see it in code too so I quickly hacked up this:
use bevy::{
math::{Vec3Swizzles, Vec4Swizzles},
prelude::*,
};
fn main() {
let view_pos = Vec3::new(-1.0, -1.0, -1.0);
let projection = Mat4::perspective_infinite_reverse_rh(std::f32::consts::FRAC_PI_4, 1.0, 0.1);
let in_front_clip = projection * view_pos.extend(1.0);
let in_front_ndc = in_front_clip.xyz() / in_front_clip.w;
let behind_clip = projection * view_pos.xy().extend(1.0).extend(1.0);
let behind_ndc = behind_clip.xyz() / behind_clip.w;
dbg!(in_front_clip);
dbg!(in_front_ndc);
dbg!(behind_clip);
dbg!(behind_ndc);
}
gives
[examples/3d/3d_scene.rs:13] in_front_clip = Vec4(
-2.4142134,
-2.4142134,
0.1,
1.0,
)
[examples/3d/3d_scene.rs:14] in_front_ndc = Vec3(
-2.4142134,
-2.4142134,
0.1,
)
[examples/3d/3d_scene.rs:15] behind_clip = Vec4(
-2.4142134,
-2.4142134,
0.1,
-1.0,
)
[examples/3d/3d_scene.rs:16] behind_ndc = Vec3(
2.4142134,
2.4142134,
-0.1,
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one change left I think, depending on what you think.
Co-authored-by: Robert Swain <robert.swain@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On first pass this looks reasonable. Haven't validated all of the math yet or tested on large scenes, but I'm on board with the concept and the general implementation.
I'll do another pass after lunch :)
Looks good to me! Couldn't get it to produce anything "obviously wrong" when spamming a scene with various moving lights. Great job! |
bors r+ |
# Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
# Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
# Objective provide some customisation for default cluster setup avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump) ## Solution Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants: - None (do not do any light assignment - for views which do not require light info, e.g. minimaps etc) - Single (one cluster) - XYZ (explicit cluster counts in each dimension) - FixedZ (most similar to current - specify Z-slices and total, then x and y counts are dynamically determined to give approximately square clusters based on current aspect ratio) Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup. Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small. notes: - I didn't put ClusterConfig in the camera bundles to avoid introducing a dependency from bevy_render to bevy_pbr. the ClusterConfig enum comes with a pbr-centric impl block so i didn't want to move that into bevy_render either. - ~Might want to add None variant to cluster config for views that don't care about lights?~ - Not well tested for orthographic - ~there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking~ (outdated, i will bring it up to date if required) anecdotal timings: FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded
Objective
provide some customisation for default cluster setup
avoid "cluster index lists is full" in all cases (using a strategy outlined by @superdump)
Solution
Add ClusterConfig enum (which can be inserted into a view at any time) to allow specifying cluster setup with variants:
Defaults to FixedZ { total: 4096, z: 24 } which is similar to the current setup.
Per frame, estimate the number of indices that would be required for the current config and decrease the cluster counts / increase the cluster sizes in the x and y dimensions if the index list would be too small.
notes:
Might want to add None variant to cluster config for views that don't care about lights?there's a cluster_muck branch on my repo which includes some diagnostics / a modified lighting example which may be useful for tyre-kicking(outdated, i will bring it up to date if required)anecdotal timings:
FPS on the lighting demo is negligibly better (~5%), maybe due to a small optimisation constraining the light aabb to be in front of the camera
FPS on the lighting demo with 100 extra lights added is ~33% faster, and also renders correctly as the cluster index count is no longer exceeded