-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Optimize Visibility Systems #22226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Visibility Systems #22226
Conversation
…ng for tracking previous visibility
|
This needs a comments pass - visibility comments are pretty out of date, and this makes it worse. |
|
Just grokked some things about why the way the code is the way it is. Need to fix some things and trying to make it way less confusing. |
This reverts commit cb419e4.
|
Some overall frametime numbers from Griffin on discord |
|
Verified this exact same behavior reproduces on a lower CPU spec windows machine as well. This confirms that bimodal behavior seemingly coming from the hash set is not platform specific. Saw a 8.25ms main -> 7.64ms frametime improvement once I upped the parallelism of the thread pool. |
|
Is the only transitions that needs to be verified is changing from visible to hidden? The inverse (hidden to visible) is not needed? |
|
What do you mean by "verified"? Do these docs help? (on |
|
maybe would having a N-M relationship (when available) that links entities to cameras replace that component? #[derive(Component)]
#[relationship(relationship_target = VisibleIn)]
struct Viewing(HashSet<Entity>);
#[derive(Component)]
#[relationship_target(relationship = Viewing)]
struct VisibleIn(HashSet<Entity>);and checking if the entity has |
|
What does using a relationship provide here? I don't follow the purpose of your suggestion. Revisiting your original question, I think I understand what you're asking now
No, we need to prevent all spurious change triggers, the This is challenging to do efficiently, and especially in a parallelizable way. The docs explain why we need to track this state in a scratch space, and doing so in a component makes parallel updates easy compared to a single global hash set. It's not immediately clear how using relations would help with that.
I don't think so, Also, as far as I've seen in the past, relationship updates are extremely slow compared to dense parallel component updates. |
…t spurious change detection with trait.
|
Ok, that cleared my question |
# Objective - Add a stress test that exercises the 3d mesh pipeline for dynamic scenes. - Large static scenes like caldera hotel don't expose performance issues when many meshes are moving. - Give us a way to benchmark PRs like - #22297 - #22281 - #22226 ## Solution - Make a 3d version of `bevymark`, sticking to the existing patterns as closely as possible. ## Testing <img width="1072" height="684" alt="image" src="https://github.com/user-attachments/assets/41214ba9-ffad-471d-a320-1f007490dead" /> --------- Co-authored-by: Carter Anderson <mcanders1@gmail.com>
alice-i-cecile
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple enough, and those numbers are great. Thanks!
pcwalton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, only one minor nit.
# Objective - Make visibility systems less slow. - Saves as much as 1.2ms of CPU time on a (GPU bound) 6.7ms frame, rendering caldera hotel. - Fixes #22256 ## Solution - Replace the `EntityHashSet` with a component added directly to entities. This still allows for correct change detection triggers on visibility, but avoids hashing. This also enables parallel updates. ## Summary - This PR lead to a rabbit hole that uncovered #22256 - This PR resolves the regression introduced since 0.17 released - A more fair comparison is not to main (which has a regression), but to 0.17 - Compared to 0.17.3, this is a 27% speedup to the `PostUpdate` schedule for caldera hotel on an M4 Max <img width="558" height="251" alt="image" src="https://github.com/user-attachments/assets/2d711e3c-b65e-4c3a-9d2a-a587f8f36aae" /> ## Testing - Many cubes, many foxes, caldera - Yellow is new, red is old. - Note the bimodality in the old traces, this was consistently repeatable, and seems to have something to do with `EntityHashSet` and `EntityHashMap`. Worth investigating that further, as I've seen that bimodal behavior before, and blamed it on pcores vs ecores, but I verified that is not happening. - Summary: caldera hotel no longer exhibits bimodal performance with a pathological mode that runs very slowly. Roughly comparing the ~90th percentile performance, the optimized code is about 1.2ms faster. This is particularly significant when you consider this is running on a device that is already hitting 150fps (GPU bound), so it only has a frame budget of 6.7ms. - Notably, visibility checking was the last egregiously performing set of systems in the common hot path of most bevy apps, and no longer stick out as obviously slow in a frame profile: A high level comparison of the change in CPU time by looking at just PostUpdate between old(red) and new(yellow) <img width="1738" height="512" alt="image" src="https://github.com/user-attachments/assets/5252ff0f-d9f3-49d1-a3ce-309bd84d5485" /> ### caldera hotel - all entities in view For this test, I didn't touch the camera, so all entities were in view at all times. `check_visibility` <img width="1262" height="397" alt="image" src="https://github.com/user-attachments/assets/c80c470a-1548-42d2-838c-4e74525a6cb8" /> `reset_view_visibility` <img width="1262" height="407" alt="image" src="https://github.com/user-attachments/assets/7b550823-f832-41f1-9e42-20cab342b0f2" /> `mark_newly_hidden_entities_invisible` This is 23us slower than the "fast mode" of the existing code on main, but 180-250us *faster* than the weird "slow mode" on main. The new code is significantly more stable, and does not exhibit the super slow mode. <img width="1256" height="418" alt="image" src="https://github.com/user-attachments/assets/b7cbf57d-a221-468a-9af8-3381981874b2" /> ### caldera hotel - no entities in view For this test, I immediately rotated the camera so the hotel was not in view. `check_visibility` This is largely a wash. The old code is 2us faster, but this is likely in the realm of noise. <img width="1255" height="372" alt="image" src="https://github.com/user-attachments/assets/2c795f72-ebea-4b35-ba19-a8eccd4c56fc" /> `reset_view_visibility` This is a big win. The main peak is about 30us faster now, but the major win is the worst case, which is nearly 500us faster. <img width="1252" height="398" alt="image" src="https://github.com/user-attachments/assets/f8a9d99f-a6e1-48db-8639-d52276dba9ab" /> `mark_newly_hidden_entities_invisible` Same story as the other caldera comparison with everything in view, this is a bit slower than the fast mode, but way faster than the slow mode on main. <img width="1248" height="400" alt="image" src="https://github.com/user-attachments/assets/4564d94e-911a-4b3f-a584-c6e102ef2dd0" />
# Objective - Fix when an entity becomes visible, it latches and is impossible to hide. - Introduced in #22226 ## Solution - Ensure we clear the current state bit when shifting - Add regression test
# Objective - Fix when an entity becomes visible, it latches and is impossible to hide. - Introduced in #22226 ## Solution - Ensure we clear the current state bit when shifting - Add regression test
Objective
d00240d#22256Solution
EntityHashSetwith a component added directly to entities. This still allows for correct change detection triggers on visibility, but avoids hashing. This also enables parallel updates.Summary
d00240d#22256PostUpdateschedule for caldera hotel on an M4 MaxTesting
EntityHashSetandEntityHashMap. Worth investigating that further, as I've seen that bimodal behavior before, and blamed it on pcores vs ecores, but I verified that is not happening.A high level comparison of the change in CPU time by looking at just PostUpdate between old(red) and new(yellow)
caldera hotel - all entities in view
For this test, I didn't touch the camera, so all entities were in view at all times.
check_visibilityreset_view_visibilitymark_newly_hidden_entities_invisibleThis is 23us slower than the "fast mode" of the existing code on main, but 180-250us faster than the weird "slow mode" on main. The new code is significantly more stable, and does not exhibit the super slow mode.
caldera hotel - no entities in view
For this test, I immediately rotated the camera so the hotel was not in view.
check_visibilityThis is largely a wash. The old code is 2us faster, but this is likely in the realm of noise.
reset_view_visibilityThis is a big win. The main peak is about 30us faster now, but the major win is the worst case, which is nearly 500us faster.
mark_newly_hidden_entities_invisibleSame story as the other caldera comparison with everything in view, this is a bit slower than the fast mode, but way faster than the slow mode on main.