-
-
Notifications
You must be signed in to change notification settings - Fork 23.4k
FTI - Optimize SceneTree
traversal
#106244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FTI - Optimize SceneTree
traversal
#106244
Conversation
3397f0f
to
2392382
Compare
Rebased after #106224 . |
Just wildly throwing out some more names for review now as @rburing may not be available, and we should really look at getting this merged before feature freeze for 4.5. Anyone welcome to review / test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested locally on all 3D demos that use physics, it works as expected (at 20 tps). Code looks good to me.
We should probably remove the Legacy codepath before 4.5 is released unless there's a good reason to keep it selectable as an option. The change should be 100% transparent for users anyway (nonwithstanding bugs).
2392382
to
c7764ef
Compare
Yup we should be able to, it's just there as a temporary hedge for regressions really. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I couldn't review earlier, I was away due to family circumstances.
This optimization strategy makes sense and the code looks reasonable to me.
Thanks @Calinou for the extra testing.
Thanks! |
@lawnjelly do you know if this would affect how physics interpolation would interact with a Building the latest It seems like the |
@TranquilMarmot Yes I'm fully expecting one or two regressions that I've missed, that's party why I added the legacy option. Do feel free to create a separate issue for this with a minimum reproduction project (MRP) rather than tagging onto the PR, as it's easier to manage and other people will see it who might have the same problem. I'll start investigating when I get your MRP. 👍 EDIT: Trucktown as I thought works fine (it was used quite a bit in testing) and that has |
I wasn't sure where to ask, but I'm running into the "scene_tree_depth_limit". Is this value exposed someplace? I was able to run my scene in Legacy mode, and also when I changed the default from 32 to 64 in C++ it runs correctly. Without this change the engine crashes at line 314 in "scene_tree_fti.cpp", "dest_list.push_back(s);". |
I'll take a look, there could be a bug in the handling of this situation (as it's pretty rare).
I'll see if I can fix this up. |
This takes the
SceneTreeFTI
and optimizes it up to eleven.Benchmarking shows this to be approx 2-10x faster for the physics interpolation.
Fixes #106185
Forward port of #105728
N.B. existing un-optimized method is still available, making this PR very safe
It is possible to switch between this new optimized implementation and the old reference implementation via project setting
physics/3d/physics_interpolation/scene_traversal
, which makes this far less risky to introduce as users can always change to the existing method if there are any regressions.MRP
For the MRP in #106185, fps with physics interpolation ON is now 360fps, matching fps with physics interpolation OFF.
The physics interpolation itself I've timed as > 2500x faster. 👍 😎
That's fairly extreme, but represents the performance improvement in static parts of the scene.
These are the debug log results for the MRP:
Introduction
The original implementation for scene tree traversal in
SceneTreeFTI
was naive but foolproof, I had always intended to write a more optimized version, but didn't want to make the original PR too hard to review / understand.Optimization is a trade off - it offers better performance, at a cost of readability and complexity, so we normally reserve it for bottleneck areas. The
SceneTreeFTI
is such a bottleneck area (like rendering or physics), and so some trade offs are made here for performance. Fair warning it is not for the faint of heart.How it works
Instead of naively traversing the entire scene tree, we only traverse the branches that have changed.
As nodes are moved we record them in frame xform lists for later processing. We ideally want to sort these nodes by depth, as traversing down from the higher nodes (low
depth
, close to root) will prevent duplication of branch processing from lower nodes.Instead of sorting with e.g. quicksort, we maintain a fixed number of depth layers, and just place the node in the corresponding depth layer. Then as we process the nodes on the frame, we can do it in depth order with no sorting required. If we blow past the max layers, this is not a problem, it just might do a little more processing.
Further optimizations done in this PR
Debugging / Testing
3 modes are offered via a (temporary?) project setting:
Additionally there are new debugging compile defines:
GODOT_SCENE_TREE_FTI_PRINT_TREE
- prints the nodes processed.GODOT_SCENE_TREE_FTI_VERIFY
- kind of like a unit test, it uses both methods, and tests that the optimized result is the same as the naive full tree result.Testing / verification code
Verifying the results of the optimized path are the same as the reference path is itself rather complex, and is implemented here as a separate file, which will be compiled out in regular builds.
The tests file contains a duplicate of the traversal code. This makes both easier to read and understand, although it does mean the test would need to be kept in sync with any changes to the regular path if it to still do its job.
I did start by keeping both paths in the same function using
#ifdefs
, but it was becoming unreadable (for me, let alone reviewers), so on balance I have gone for a separate file. Another alternative would be to remove the testing code from the main Godot repo and do this independently, but it is kind of nice that anyone can easily run the testing just by definingGODOT_SCENE_TREE_FTI_VERIFY
.Example debug logging when DEBUG is set in the project setting
This shows how many nodes were touched, how many processed, and timings.
Additionally, what should be a very useful function, it lists all nodes that were moved during the frame. When using physics interpolation most nodes should be moved during the physics tick, and moving during the frame is normally a user error unless that branch has been switched to
physics_interpolation_mode
OFF.This allows users to quickly track down which nodes in their scene might be causing problems with physics interpolation. Particularly useful when converting existing game projects, especially ones you have not authored.
Notes
SceneTreeFTITests
, but I wasn't sure whether putting it inmain/tests
was a good idea if that folder uses auto-generation of tests (as this is nothing like the unit tests).