[3.x] FTI - Optimize SceneTree traversal
#105728
Merged
This takes the SceneTreeFTI and optimizes it up to eleven.
Benchmarking shows this to be approx 2-10x faster for the physics interpolation.
Introduction
The original implementation for scene tree traversal in SceneTreeFTI was naive but foolproof. I had always intended to write a more optimized version, but didn't want to make the original PR too hard to review / understand.
Optimization is a trade-off: it offers better performance at a cost of readability and complexity, so we normally reserve it for bottleneck areas. The SceneTreeFTI is such a bottleneck area (like rendering or physics), and so some trade-offs are made here for performance. Fair warning: it is not for the faint of heart.
How it works
Instead of naively traversing the entire scene tree, as nodes are moved we record them in frame xform lists for later processing. We ideally want to sort these nodes by depth, as traversing down from the higher nodes (low depth, close to root) will prevent duplication of branch processing from lower nodes.
Instead of sorting with e.g. quicksort, we maintain a fixed number of depth layers, and just place the node in the corresponding depth layer. Then as we process the nodes on the frame, we can do it in depth order with no sorting required. If we blow past the max layers, this is not a problem, it just might do a little more processing.
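The depth-layer idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the Node struct, FrameXformLists layout, MAX_DEPTH_LAYERS value, and all function names are hypothetical stand-ins. Moved nodes are dropped into a bucket by depth, buckets are processed from shallowest to deepest, and a node whose branch was already handled from higher up is skipped. Nodes deeper than the last layer share one bucket, so they may be visited more than once.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scene node; this is not Godot's Node class.
struct Node {
	std::string name;
	std::vector<Node *> children;
	int depth = 0; // 0 at the root, increasing towards the leaves.
	bool done = false; // Set once this node has been processed this frame.

	explicit Node(std::string p_name) : name(std::move(p_name)) {}
};

// Attach a child and derive its depth from the parent (call top-down).
void add_child(Node &p_parent, Node &p_child) {
	p_child.depth = p_parent.depth + 1;
	p_parent.children.push_back(&p_child);
}

// Assumed layer count for illustration; the real value is not stated above.
constexpr int MAX_DEPTH_LAYERS = 4;

struct FrameXformLists {
	// One bucket per depth layer. Nodes deeper than the last layer share it.
	std::vector<Node *> layers[MAX_DEPTH_LAYERS];

	// Called when a node moves: constant-time placement, no sorting.
	void record_moved(Node *p_node) {
		assert(p_node != nullptr && p_node->depth >= 0);
		const int layer = p_node->depth < MAX_DEPTH_LAYERS ? p_node->depth : MAX_DEPTH_LAYERS - 1;
		layers[layer].push_back(p_node);
	}
};

// Process a node and everything below it, recording each visit.
void update_branch(Node *p_node, std::vector<Node *> &r_visited) {
	p_node->done = true;
	r_visited.push_back(p_node);
	for (Node *child : p_node->children) {
		update_branch(child, r_visited);
	}
}

// Walk the layers from shallowest to deepest; return the visit order by name.
std::vector<std::string> process_frame(FrameXformLists &r_lists) {
	std::vector<Node *> visited;
	for (int layer = 0; layer < MAX_DEPTH_LAYERS; layer++) {
		for (Node *node : r_lists.layers[layer]) {
			if (node->done) {
				continue; // Already covered by a branch started higher up.
			}
			update_branch(node, visited);
		}
		r_lists.layers[layer].clear();
	}

	std::vector<std::string> names;
	for (Node *node : visited) {
		node->done = false; // Reset for the next frame.
		names.push_back(node->name);
	}
	return names;
}
```

In this sketch, recording a node and one of its ancestors in the same frame processes the shared branch once. If both fall into the overflow layer in the wrong order, the lower node is simply visited twice, which matches the "a little more processing" case described above.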
Further optimizations done in this PR
Debugging / Testing
3 modes are offered via a (temporary?) project setting:
Additionally there are new debugging compile defines:
GODOT_SCENE_TREE_FTI_PRINT_TREE - prints the nodes processed.
GODOT_SCENE_TREE_FTI_VERIFY - kind of like a unit test, it uses both methods, and tests that the optimized result is the same as the naive full tree result.
Testing / verification code
Verifying that the results of the optimized path are the same as the reference path is itself rather complex, and is implemented here as a separate file, which will be compiled out in regular builds.
The tests file contains a duplicate of the traversal code. This makes both easier to read and understand, although it does mean the test would need to be kept in sync with any changes to the regular path if it is to still do its job.
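As a rough illustration of the verification idea (not the actual tests file), the sketch below compares cached results written by an optimized branch update against a full naive recompute, and only runs the check when GODOT_SCENE_TREE_FTI_VERIFY is defined. Apart from that define, everything here is an assumption made for the example: the 1D Node struct and all function names are hypothetical, and the real code works on full transforms and duplicates the traversal in a separate file.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D stand-in for a node: a local offset plus a cached global one.
struct Node {
	float local = 0.0f;
	float global = 0.0f;
	std::vector<Node *> children;
};

// Stand-in for the optimized path: refresh only the branch under a moved node.
void optimized_update_branch(Node &r_node, float p_parent_global) {
	r_node.global = p_parent_global + r_node.local;
	for (Node *child : r_node.children) {
		optimized_update_branch(*child, r_node.global);
	}
}

// Reference path: recompute every global value from the root, ignoring the cache.
void reference_collect(const Node &p_node, float p_parent_global, std::vector<float> &r_out) {
	const float global = p_parent_global + p_node.local;
	r_out.push_back(global);
	for (const Node *child : p_node.children) {
		reference_collect(*child, global, r_out);
	}
}

// Gather the cached values in the same traversal order as the reference path.
void cached_collect(const Node &p_node, std::vector<float> &r_out) {
	r_out.push_back(p_node.global);
	for (const Node *child : p_node.children) {
		cached_collect(*child, r_out);
	}
}

// True when the cached (optimized) results match a full naive recompute.
bool verify_against_reference(const Node &p_root) {
	std::vector<float> expected, actual;
	reference_collect(p_root, 0.0f, expected);
	cached_collect(p_root, actual);
	for (std::size_t i = 0; i < expected.size(); i++) {
		if (std::fabs(expected[i] - actual[i]) > 1e-6f) {
			return false;
		}
	}
	return true;
}

// Per-frame entry point: the check only exists in builds that define the macro.
void frame_update(Node &r_root, Node &r_moved, float p_parent_global) {
	optimized_update_branch(r_moved, p_parent_global);
#ifdef GODOT_SCENE_TREE_FTI_VERIFY
	assert(verify_against_reference(r_root));
#else
	(void)r_root; // Unused when verification is compiled out.
#endif
}
```

Keeping the reference path as its own set of functions, rather than interleaving it with the optimized path, is the same readability trade-off described above: the two must be kept in sync by hand, but each stays easy to follow.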
I did start by keeping both paths in the same function using #ifdefs, but it was becoming unreadable (for me, let alone reviewers), so on balance I have gone for a separate file. Another alternative would be to remove the testing code from the main Godot repo and do this independently, but it is kind of nice that anyone can easily run the testing just by defining GODOT_SCENE_TREE_FTI_VERIFY.
Example debug logging when DEBUG is set in the project setting
This shows how many nodes were touched, how many processed, and timings.
Additionally, in what should be a very useful feature, it lists all nodes that were moved during the frame. When using physics interpolation most nodes should be moved during the physics tick, and moving during the frame is normally a user error unless that branch has been switched to physics_interpolation_mode OFF.
This allows users to quickly track down which nodes in their scene might be causing problems with physics interpolation. This is particularly useful when converting existing game projects, especially ones you have not authored.
Notes
SceneTreeFTITests, but I wasn't sure whether putting it in main/tests was a good idea if that folder uses auto-generation of tests (as this is nothing like the unit tests).