FTI - Add `Node::get_children_fast()` for `SceneTreeFTI` #106214

lawnjelly · 2025-05-09T16:59:16Z

Adds a new allocation-free Node function that retrieves a node's children efficiently into a caller-supplied LocalVector.

Fixes worst aspect of #106185

MRP

(release build)
Before this PR: 160fps
After this PR: 360fps
Physics Interpolation off: 360fps

Note that this PR likely doesn't solve all performance issues in #106185, but seems to ameliorate a particular problem in 4.x. I had to double check the timings, because the difference is so pronounced.

Discussion

#104269 introduced the new 3D FTI using a reference approach that traverses the entire scene tree, which is expected to be slow with large numbers of nodes in the tree. This is to be considerably optimized in #105728 (which only updates branches that have changed) and #105901.

However, the reason for most of the slowdown in #106185 was unexpected: changes made for 4.x in #75627 seem to make it (very?) inefficient to access children via get_child_count() and get_child(). This problem does not exist in 3.x.

Profiling shows this to be the case (this is debug, but should represent release to an extent):

Investigating suggested that get_child_count() and get_child() could be very inefficient, especially in debug with thread checks.
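To make the cost difference concrete, here is a minimal sketch of the two access patterns. It does not use Godot's real classes: `FakeNode`, the `guarded_calls` counter, and `std::vector` (standing in for `LocalVector`) are assumptions for illustration only. The counter models the fixed cost each accessor call pays (thread checks, cache validation); the real overhead in 4.x may be distributed differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a scene node; this is NOT Godot's Node class.
struct FakeNode {
	std::vector<FakeNode *> children_cache;
	mutable uint64_t guarded_calls = 0; // Counts simulated per-call safety checks.

	// Each accessor call pays the simulated fixed cost once.
	int get_child_count() const {
		guarded_calls++;
		return static_cast<int>(children_cache.size());
	}
	FakeNode *get_child(int p_index) const {
		guarded_calls++;
		return children_cache[p_index];
	}
	// Bulk variant: one guarded call, children copied into a caller-owned buffer.
	// Copy assignment typically reuses the buffer's existing capacity.
	void get_children_fast(std::vector<FakeNode *> &r_children) const {
		guarded_calls++;
		r_children = children_cache;
	}
};

// Visits every child through the per-index accessors.
// Returns how many guarded calls the traversal cost.
uint64_t visit_indexed(const FakeNode &p_node) {
	const uint64_t before = p_node.guarded_calls;
	for (int n = 0; n < p_node.get_child_count(); n++) {
		(void)p_node.get_child(n);
	}
	return p_node.guarded_calls - before;
}

// Visits every child through a single bulk copy into r_scratch.
uint64_t visit_bulk(const FakeNode &p_node, std::vector<FakeNode *> &r_scratch) {
	const uint64_t before = p_node.guarded_calls;
	p_node.get_children_fast(r_scratch);
	for (FakeNode *child : r_scratch) {
		(void)child;
	}
	return p_node.guarded_calls - before;
}
```

With N children, the indexed loop makes 2N + 1 guarded calls (the count is re-queried on every iteration), while the bulk version makes one regardless of N.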

Notes

I don't use 4.x regularly, so I was somewhat surprised by this inefficiency.
Alternative naming suggestions are welcome.
We should probably review existing code in 4.x that uses the get_child_count() / get_child() pattern and consider replacing it.
The existing Node::get_children() method should probably become a wrapper around get_children_fast() or use a similar approach, because it is likely also inefficient.

clayjohn

Makes sense to me. I didn't run the code locally, but I compared it against the existing get_children() and this appears to do the same thing without most of the overhead.

lawnjelly · 2025-05-09T17:54:03Z

I figured out why the timings were so good! 😁
I forgot an & on the parameter so it was returning 0 each time!

Fixed now, so the improvement won't be quite as good lol.
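The thread does not show the original signature, so the following is only an illustration of this class of bug, with hypothetical function names. An output parameter taken by value fills a private copy, and the caller's container stays empty, which makes any loop over it look extremely fast.

```cpp
#include <vector>

// Bug: the output parameter is taken by value, so only a local copy is filled.
void fill_by_value(std::vector<int> r_out) {
	r_out.push_back(1);
	r_out.push_back(2);
}

// Fix: take the output parameter by reference so the caller's vector is filled.
void fill_by_reference(std::vector<int> &r_out) {
	r_out.push_back(1);
	r_out.push_back(2);
}
```

After `fill_by_value(v)` the caller's `v` is still empty; after `fill_by_reference(v)` it holds two elements.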

Tested and it's still loads faster. 👍

Yes the results are still 360fps after the PR (as before), and a benchmark testing just the iteration of the children (20000) shows get_children_fast() is approx 16x faster than the old get_child_count() and get_child() approach.

UPDATE:
Ah, there's another bug still to fix in the SceneTreeFTI, just fixing it now. Can't use a single Vector for the child list because the scene tree traversal is recursive.
May have to finish this up tomorrow, with a FixedVector or a LocalVector pool.
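A small sketch of why a single shared buffer breaks under recursion, and one possible shape for a pool. `TreeNode`, `visit_shared` and `visit_pooled` are hypothetical names, `std::vector` stands in for `LocalVector`, and the per-depth pool is an assumption about what such a pool could look like, not the approach the PR ended up taking.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tree node for illustration.
struct TreeNode {
	std::vector<const TreeNode *> children;
};

// Broken: every recursion level overwrites the same scratch buffer,
// so the caller's loop continues over the callee's (shorter) child list.
size_t visit_shared(const TreeNode *p_node, std::vector<const TreeNode *> &r_scratch) {
	r_scratch = p_node->children;
	size_t visited = 1;
	for (size_t n = 0; n < r_scratch.size(); n++) {
		visited += visit_shared(r_scratch[n], r_scratch);
	}
	return visited;
}

// One possible fix: keep one reusable buffer per recursion depth.
size_t visit_pooled(const TreeNode *p_node, std::vector<std::vector<const TreeNode *>> &r_pool, size_t p_depth = 0) {
	if (r_pool.size() <= p_depth) {
		r_pool.resize(p_depth + 1);
	}
	r_pool[p_depth] = p_node->children;
	size_t visited = 1;
	// Re-index the pool on each access: a deeper call may grow r_pool
	// and move its elements, which would invalidate a held reference.
	for (size_t n = 0; n < r_pool[p_depth].size(); n++) {
		visited += visit_pooled(r_pool[p_depth][n], r_pool, p_depth + 1);
	}
	return visited;
}
```

For a root with two children, one of which has three leaf children, the shared version visits only 3 of the 6 nodes, while the pooled version visits all 6.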

KoBeWi · 2025-05-09T18:25:42Z

You could skip the children cache if you don't care about the order of nodes and don't need to differentiate internal nodes.

KoBeWi · 2025-05-09T18:26:23Z

scene/main/scene_tree_fti.cpp

 	if (!s) {
-		for (int n = 0; n < p_node->get_child_count(); n++) {
-			_update_dirty_nodes(p_node->get_child(n), p_current_frame, p_interpolation_fraction, p_active, nullptr, p_depth + 1);
+		for (uint32_t n = 0; n < data.temp_child_list.size(); n++) {


This can be changed to a range-based loop.
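For reference, a minimal comparison of the two loop forms, using `std::vector<int>` as a stand-in container and hypothetical function names. Both visit the same elements in the same order.

```cpp
#include <cstdint>
#include <vector>

// Index-based loop, in the style of the diff above.
int sum_indexed(const std::vector<int> &p_values) {
	int total = 0;
	for (uint32_t n = 0; n < p_values.size(); n++) {
		total += p_values[n];
	}
	return total;
}

// Equivalent range-based loop.
int sum_ranged(const std::vector<int> &p_values) {
	int total = 0;
	for (int value : p_values) {
		total += value;
	}
	return total;
}
```

One caveat: a range-based loop is only safe when the container is not modified during iteration, which matters if the loop body recurses into code that refills the same list.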

kiroxas · 2025-05-09T18:41:00Z

scene/main/node.cpp

+	if (p_include_internal) {
+		uint32_t num_children = data.children_cache.size();
+		r_children.resize(num_children);
+		for (uint32_t n = 0; n < num_children; n++) {


Maybe worth trying a memcpy here; it would have a very good chance of using SIMD for the copy.
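A sketch of what that could look like, assuming the source and destination are contiguous arrays of raw pointers. `Item` and `copy_pointers` are hypothetical, and `std::vector` stands in for `LocalVector`. Raw pointers are trivially copyable, so a byte-wise copy is valid here.

```cpp
#include <cstring>
#include <vector>

// Hypothetical element type; only pointers to it are copied.
struct Item {};

// Copies all pointers from p_src into r_dst with a single memcpy.
void copy_pointers(const std::vector<Item *> &p_src, std::vector<Item *> &r_dst) {
	r_dst.resize(p_src.size());
	if (!p_src.empty()) { // Avoid passing possibly-null data() pointers to memcpy.
		std::memcpy(r_dst.data(), p_src.data(), p_src.size() * sizeof(Item *));
	}
}
```

Optimizing compilers often turn a plain element-by-element copy loop into the same thing, so the gain is worth measuring rather than assuming.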

lawnjelly · 2025-05-09T20:15:18Z

Actually, on reflection, given the recursion problem, I think I might try a simpler approach tomorrow and just provide fast access directly to the children cache for SceneTreeFTI. I have a draft PR for this.

It seems it is already cached as a LocalVector, so we can avoid the recursion problem by accessing it directly and avoiding the safety protections which are what is slowing this down.
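As a rough sketch of the idea (not the code from the follow-up PR), exposing the existing cache by const reference removes both the copy and the shared temporary, so recursion is no longer a problem. `CachedNode`, `get_children_cache()` and `count_nodes` are hypothetical names, and `std::vector` stands in for `LocalVector`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node that already keeps its children in a contiguous cache.
class CachedNode {
	std::vector<CachedNode *> children_cache;

public:
	void add_child(CachedNode *p_child) { children_cache.push_back(p_child); }

	// Direct read-only access: no copy, no per-element accessor calls.
	// The caller must not add or remove children while iterating.
	const std::vector<CachedNode *> &get_children_cache() const { return children_cache; }
};

// Recursive traversal; each level iterates its own node's cache,
// so no scratch buffer is shared between recursion levels.
size_t count_nodes(const CachedNode *p_node) {
	size_t total = 1;
	for (const CachedNode *child : p_node->get_children_cache()) {
		total += count_nodes(child);
	}
	return total;
}
```

The trade-off is the one named above: the caller gives up the safety protections and has to guarantee the cache is valid and unchanged for the duration of the traversal.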

lawnjelly added this to the 4.5 milestone May 9, 2025

lawnjelly requested a review from a team as a code owner May 9, 2025 16:59

lawnjelly added topic:core topic:rendering topic:3d performance labels May 9, 2025

clayjohn approved these changes May 9, 2025

lawnjelly marked this pull request as draft May 9, 2025 17:50

FTI - Add Node::get_children_fast() for SceneTreeFTI

432f7ae

lawnjelly force-pushed the fti_get_children_fast branch from b54b49b to 432f7ae Compare May 9, 2025 17:52

lawnjelly marked this pull request as ready for review May 9, 2025 17:55

lawnjelly marked this pull request as draft May 9, 2025 18:17

KoBeWi reviewed May 9, 2025

kiroxas reviewed May 9, 2025

lawnjelly closed this May 9, 2025

lawnjelly added the archived label May 9, 2025

lawnjelly mentioned this pull request May 10, 2025

SceneTreeFTI faster access to Node children #106224

Merged

AThousandShips removed this from the 4.5 milestone May 10, 2025

KoBeWi mentioned this pull request May 11, 2025

Rework Node duplicate #106287

Open
