Cache component columns in archetypes #19096

Open
wants to merge 16 commits into
base: main

@ElliottjPierce ElliottjPierce commented May 6, 2025

Objective

This is an alternative to #19063 that uses the existing plan here.

The goal is to reduce the lookups on component maps, maps where ComponentId is the key. This will be really important when these maps move from a sparse set to a hash map to support components as entities.

Solution

Store a new ComponentRecord in the existing ComponentIndex instead of an ArchetypeRecord. This data structure tracks the archetypes that have a given component; now it also tracks the table columns of that component. In a query fetch, use a HashMap to get that ComponentRecord and then index into its table columns, instead of doing a map lookup on each set_table.
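As a rough illustration of the shape described above, here is a minimal, self-contained sketch. It is not the code in this PR: every type, field, and method in it (including `register` and `column_in`) is a hypothetical stand-in, and it assumes table ids are dense enough to index a `Vec` directly.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration; not Bevy's actual definitions.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ComponentId(usize);
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ArchetypeId(usize);
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct TableId(usize);

/// Per-component record: which archetypes contain the component and,
/// for each table, which column stores it.
#[derive(Default)]
struct ComponentRecord {
    archetypes: Vec<ArchetypeId>,
    // Indexed by `TableId`; `None` means the table lacks this component.
    table_columns: Vec<Option<usize>>,
}

impl ComponentRecord {
    /// Records that `archetype` (stored in `table`) holds this component in `column`.
    fn register(&mut self, archetype: ArchetypeId, table: TableId, column: usize) {
        self.archetypes.push(archetype);
        if self.table_columns.len() <= table.0 {
            self.table_columns.resize(table.0 + 1, None);
        }
        self.table_columns[table.0] = Some(column);
    }

    /// Plain indexing, no hashing: the per-table step of a fetch.
    fn column_in(&self, table: TableId) -> Option<usize> {
        self.table_columns.get(table.0).copied().flatten()
    }
}

/// Maps each component to its record.
#[derive(Default)]
struct ComponentIndex {
    records: HashMap<ComponentId, ComponentRecord>,
}

fn main() {
    let mut index = ComponentIndex::default();
    let position = ComponentId(0);
    index
        .records
        .entry(position)
        .or_default()
        .register(ArchetypeId(3), TableId(2), 1);

    // One hash lookup to find the record for the component.
    let record = index.records.get(&position).expect("component is registered");
    assert_eq!(record.archetypes, vec![ArchetypeId(3)]);

    // Each table switch afterwards is only an index operation.
    assert_eq!(record.column_in(TableId(2)), Some(1));
    assert_eq!(record.column_in(TableId(0)), None);
}
```

The point of the sketch is only the split between the two costs: the hash lookup is paid once per component, and each table switch afterwards is a bounds-checked index.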

Future work

We can do better than this. If we add some form of WorldQuery::upgrade_state that triggers when any archetype matches the query, then we can skip all component maps in init_fetch.

We should also look into caching the location of the ComponentRecord in ComponentInfo. Right now, to get an arbitrary component without a query, we have to map the type id to a component id, map the table id to a column id, and then look up the entity in the column. We could instead map the type id to both the component id and the component record id. Then we'd just be indexing twice instead of hash mapping once. That said, a World::get like this is uncommon, and a better solution may be to force users to make a query, which will be faster anyway unless they really only ever want one value. But that's blocked on #13358 (which is closed, but the closing PR was later reverted, IIRC). Bottom line: we need assets as entities, and I believe #18540, before we can safely turn &mut World there into &World without losing functionality.
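One possible reading of that caching idea, sketched with hypothetical names only (none of these fields or methods are claimed to exist in Bevy): the per-type info carries the position of its record, so after the unavoidable `TypeId` lookup the rest is plain indexing.

```rust
use std::any::TypeId;
use std::collections::HashMap;

// All names below are hypothetical and exist only for this sketch.
struct ComponentInfo {
    // Assumed cached position of this component's record in `World::records`.
    record_index: usize,
}

struct ComponentRecord {
    // Indexed by table id; `None` when the table lacks the component.
    table_columns: Vec<Option<usize>>,
}

struct World {
    infos: HashMap<TypeId, ComponentInfo>,
    records: Vec<ComponentRecord>,
}

impl World {
    /// One hash lookup (by `TypeId`), then two plain index operations.
    fn column_of<T: 'static>(&self, table_id: usize) -> Option<usize> {
        let info = self.infos.get(&TypeId::of::<T>())?;
        let record = self.records.get(info.record_index)?;
        record.table_columns.get(table_id).copied().flatten()
    }
}

struct Position;

fn main() {
    let mut infos = HashMap::new();
    infos.insert(TypeId::of::<Position>(), ComponentInfo { record_index: 0 });
    let world = World {
        infos,
        records: vec![ComponentRecord { table_columns: vec![None, Some(2)] }],
    };

    assert_eq!(world.column_of::<Position>(1), Some(2)); // table 1 stores it in column 2
    assert_eq!(world.column_of::<Position>(0), None); // table 0 lacks the component
    assert_eq!(world.column_of::<u32>(1), None); // unregistered type
}
```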

Testing

CI

Performance

Benchmarks
group                                                                                                    main                                     pr19096
-----                                                                                                    ----                                     -------
added_archetypes/archetype_count/1000                                                                    1.11   658.2±84.88µs        ? ?/sec      1.00  592.7±124.61µs        ? ?/sec
added_archetypes/archetype_count/500                                                                     1.37  372.6±108.77µs        ? ?/sec      1.00  272.9±138.86µs        ? ?/sec
all_added_detection/50000_entities_ecs::change_detection::Sparse                                         1.00     39.7±1.09µs        ? ?/sec      1.13     44.8±0.58µs        ? ?/sec
all_added_detection/50000_entities_ecs::change_detection::Table                                          1.50     44.5±0.45µs        ? ?/sec      1.00     29.8±0.28µs        ? ?/sec
all_added_detection/5000_entities_ecs::change_detection::Sparse                                          1.00      4.2±0.14µs        ? ?/sec      1.07      4.5±0.04µs        ? ?/sec
all_added_detection/5000_entities_ecs::change_detection::Table                                           1.48      4.5±0.05µs        ? ?/sec      1.00      3.0±0.03µs        ? ?/sec
all_changed_detection/50000_entities_ecs::change_detection::Table                                        1.50     44.5±0.45µs        ? ?/sec      1.00     29.7±0.36µs        ? ?/sec
all_changed_detection/5000_entities_ecs::change_detection::Table                                         1.48      4.5±0.05µs        ? ?/sec      1.00      3.0±0.12µs        ? ?/sec
build_schedule/100_schedule_no_constraints                                                               1.00   650.9±36.78µs        ? ?/sec      1.34   872.5±26.55µs        ? ?/sec
ecs::entity_cloning::single/clone                                                                        1.00   619.7±97.98ns 1575.9 KElem/sec    1.05   651.0±74.68ns 1500.1 KElem/sec
empty_archetypes/iter/2000                                                                               1.00      8.9±0.94µs        ? ?/sec      1.05      9.4±0.30µs        ? ?/sec
empty_archetypes/par_for_each/10                                                                         1.00      8.1±0.94µs        ? ?/sec      1.07      8.7±0.26µs        ? ?/sec
empty_archetypes/par_for_each/10000                                                                      1.00     21.1±1.26µs        ? ?/sec      1.13     23.7±0.37µs        ? ?/sec
empty_archetypes/par_for_each/2000                                                                       1.00     11.3±1.11µs        ? ?/sec      1.08     12.3±0.36µs        ? ?/sec
empty_commands/0_entities                                                                                1.08      3.8±0.02ns        ? ?/sec      1.00      3.5±0.01ns        ? ?/sec
entity_hash/entity_set_lookup_miss_gen/10000                                                             1.00     67.9±4.93µs 140.4 MElem/sec     1.10     74.9±4.08µs 127.3 MElem/sec
entity_hash/entity_set_lookup_miss_id/10000                                                              1.00     35.6±2.78µs 267.9 MElem/sec     1.11     39.7±5.60µs 240.5 MElem/sec
event_propagation/four_event_types                                                                       1.00    584.3±8.59µs        ? ?/sec      1.21    709.8±4.74µs        ? ?/sec
event_propagation/single_event_type                                                                      1.00   847.6±21.85µs        ? ?/sec      1.12   952.9±23.60µs        ? ?/sec
event_propagation/single_event_type_no_listeners                                                         1.00    247.0±2.22µs        ? ?/sec      2.33    574.9±1.82µs        ? ?/sec
events_send/size_16_events_100                                                                           1.00    123.2±3.55ns        ? ?/sec      1.30   160.3±22.16ns        ? ?/sec
events_send/size_4_events_100                                                                            1.00     81.3±0.99ns        ? ?/sec      2.05    166.6±0.78ns        ? ?/sec
events_send/size_4_events_1000                                                                           1.00    753.6±7.21ns        ? ?/sec      1.08    814.4±4.10ns        ? ?/sec
events_send/size_4_events_10000                                                                          1.00      7.7±0.11µs        ? ?/sec      1.08      8.3±0.02µs        ? ?/sec
events_send/size_4_events_50000                                                                          1.00     38.5±0.66µs        ? ?/sec      1.08     41.5±0.50µs        ? ?/sec
fake_commands/2000_commands                                                                              1.13     12.9±0.02µs        ? ?/sec      1.00     11.5±0.10µs        ? ?/sec
fake_commands/4000_commands                                                                              1.13     25.9±0.04µs        ? ?/sec      1.00     22.9±0.03µs        ? ?/sec
fake_commands/6000_commands                                                                              1.13     38.8±0.07µs        ? ?/sec      1.00     34.4±0.06µs        ? ?/sec
fake_commands/8000_commands                                                                              1.13     51.8±0.07µs        ? ?/sec      1.00     45.9±0.08µs        ? ?/sec
few_changed_detection/50000_entities_ecs::change_detection::Table                                        1.25     57.2±0.76µs        ? ?/sec      1.00     45.9±0.49µs        ? ?/sec
few_changed_detection/5000_entities_ecs::change_detection::Table                                         1.44      5.1±0.25µs        ? ?/sec      1.00      3.5±0.24µs        ? ?/sec
iter_fragmented/base                                                                                     1.16    347.6±7.12ns        ? ?/sec      1.00    299.4±5.37ns        ? ?/sec
iter_fragmented_sparse/base                                                                              1.00      6.6±0.05ns        ? ?/sec      1.13      7.4±0.25ns        ? ?/sec
iter_fragmented_sparse/foreach                                                                           1.00      5.7±0.05ns        ? ?/sec      1.42      8.2±0.28ns        ? ?/sec
iter_fragmented_sparse/foreach_wide                                                                      1.00     37.1±2.14ns        ? ?/sec      1.55     57.6±0.16ns        ? ?/sec
iter_fragmented_sparse/wide                                                                              1.00     40.2±0.34ns        ? ?/sec      1.71     68.9±0.67ns        ? ?/sec
iter_simple/base                                                                                         1.00      7.0±0.04µs        ? ?/sec      1.06      7.4±0.03µs        ? ?/sec
iter_simple/foreach_wide_sparse_set                                                                      1.00     79.8±0.45µs        ? ?/sec      1.05     84.1±0.22µs        ? ?/sec
iter_simple/system                                                                                       1.00      7.0±0.05µs        ? ?/sec      1.08      7.5±0.01µs        ? ?/sec
iter_simple/wide_sparse_set                                                                              1.00     76.9±0.41µs        ? ?/sec      1.12     86.0±0.20µs        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_10000_entities_ecs::change_detection::Table    1.57   614.2±14.55µs        ? ?/sec      1.00   391.5±44.78µs        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_1000_entities_ecs::change_detection::Table     1.59     63.9±5.28µs        ? ?/sec      1.00     40.1±1.75µs        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_100_entities_ecs::change_detection::Sparse     1.12      8.7±0.54µs        ? ?/sec      1.00      7.7±0.14µs        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_100_entities_ecs::change_detection::Table      1.58      7.4±0.21µs        ? ?/sec      1.00      4.7±0.37µs        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_10_entities_ecs::change_detection::Sparse      1.10  1042.4±25.56ns        ? ?/sec      1.00   947.6±16.73ns        ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_10_entities_ecs::change_detection::Table       1.40   824.8±10.56ns        ? ?/sec      1.00   587.3±21.08ns        ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_10000_entities_ecs::change_detection::Table     1.61    120.3±1.84µs        ? ?/sec      1.00     74.6±5.21µs        ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_1000_entities_ecs::change_detection::Table      1.58     12.3±0.29µs        ? ?/sec      1.00      7.8±0.15µs        ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_100_entities_ecs::change_detection::Table       1.69  1418.0±15.57ns        ? ?/sec      1.00   837.4±16.90ns        ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_10_entities_ecs::change_detection::Table        1.33    168.8±8.94ns        ? ?/sec      1.00   127.0±10.56ns        ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_10000_entities_ecs::change_detection::Table      1.62     29.8±0.30µs        ? ?/sec      1.00     18.5±0.51µs        ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_1000_entities_ecs::change_detection::Table       1.58      3.0±0.04µs        ? ?/sec      1.00  1924.2±37.07ns        ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_100_entities_ecs::change_detection::Table        1.62    358.2±7.57ns        ? ?/sec      1.00   221.7±11.89ns        ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_10_entities_ecs::change_detection::Table         1.10     47.3±2.22ns        ? ?/sec      1.00     42.9±4.64ns        ? ?/sec
none_changed_detection/50000_entities_ecs::change_detection::Table                                       1.62     29.7±0.39µs        ? ?/sec      1.00     18.3±0.89µs        ? ?/sec
none_changed_detection/5000_entities_ecs::change_detection::Table                                        1.60      3.0±0.03µs        ? ?/sec      1.00  1867.6±40.15ns        ? ?/sec
observe/trigger_simple                                                                                   1.10    429.0±4.98µs        ? ?/sec      1.00    390.7±6.81µs        ? ?/sec
query_get/50000_entities_table                                                                           1.00    198.6±1.24µs        ? ?/sec      1.32    262.6±0.68µs        ? ?/sec
query_get_many_10/50000_calls_table                                                                      1.00  1405.4±58.51µs        ? ?/sec      1.58      2.2±0.09ms        ? ?/sec
query_get_many_2/50000_calls_table                                                                       1.00    240.7±0.84µs        ? ?/sec      1.83    440.7±2.60µs        ? ?/sec
query_get_many_5/50000_calls_table                                                                       1.00    626.3±4.86µs        ? ?/sec      1.69   1059.7±6.69µs        ? ?/sec
run_condition/yes/031_systems                                                                            1.00     26.3±0.76µs        ? ?/sec      1.07     28.1±1.55µs        ? ?/sec
run_condition/yes_using_resource/071_systems                                                             1.00     63.6±1.12µs        ? ?/sec      1.05     66.9±2.05µs        ? ?/sec
sized_commands_0_bytes/2000_commands                                                                     1.06     10.8±0.08µs        ? ?/sec      1.00     10.2±0.02µs        ? ?/sec
sized_commands_0_bytes/4000_commands                                                                     1.06     21.6±0.03µs        ? ?/sec      1.00     20.4±0.04µs        ? ?/sec
sized_commands_0_bytes/8000_commands                                                                     1.06     43.3±0.19µs        ? ?/sec      1.00     40.9±0.10µs        ? ?/sec
sized_commands_12_bytes/2000_commands                                                                    1.05     11.7±0.02µs        ? ?/sec      1.00     11.1±0.02µs        ? ?/sec
sized_commands_12_bytes/4000_commands                                                                    1.05     23.3±0.04µs        ? ?/sec      1.00     22.2±0.06µs        ? ?/sec
sized_commands_12_bytes/6000_commands                                                                    1.05     35.2±0.18µs        ? ?/sec      1.00     33.5±0.09µs        ? ?/sec
spawn_commands/6000_entities                                                                             1.00   427.3±37.60µs        ? ?/sec      1.05   449.1±30.62µs        ? ?/sec
world_get/50000_entities_table                                                                           1.00    164.9±1.82µs        ? ?/sec      1.06    175.3±0.73µs        ? ?/sec
world_query_get/50000_entities_table                                                                     1.00    125.0±0.21µs        ? ?/sec      1.73    216.0±1.49µs        ? ?/sec
world_query_get/50000_entities_table_wide                                                                1.00    124.9±0.31µs        ? ?/sec      5.70    711.6±2.87µs        ? ?/sec

Performance is not great.

The only place that is significantly faster than main is change detection, which is only sped up because the column is cached. In other words, the improvement could be trivially replicated without these changes.

In general, performance is slightly worse across the board, and much worse in a few places, especially sparse table queries and Query::get. I want to drive home again that this is just adding an indexing op that caches a sparse set lookup. I can only imagine what would happen if the sparse set of component ids became a hash map.

@ElliottjPierce added the A-ECS, C-Performance, and S-Needs-Review labels on May 6, 2025

ElliottjPierce commented May 6, 2025

Note also that #19063 is generally slightly faster across the board, but not by a lot in most places.

Also, does this need a migration guide? It removes the public ArchetypeRecord, but that struct had no public interfaces at all, so nobody could possibly be using it. It also makes ComponentIndex private, but the same goes there. The only way it could break is if a user had it as an unused import.
