-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Make entity::index
non max
#18704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make entity::index
non max
#18704
Conversation
@@ -32,6 +32,7 @@ bevy_platform_support = { path = "../crates/bevy_platform_support", default-feat | |||
glam = "0.29" | |||
rand = "0.8" | |||
rand_chacha = "0.3" | |||
nonmax = { version = "0.5", default-features = false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there value in using NonMax over NonZero (I can see the value of banning u32::MAX
from being a valid Entity
but that doesn't require that the missing value be there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do u32
, we just need to document that u32::MAX
is invalid. We can go that route, but that means there won't be a niche in things like TableRow
or anything. (Or at least, it would be more work to convert types, etc.)
If we do u32::NonZero
, we loose both index 0
and u32::MAX
. We still need careful documentation, and we need to make sure we put a dummy value at index 0 anywhere we are using it as an index.
If we do u32::NonZero
but map it to a non max somehow, then we're effectively just reinventing NonMax
.
So out of those options I went with NonMax
here. But that's not to say we can't do otherwise. If you have strong opinions otherwise, feel free to make your case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only called it out since the PR focused solely on wanting a niche but didn't go into why this value was chosen.
IMHO I would lean towards banning 0 as a dummy (many similar setups just use 0 as explicitly none) and either fixing the places that can't handle Max or banning it. I only say that because I think the code here to use NonMax efficiently isn't worth the trade off of preserving 0 that isn't really used.
However I don't think this set of changes is bad and so wouldn't be against this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. We just need to put a niche in the EntityRow
, marking one value as invalid. Then the total number of ids possible is within u32::MAX. That's the goal.
So we could make 0 the niche, create dummy values for every relevant list, communicate that to users, and map 0..u32::MAX onto 1..=u32::MAX in the allocator. That would work. Pro: no op to access the index. Cons: need dummy values + extra allocation + extra math in entity allocator + invalidates some serialized data.
This pr makes the max the niche, keeping everything but the bit layout exactly the same. Pro: very simple. Cons: binary ! to access + invalidates a lot of serialized data.
It's just a trade off. If the consensus is different, I'm 100% fine with that. But at the end of the day, we just need to pick one.
It looks like your PR is a breaking change, but you didn't provide a migration guide. Please review the instructions for writing migration guides, then expand or revise the content in the migration guides directory to reflect your changes. |
Benchmarks are out. Only real perf difference is the bit not in nonmax. We can make that 0 cost once a version of this lands in core. Here's the benches: Benchmarks that differ at allgroup main pr18704
----- ---- -------
add_remove/sparse_set 1.00 557.8±7.00µs ? ?/sec 1.03 571.8±8.91µs ? ?/sec
add_remove/table 1.09 803.9±15.16µs ? ?/sec 1.00 736.7±30.29µs ? ?/sec
add_remove_very_big/table 1.01 34.7±0.57ms ? ?/sec 1.00 34.3±0.61ms ? ?/sec
added_archetypes/archetype_count/1000 1.16 609.6±104.57µs ? ?/sec 1.00 525.2±148.70µs ? ?/sec
added_archetypes/archetype_count/10000 1.00 5.2±0.53ms ? ?/sec 1.10 5.7±0.32ms ? ?/sec
all_added_detection/50000_entities_ecs::change_detection::Sparse 1.00 39.1±0.55µs ? ?/sec 1.03 40.2±0.70µs ? ?/sec
all_added_detection/5000_entities_ecs::change_detection::Table 1.01 4.5±0.03µs ? ?/sec 1.00 4.5±0.07µs ? ?/sec
all_changed_detection/50000_entities_ecs::change_detection::Sparse 1.00 38.0±0.55µs ? ?/sec 1.02 38.6±0.83µs ? ?/sec
all_changed_detection/5000_entities_ecs::change_detection::Sparse 1.00 4.0±0.06µs ? ?/sec 1.03 4.1±0.07µs ? ?/sec
build_schedule/1000_schedule 1.06 1011.4±17.59ms ? ?/sec 1.00 958.1±16.56ms ? ?/sec
build_schedule/100_schedule 1.09 6.3±0.16ms ? ?/sec 1.00 5.8±0.18ms ? ?/sec
build_schedule/100_schedule_no_constraints 1.07 822.0±41.62µs ? ?/sec 1.00 765.5±36.96µs ? ?/sec
build_schedule/500_schedule 1.06 178.7±2.42ms ? ?/sec 1.00 168.7±2.45ms ? ?/sec
build_schedule/500_schedule_no_constraints 1.03 6.6±0.35ms ? ?/sec 1.00 6.4±0.27ms ? ?/sec
contrived/01x_entities_03_systems 1.01 17.4±1.05µs ? ?/sec 1.00 17.2±1.07µs ? ?/sec
contrived/05x_entities_15_systems 1.01 124.2±3.87µs ? ?/sec 1.00 122.9±4.48µs ? ?/sec
despawn_world/1_entities 1.04 173.6±16.12ns ? ?/sec 1.00 166.4±6.77ns ? ?/sec
despawn_world_recursive/10000_entities 1.00 1657.5±39.70µs ? ?/sec 1.01 1676.8±29.70µs ? ?/sec
ecs::entity_cloning::hierarchy_many/clone 1.00 223.0±21.79µs 1594.0 KElem/sec 1.06 236.7±22.48µs 1501.8 KElem/sec
ecs::entity_cloning::hierarchy_tall/reflect 1.02 15.7±1.89µs 3.1 MElem/sec 1.00 15.3±1.28µs 3.2 MElem/sec
ecs::entity_cloning::hierarchy_wide/reflect 1.03 13.2±1.65µs 3.7 MElem/sec 1.00 12.8±0.72µs 3.8 MElem/sec
ecs::entity_cloning::single/clone 1.05 657.4±80.73ns 1485.5 KElem/sec 1.00 628.3±122.54ns 1554.2 KElem/sec
ecs::entity_cloning::single/reflect 1.00 1105.0±64.39ns 883.8 KElem/sec 1.03 1140.6±59.99ns 856.2 KElem/sec
empty_archetypes/for_each/10 1.00 7.6±0.26µs ? ?/sec 1.02 7.8±0.54µs ? ?/sec
empty_archetypes/for_each/100 1.00 7.8±0.22µs ? ?/sec 1.03 8.1±0.73µs ? ?/sec
empty_archetypes/iter/10 1.00 7.6±0.27µs ? ?/sec 1.01 7.7±0.27µs ? ?/sec
empty_archetypes/iter/100 1.02 7.8±0.33µs ? ?/sec 1.00 7.7±0.22µs ? ?/sec
empty_archetypes/iter/1000 1.00 8.2±0.26µs ? ?/sec 1.02 8.4±0.24µs ? ?/sec
empty_archetypes/iter/10000 1.00 9.4±0.41µs ? ?/sec 1.04 9.7±0.67µs ? ?/sec
empty_archetypes/par_for_each/100 1.00 8.2±0.29µs ? ?/sec 1.02 8.4±0.32µs ? ?/sec
empty_commands/0_entities 1.07 3.9±0.06ns ? ?/sec 1.00 3.6±0.06ns ? ?/sec
empty_systems/1000_systems 1.00 1270.0±28.31µs ? ?/sec 1.01 1287.4±40.19µs ? ?/sec
empty_systems/100_systems 1.02 101.3±3.33µs ? ?/sec 1.00 99.6±2.77µs ? ?/sec
empty_systems/10_systems 1.02 10.0±0.51µs ? ?/sec 1.00 9.8±0.63µs ? ?/sec
empty_systems/4_systems 1.00 8.3±0.52µs ? ?/sec 1.02 8.5±0.45µs ? ?/sec
entity_hash/entity_set_build/1000 1.03 3.9±0.11µs 245.7 MElem/sec 1.00 3.8±0.10µs 253.0 MElem/sec
entity_hash/entity_set_build/10000 1.00 45.3±2.10µs 210.7 MElem/sec 1.04 47.0±4.50µs 203.1 MElem/sec
entity_hash/entity_set_build/3162 1.02 12.9±0.51µs 233.4 MElem/sec 1.00 12.7±0.61µs 237.6 MElem/sec
entity_hash/entity_set_lookup_hit/1000 1.00 1272.9±19.85ns 749.2 MElem/sec 1.06 1353.8±20.44ns 704.5 MElem/sec
entity_hash/entity_set_lookup_hit/10000 1.00 15.7±0.71µs 609.2 MElem/sec 1.13 17.6±0.76µs 541.1 MElem/sec
entity_hash/entity_set_lookup_hit/3162 1.00 4.5±0.09µs 677.1 MElem/sec 1.05 4.7±0.07µs 643.0 MElem/sec
entity_hash/entity_set_lookup_miss_gen/10000 1.00 66.2±5.52µs 144.1 MElem/sec 1.04 68.8±6.06µs 138.6 MElem/sec
entity_hash/entity_set_lookup_miss_gen/316 1.00 682.2±9.25ns 441.8 MElem/sec 1.03 699.7±12.03ns 430.7 MElem/sec
entity_hash/entity_set_lookup_miss_gen/3162 1.27 13.5±5.21µs 222.7 MElem/sec 1.00 10.7±1.97µs 282.6 MElem/sec
entity_hash/entity_set_lookup_miss_id/100 1.00 150.4±1.37ns 634.2 MElem/sec 1.13 170.4±1.90ns 559.8 MElem/sec
entity_hash/entity_set_lookup_miss_id/1000 1.00 1369.9±16.37ns 696.1 MElem/sec 1.13 1548.1±26.61ns 616.0 MElem/sec
entity_hash/entity_set_lookup_miss_id/10000 1.00 41.1±5.32µs 231.8 MElem/sec 1.05 43.1±6.57µs 221.2 MElem/sec
entity_hash/entity_set_lookup_miss_id/316 1.00 575.9±7.21ns 523.3 MElem/sec 1.10 632.6±13.41ns 476.4 MElem/sec
entity_hash/entity_set_lookup_miss_id/3162 1.00 9.1±1.41µs 330.8 MElem/sec 1.03 9.4±0.67µs 322.1 MElem/sec
event_propagation/four_event_types 1.00 580.2±8.61µs ? ?/sec 1.20 699.0±7.06µs ? ?/sec
event_propagation/single_event_type 1.00 815.4±24.29µs ? ?/sec 1.12 915.7±17.27µs ? ?/sec
event_propagation/single_event_type_no_listeners 1.00 254.9±7.32µs ? ?/sec 1.10 280.6±4.82µs ? ?/sec
events_send/size_16_events_100 1.04 141.3±25.57ns ? ?/sec 1.00 135.9±24.16ns ? ?/sec
fake_commands/10000_commands 1.10 66.7±1.16µs ? ?/sec 1.00 60.8±1.06µs ? ?/sec
fake_commands/1000_commands 1.10 6.7±0.09µs ? ?/sec 1.00 6.1±0.09µs ? ?/sec
fake_commands/100_commands 1.09 679.4±8.37ns ? ?/sec 1.00 620.9±18.31ns ? ?/sec
few_changed_detection/50000_entities_ecs::change_detection::Sparse 1.00 66.4±2.63µs ? ?/sec 1.19 79.2±1.18µs ? ?/sec
few_changed_detection/50000_entities_ecs::change_detection::Table 1.00 54.7±0.89µs ? ?/sec 1.03 56.5±7.83µs ? ?/sec
few_changed_detection/5000_entities_ecs::change_detection::Sparse 1.00 4.0±0.13µs ? ?/sec 1.33 5.3±0.56µs ? ?/sec
heavy_compute/base 1.00 381.1±14.77µs ? ?/sec 1.01 385.0±12.87µs ? ?/sec
insert_commands/insert 1.06 411.5±34.43µs ? ?/sec 1.00 386.4±13.09µs ? ?/sec
insert_simple/base 1.00 560.3±31.24µs ? ?/sec 1.01 568.2±20.86µs ? ?/sec
insert_simple/unbatched 1.11 1093.4±91.67µs ? ?/sec 1.00 985.8±201.49µs ? ?/sec
iter_fragmented/wide 1.00 2.8±0.06µs ? ?/sec 1.01 2.8±0.05µs ? ?/sec
iter_fragmented_sparse/foreach 1.04 6.0±1.55ns ? ?/sec 1.00 5.8±0.09ns ? ?/sec
iter_fragmented_sparse/foreach_wide 1.01 36.7±1.15ns ? ?/sec 1.00 36.3±0.40ns ? ?/sec
iter_simple/foreach_wide 1.01 17.0±0.21µs ? ?/sec 1.00 16.8±0.13µs ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_1000_entities_ecs::change_detection::Sparse 1.00 74.4±3.84µs ? ?/sec 1.03 76.3±2.79µs ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_100_entities_ecs::change_detection::Sparse 1.02 8.8±0.83µs ? ?/sec 1.00 8.6±0.40µs ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_100_entities_ecs::change_detection::Table 1.03 7.6±0.25µs ? ?/sec 1.00 7.4±0.22µs ? ?/sec
multiple_archetypes_none_changed_detection/100_archetypes_10_entities_ecs::change_detection::Table 1.00 816.2±19.38ns ? ?/sec 1.01 826.6±16.56ns ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_100_entities_ecs::change_detection::Sparse 1.00 1652.9±44.05ns ? ?/sec 1.06 1754.1±54.74ns ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_100_entities_ecs::change_detection::Table 1.00 1416.1±20.68ns ? ?/sec 1.06 1498.7±85.63ns ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_10_entities_ecs::change_detection::Sparse 1.00 189.6±16.89ns ? ?/sec 1.03 195.8±27.17ns ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_10_entities_ecs::change_detection::Table 1.00 169.4±8.23ns ? ?/sec 1.02 173.2±6.77ns ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_100_entities_ecs::change_detection::Sparse 1.00 364.0±35.76ns ? ?/sec 1.03 373.8±31.91ns ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_100_entities_ecs::change_detection::Table 1.00 359.3±8.02ns ? ?/sec 1.01 363.0±37.64ns ? ?/sec
multiple_archetypes_none_changed_detection/5_archetypes_10_entities_ecs::change_detection::Table 1.00 46.8±1.84ns ? ?/sec 1.05 49.0±2.47ns ? ?/sec
no_archetypes/system_count/10 1.00 105.6±1.34ns ? ?/sec 1.03 108.9±1.11ns ? ?/sec
observe/trigger_simple 1.00 349.3±6.29µs ? ?/sec 1.05 368.2±5.56µs ? ?/sec
observe/trigger_targets_simple/10000_entity 1.09 835.4±105.89µs ? ?/sec 1.00 767.5±38.11µs ? ?/sec
query_get/50000_entities_sparse 1.02 145.1±2.04µs ? ?/sec 1.00 141.9±1.43µs ? ?/sec
query_get/50000_entities_table 1.48 200.9±2.66µs ? ?/sec 1.00 135.8±1.96µs ? ?/sec
query_get_many_10/50000_calls_sparse 1.00 1766.7±100.45µs ? ?/sec 1.09 1920.5±113.28µs ? ?/sec
query_get_many_10/50000_calls_table 1.00 1468.8±126.40µs ? ?/sec 1.11 1633.9±76.52µs ? ?/sec
query_get_many_2/50000_calls_sparse 1.00 239.4±5.82µs ? ?/sec 1.37 326.8±3.97µs ? ?/sec
query_get_many_2/50000_calls_table 1.00 242.4±6.75µs ? ?/sec 1.26 305.5±3.88µs ? ?/sec
query_get_many_5/50000_calls_sparse 1.00 663.7±54.86µs ? ?/sec 1.03 684.4±15.83µs ? ?/sec
run_condition/no/1000_systems 1.06 26.7±0.61µs ? ?/sec 1.00 25.2±0.52µs ? ?/sec
run_condition/no/100_systems 1.02 1661.5±41.13ns ? ?/sec 1.00 1636.7±40.61ns ? ?/sec
run_condition/yes/1000_systems 1.00 1229.8±26.25µs ? ?/sec 1.02 1257.4±23.79µs ? ?/sec
run_condition/yes/10_systems 1.03 10.2±0.60µs ? ?/sec 1.00 9.9±0.58µs ? ?/sec
run_condition/yes_using_query/10_systems 1.18 12.0±1.04µs ? ?/sec 1.00 10.2±0.70µs ? ?/sec
run_condition/yes_using_resource/100_systems 1.01 106.1±4.06µs ? ?/sec 1.00 104.9±3.33µs ? ?/sec
sized_commands_0_bytes/10000_commands 1.12 57.5±0.85µs ? ?/sec 1.00 51.5±1.03µs ? ?/sec
sized_commands_0_bytes/1000_commands 1.11 5.8±0.08µs ? ?/sec 1.00 5.2±0.07µs ? ?/sec
sized_commands_0_bytes/100_commands 1.11 583.4±7.94ns ? ?/sec 1.00 525.7±7.64ns ? ?/sec
sized_commands_12_bytes/10000_commands 1.10 68.1±4.66µs ? ?/sec 1.00 62.1±0.78µs ? ?/sec
sized_commands_12_bytes/1000_commands 1.11 6.2±0.12µs ? ?/sec 1.00 5.6±0.10µs ? ?/sec
sized_commands_12_bytes/100_commands 1.10 622.1±9.15ns ? ?/sec 1.00 567.9±9.35ns ? ?/sec
sized_commands_512_bytes/100_commands 1.03 1935.7±666.89ns ? ?/sec 1.00 1873.4±23.88ns ? ?/sec
spawn_commands/10000_entities 1.00 696.6±28.39µs ? ?/sec 1.05 729.5±23.76µs ? ?/sec
spawn_commands/1000_entities 1.00 70.3±8.53µs ? ?/sec 1.07 75.4±17.16µs ? ?/sec
spawn_commands/100_entities 1.00 7.0±0.20µs ? ?/sec 1.08 7.5±1.25µs ? ?/sec
spawn_world/10000_entities 1.02 413.8±57.76µs ? ?/sec 1.00 407.0±32.04µs ? ?/sec
world_entity/50000_entities 1.19 93.0±1.53µs ? ?/sec 1.00 78.0±1.01µs ? ?/sec
world_get/50000_entities_sparse 1.11 163.9±1.87µs ? ?/sec 1.00 147.9±1.44µs ? ?/sec
world_get/50000_entities_table 1.00 164.9±1.83µs ? ?/sec 1.08 177.6±2.08µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.00 38.9±0.43µs ? ?/sec 1.05 40.7±0.44µs ? ?/sec
world_query_iter/50000_entities_sparse 1.00 38.9±0.48µs ? ?/sec 1.05 40.7±0.33µs ? ?/sec Benchmarks that differ at by 5 %group main pr18704
----- ---- -------
add_remove/table 1.09 803.9±15.16µs ? ?/sec 1.00 736.7±30.29µs ? ?/sec
added_archetypes/archetype_count/1000 1.16 609.6±104.57µs ? ?/sec 1.00 525.2±148.70µs ? ?/sec
added_archetypes/archetype_count/10000 1.00 5.2±0.53ms ? ?/sec 1.10 5.7±0.32ms ? ?/sec
build_schedule/1000_schedule 1.06 1011.4±17.59ms ? ?/sec 1.00 958.1±16.56ms ? ?/sec
build_schedule/100_schedule 1.09 6.3±0.16ms ? ?/sec 1.00 5.8±0.18ms ? ?/sec
build_schedule/100_schedule_no_constraints 1.07 822.0±41.62µs ? ?/sec 1.00 765.5±36.96µs ? ?/sec
build_schedule/500_schedule 1.06 178.7±2.42ms ? ?/sec 1.00 168.7±2.45ms ? ?/sec
ecs::entity_cloning::hierarchy_many/clone 1.00 223.0±21.79µs 1594.0 KElem/sec 1.06 236.7±22.48µs 1501.8 KElem/sec
empty_commands/0_entities 1.07 3.9±0.06ns ? ?/sec 1.00 3.6±0.06ns ? ?/sec
entity_hash/entity_set_lookup_hit/1000 1.00 1272.9±19.85ns 749.2 MElem/sec 1.06 1353.8±20.44ns 704.5 MElem/sec
entity_hash/entity_set_lookup_hit/10000 1.00 15.7±0.71µs 609.2 MElem/sec 1.13 17.6±0.76µs 541.1 MElem/sec
entity_hash/entity_set_lookup_hit/3162 1.00 4.5±0.09µs 677.1 MElem/sec 1.05 4.7±0.07µs 643.0 MElem/sec
entity_hash/entity_set_lookup_miss_gen/3162 1.27 13.5±5.21µs 222.7 MElem/sec 1.00 10.7±1.97µs 282.6 MElem/sec
entity_hash/entity_set_lookup_miss_id/100 1.00 150.4±1.37ns 634.2 MElem/sec 1.13 170.4±1.90ns 559.8 MElem/sec
entity_hash/entity_set_lookup_miss_id/1000 1.00 1369.9±16.37ns 696.1 MElem/sec 1.13 1548.1±26.61ns 616.0 MElem/sec
entity_hash/entity_set_lookup_miss_id/316 1.00 575.9±7.21ns 523.3 MElem/sec 1.10 632.6±13.41ns 476.4 MElem/sec
event_propagation/four_event_types 1.00 580.2±8.61µs ? ?/sec 1.20 699.0±7.06µs ? ?/sec
event_propagation/single_event_type 1.00 815.4±24.29µs ? ?/sec 1.12 915.7±17.27µs ? ?/sec
event_propagation/single_event_type_no_listeners 1.00 254.9±7.32µs ? ?/sec 1.10 280.6±4.82µs ? ?/sec
fake_commands/10000_commands 1.10 66.7±1.16µs ? ?/sec 1.00 60.8±1.06µs ? ?/sec
fake_commands/1000_commands 1.10 6.7±0.09µs ? ?/sec 1.00 6.1±0.09µs ? ?/sec
fake_commands/100_commands 1.09 679.4±8.37ns ? ?/sec 1.00 620.9±18.31ns ? ?/sec
few_changed_detection/50000_entities_ecs::change_detection::Sparse 1.00 66.4±2.63µs ? ?/sec 1.19 79.2±1.18µs ? ?/sec
few_changed_detection/5000_entities_ecs::change_detection::Sparse 1.00 4.0±0.13µs ? ?/sec 1.33 5.3±0.56µs ? ?/sec
insert_commands/insert 1.06 411.5±34.43µs ? ?/sec 1.00 386.4±13.09µs ? ?/sec
insert_simple/unbatched 1.11 1093.4±91.67µs ? ?/sec 1.00 985.8±201.49µs ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_100_entities_ecs::change_detection::Sparse 1.00 1652.9±44.05ns ? ?/sec 1.06 1754.1±54.74ns ? ?/sec
multiple_archetypes_none_changed_detection/20_archetypes_100_entities_ecs::change_detection::Table 1.00 1416.1±20.68ns ? ?/sec 1.06 1498.7±85.63ns ? ?/sec
observe/trigger_simple 1.00 349.3±6.29µs ? ?/sec 1.05 368.2±5.56µs ? ?/sec
observe/trigger_targets_simple/10000_entity 1.09 835.4±105.89µs ? ?/sec 1.00 767.5±38.11µs ? ?/sec
query_get/50000_entities_table 1.48 200.9±2.66µs ? ?/sec 1.00 135.8±1.96µs ? ?/sec
query_get_many_10/50000_calls_sparse 1.00 1766.7±100.45µs ? ?/sec 1.09 1920.5±113.28µs ? ?/sec
query_get_many_10/50000_calls_table 1.00 1468.8±126.40µs ? ?/sec 1.11 1633.9±76.52µs ? ?/sec
query_get_many_2/50000_calls_sparse 1.00 239.4±5.82µs ? ?/sec 1.37 326.8±3.97µs ? ?/sec
query_get_many_2/50000_calls_table 1.00 242.4±6.75µs ? ?/sec 1.26 305.5±3.88µs ? ?/sec
run_condition/no/1000_systems 1.06 26.7±0.61µs ? ?/sec 1.00 25.2±0.52µs ? ?/sec
run_condition/yes_using_query/10_systems 1.18 12.0±1.04µs ? ?/sec 1.00 10.2±0.70µs ? ?/sec
sized_commands_0_bytes/10000_commands 1.12 57.5±0.85µs ? ?/sec 1.00 51.5±1.03µs ? ?/sec
sized_commands_0_bytes/1000_commands 1.11 5.8±0.08µs ? ?/sec 1.00 5.2±0.07µs ? ?/sec
sized_commands_0_bytes/100_commands 1.11 583.4±7.94ns ? ?/sec 1.00 525.7±7.64ns ? ?/sec
sized_commands_12_bytes/10000_commands 1.10 68.1±4.66µs ? ?/sec 1.00 62.1±0.78µs ? ?/sec
sized_commands_12_bytes/1000_commands 1.11 6.2±0.12µs ? ?/sec 1.00 5.6±0.10µs ? ?/sec
sized_commands_12_bytes/100_commands 1.10 622.1±9.15ns ? ?/sec 1.00 567.9±9.35ns ? ?/sec
spawn_commands/1000_entities 1.00 70.3±8.53µs ? ?/sec 1.07 75.4±17.16µs ? ?/sec
spawn_commands/100_entities 1.00 7.0±0.20µs ? ?/sec 1.08 7.5±1.25µs ? ?/sec
world_entity/50000_entities 1.19 93.0±1.53µs ? ?/sec 1.00 78.0±1.01µs ? ?/sec
world_get/50000_entities_sparse 1.11 163.9±1.87µs ? ?/sec 1.00 147.9±1.44µs ? ?/sec
world_get/50000_entities_table 1.00 164.9±1.83µs ? ?/sec 1.08 177.6±2.08µs ? ?/sec Frankly, I think most of these results are just noise. This PR is faster than main by a lot in a few places, even though it just adds the bit not. So, I don't actually think there is any difference big enough to break through the noise that happens between runs. |
I agree with this assessment. The biggest difference i see is |
Does anyone else want to run the benches? I don't think it's a big deal, but maybe a different CPU would be insightful. But it shouldn't be a blocker IMO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor feedback. I haven't run benchmarks myself so I can't comment on possible performance degradation. My biggest concern is the change to Entity
ordering, which I think should be reverted via a custom implementation of PartialOrd
and Ord
on EntityRow
.
/// This can be used over [`Entity`] to improve performance in some cases, | ||
/// but improper use can cause this to identify a different entity than intended. | ||
/// Use with caution. | ||
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Display)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the derived implementations PartialOrd
and Ord
are the source of the inversion of Entity
ordering, I'd rather we custom-implement those traits so that we can preserve ordering. It'll be easier to address a possible performance regression due to that extra step than to diagnose and fix some degenerate behavioral change. It also reduces the number of diffs required for this PR, which is nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs more discussion I think.
Item | Keeping as is (which changes ordering) | Custom Ord impl (to keep ordering) |
---|---|---|
needs test changes | yes | no |
potential perf regression | no | yes (part of why this is so fast is the repr(u64). A custom impl might double this cost) |
continuity | Entities share sorting with their bit representation | Entity and bit ordering are different |
intuition | Sorted by generation, then more recent spawns | Sorted by generation, then older entities |
Personally, I think keeping as is (changing ordering since 0.16) is better, but this needs more thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, we could keep the current bit representation too, with more bit math, but IDK if that's worth it. It's still a breaking change since 0 becomes invalid.
assert!(collection.iter().eq(&[b, c, d])); | ||
assert!(collection.iter().eq(&[d, c, b])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above re: custom implementation of PartialOrd
. I see this kind of change causing some difficult to diagnose bug somewhere.
@@ -856,7 +857,9 @@ mod tests { | |||
let mut table = TableBuilder::with_capacity(0, columns.len()) | |||
.add_column(components.get_info(component_id).unwrap()) | |||
.build(); | |||
let entities = (0..200).map(Entity::from_raw).collect::<Vec<_>>(); | |||
let entities = (0..200) | |||
.map(|index| Entity::from_raw(EntityRow::new(NonMaxU32::new(index).unwrap()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.map(|index| Entity::from_raw(EntityRow::new(NonMaxU32::new(index).unwrap()))) | |
.filter_map(Entity::fresh_from_row) |
Since we know 0..200
will only produce valid Entity
's, filter_map
is equivalent to unwrapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but this "we know" is kinda the thing tests are meant to verify. I put unwraps everywhere in tests for that reason. Even though the test isn't strictly about that, and it's trivial to verify (as of now). I don't have a strong opinion though. If consensus is different, I'm happy to change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could meet in the middle and do:
let entities = (0..200)
.map(Entity::fresh_from_row)
.map(Option::unwrap)
.collect::<Vec<_>>();
That's more subjective tho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'm indifferent really. Whatever consensus is is fine with me.
Co-authored-by: Zachary Harrold <zac@harrold.com.au>
|
Co-authored-by: atlv <email@atlasdostal.com>
# Objective This is a followup to #18704 . There's lots more followup work, but this is the minimum to unblock #18670, etc. This direction has been given the green light by Alice [here](#18704 (comment)). ## Solution I could have split this over multiple PRs, but I figured skipping straight here would be easiest for everyone and would unblock things the quickest. This removes the now no longer needed `identifier` module and makes `Entity::generation` go from `NonZeroU32` to `struct EntityGeneration(u32)`. ## Testing CI --------- Co-authored-by: Mark Nokalt <marknokalt@live.com>
# Objective There are two problems this aims to solve. First, `Entity::index` is currently a `u32`. That means there are `u32::MAX + 1` possible entities. Not only is that awkward, but it also make `Entity` allocation more difficult. I discovered this while working on remote entity reservation, but even on main, `Entities` doesn't handle the `u32::MAX + 1` entity very well. It can not be batch reserved because that iterator uses exclusive ranges, which has a maximum upper bound of `u32::MAX - 1`. In other words, having `u32::MAX` as a valid index can be thought of as a bug right now. We either need to make that invalid (this PR), which makes Entity allocation cleaner and makes remote reservation easier (because the length only needs to be u32 instead of u64, which, in atomics is a big deal), or we need to take another pass at `Entities` to make it handle the `u32::MAX` index properly. Second, `TableRow`, `ArchetypeRow` and `EntityIndex` (a type alias for u32) all have `u32` as the underlying type. That means using these as the index type in a `SparseSet` uses 64 bits for the sparse list because it stores `Option<IndexType>`. By using `NonMaxU32` here, we cut the memory of that list in half. To my knowledge, `EntityIndex` is the only thing that would really benefit from this niche. `TableRow` and `ArchetypeRow` I think are not stored in an `Option` in bulk. But if they ever are, this would help. Additionally this ensures `TableRow::INVALID` and `ArchetypeRow::INVALID` never conflict with an actual row, which in a nice bonus. As a related note, if we do components as entities where `ComponentId` becomes `Entity`, the the `SparseSet<ComponentId>` will see a similar memory improvement too. ## Solution Create a new type `EntityRow` that wraps `NonMaxU32`, similar to `TableRow` and `ArchetypeRow`. Change `Entity::index` to this type. ## Downsides `NonMax` is implemented as a `NonZero` with a binary inversion. That means accessing and storing the value takes one more instruction. I don't think that's a big deal, but it's worth mentioning. As a consequence, `to_bits` uses `transmute` to skip the inversion which keeps it a nop. But that also means that ordering has now flipped. In other words, higher indices are considered less than lower indices. I don't think that's a problem, but it's also worth mentioning. ## Alternatives We could keep the index as a u32 type and just document that `u32::MAX` is invalid, modifying `Entities` to ensure it never gets handed out. (But that's not enforced by the type system.) We could still take advantage of the niche here in `ComponentSparseSet`. We'd just need some unsafe manual conversions, which is probably fine, but opens up the possibility for correctness problems later. We could change `Entities` to fully support the `u32::MAX` index. (But that makes `Entities` more complex and potentially slightly slower.) ## Testing - CI - A few tests were changed because they depend on different ordering and `to_bits` values. ## Future Work - It might be worth removing the niche on `Entity::generation` since there is now a different niche. - We could move `Entity::generation` into it's own type too for clarity. - We should change `ComponentSparseSet` to take advantage of the new niche. (This PR doesn't change that yet.) - Consider removing or updating `Identifier`. This is only used for `Entity`, so it might be worth combining since `Entity` is now more unique. --------- Co-authored-by: atlv <email@atlasdostal.com> Co-authored-by: Zachary Harrold <zac@harrold.com.au>
…9121) # Objective This is a followup to bevyengine#18704 . There's lots more followup work, but this is the minimum to unblock bevyengine#18670, etc. This direction has been given the green light by Alice [here](bevyengine#18704 (comment)). ## Solution I could have split this over multiple PRs, but I figured skipping straight here would be easiest for everyone and would unblock things the quickest. This removes the now no longer needed `identifier` module and makes `Entity::generation` go from `NonZeroU32` to `struct EntityGeneration(u32)`. ## Testing CI --------- Co-authored-by: Mark Nokalt <marknokalt@live.com>
# Objective Since #18704 is done, we can track the length of unique entity row collections with only a `u32` and identify an index within that collection with only a `NonMaxU32`. This leaves an opportunity for performance improvements. ## Solution - Use `EntityRow` in sparse sets. - Change table, entity, and query lengths to be `u32` instead of `usize`. - Keep `batching` module `usize` based since that is reused for events, which may exceed `u32::MAX`. - Change according `Range<usize>` to `Range<u32>`. This is more efficient and helps justify safety. - Change `ArchetypeRow` and `TableRow` to wrap `NonMaxU32` instead of `u32`. Justifying `NonMaxU32::new_unchecked` everywhere is predicated on this safety comment in `Entities::set`: "`location` must be valid for the entity at `index` or immediately made valid afterwards before handing control to unknown code." This ensures no entity is in two table rows for example. That fact is used to argue uniqueness of the entity rows in each table, archetype, sparse set, query, etc. So if there's no duplicates, and a maximum total entities of `u32::MAX` none of the corresponding row ids / indexes can exceed `NonMaxU32`. ## Testing CI --------- Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>
Objective
There are two problems this aims to solve.
First,
Entity::index
is currently au32
. That means there areu32::MAX + 1
possible entities. Not only is that awkward, but it also makeEntity
allocation more difficult. I discovered this while working on remote entity reservation, but even on main,Entities
doesn't handle theu32::MAX + 1
entity very well. It can not be batch reserved because that iterator uses exclusive ranges, which has a maximum upper bound ofu32::MAX - 1
. In other words, havingu32::MAX
as a valid index can be thought of as a bug right now. We either need to make that invalid (this PR), which makes Entity allocation cleaner and makes remote reservation easier (because the length only needs to be u32 instead of u64, which, in atomics is a big deal), or we need to take another pass atEntities
to make it handle theu32::MAX
index properly.Second,
TableRow
,ArchetypeRow
andEntityIndex
(a type alias for u32) all haveu32
as the underlying type. That means using these as the index type in aSparseSet
uses 64 bits for the sparse list because it storesOption<IndexType>
. By usingNonMaxU32
here, we cut the memory of that list in half. To my knowledge,EntityIndex
is the only thing that would really benefit from this niche.TableRow
andArchetypeRow
I think are not stored in anOption
in bulk. But if they ever are, this would help. Additionally this ensuresTableRow::INVALID
andArchetypeRow::INVALID
never conflict with an actual row, which in a nice bonus.As a related note, if we do components as entities where
ComponentId
becomesEntity
, the theSparseSet<ComponentId>
will see a similar memory improvement too.Solution
Create a new type
EntityRow
that wrapsNonMaxU32
, similar toTableRow
andArchetypeRow
.Change
Entity::index
to this type.Downsides
NonMax
is implemented as aNonZero
with a binary inversion. That means accessing and storing the value takes one more instruction. I don't think that's a big deal, but it's worth mentioning.As a consequence,
to_bits
usestransmute
to skip the inversion which keeps it a nop. But that also means that ordering has now flipped. In other words, higher indices are considered less than lower indices. I don't think that's a problem, but it's also worth mentioning.Alternatives
We could keep the index as a u32 type and just document that
u32::MAX
is invalid, modifyingEntities
to ensure it never gets handed out. (But that's not enforced by the type system.) We could still take advantage of the niche here inComponentSparseSet
. We'd just need some unsafe manual conversions, which is probably fine, but opens up the possibility for correctness problems later.We could change
Entities
to fully support theu32::MAX
index. (But that makesEntities
more complex and potentially slightly slower.)Testing
to_bits
values.Future Work
Entity::generation
since there is now a different niche.Entity::generation
into it's own type too for clarity.ComponentSparseSet
to take advantage of the new niche. (This PR doesn't change that yet.)Identifier
. This is only used forEntity
, so it might be worth combining sinceEntity
is now more unique.