
Remote entity reservation v9 #18670


Open
wants to merge 118 commits into
base: main

Conversation

Contributor

@ElliottjPierce ElliottjPierce commented Apr 1, 2025

fixes #18003

Objective

It's version 9 of the same objective lol. For assets as entities, we need to be able to reserve entities from any thread. Ideally, this can be done without depending on an async context, blocking, or waiting. Any of these compromises could hurt asset performance or break the completely non-blocking nature of the asset system.

As a bonus, this PR makes allocating entities need only &Entities instead of &mut. Entities::flush is now completely optional: none of the Entities methods depend on flushing at all, and there is protection against flushing an entity twice.

(If you're curious, v9 actually branched from v8. v8 was focused on #18577 (never flush entities), but this still includes flushing.)

There's a doc here that builds some background for this too. If you haven't been following this I'd highly recommend reading that before diving into the code.

Solution

Organizationally, I split the underlying EntityAllocator off from Entities. This makes the code easier to read and maintain now that it's more involved.

The basic problem is that we need to be able to allocate an entity from any thread at any time. We also need to be able to free an entity. So at the allocator level, that's 3 operations: free, alloc (for when you know free isn't being called), and remote_alloc (which can be called at any time). None of these can require mutable access.
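
For illustration, here is a minimal sketch of that three-operation surface. The trait and type names are mine, not the PR's exact API; the key point is that every method takes &self, with synchronization handled internally.

// Sketch only: `Entity` is a stand-in for bevy's real type, and
// `EntityAllocatorOps` is an illustrative name, not the PR's API.
pub struct Entity(pub u64);

pub trait EntityAllocatorOps {
    /// Allocate an entity. The caller guarantees no `free` runs concurrently,
    /// which lets this path skip the extra synchronization `remote_alloc` needs.
    fn alloc(&self) -> Entity;

    /// Allocate from any thread at any time, even while `free` is running.
    fn remote_alloc(&self) -> Entity;

    /// Return an entity to the free list. Callers guarantee `free` never races
    /// with `alloc` or another `free`. Note that every method takes `&self`.
    fn free(&self, entity: Entity);
}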

The biggest challenge is having a list of entities that are free and waiting to be re-used. The list needs to be fully functional without mutable access, needs to be resizable, and needs to be pinned in memory. I ended up using a strategy similar to SplitVec. That dependency requires std, and knowing the max capacity ahead of time allows a simpler design, so I wrote my own implementation here.
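
To give a feel for the SplitVec-style layout, here is a hedged sketch of the index math, assuming doubling chunk sizes (chunk k holds 2^k slots); the PR's actual chunk sizing may differ. Because the max capacity (a u32 index space) is known up front, a fixed array of at most 32 chunk pointers suffices, and a chunk never moves once allocated, so references into it stay valid.

// Map a flat index to (chunk number, offset within chunk, chunk capacity),
// assuming chunk k holds 2^k slots: chunk 0 covers [0, 1), chunk 1 covers
// [1, 3), chunk 2 covers [3, 7), and so on. Sizing is illustrative.
fn index_in_chunk(index: u32) -> (usize, u32, u32) {
    debug_assert!(index < u32::MAX); // keep `index + 1` from overflowing
    let chunk = (u32::BITS - (index + 1).leading_zeros() - 1) as usize;
    let chunk_start = (1u32 << chunk) - 1;
    (chunk, index - chunk_start, 1u32 << chunk)
}

fn main() {
    for i in [0u32, 1, 2, 3, 6, 7, 100] {
        let (chunk, offset, cap) = index_in_chunk(i);
        println!("index {i} -> chunk {chunk}, offset {offset}/{cap}");
    }
}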

Testing

No new tests right now. It might be worth using loom at some point, but that requires an additional dependency, test-specific loom feature flags, and giving this treatment to multiple crates, especially bevy_platform.
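
For the record, here is a sketch of the shape such a loom test could take, assuming a loom dev-dependency and a deliberately simplified pop over a bare atomic length (illustrative only, not part of this PR):

// Sketch: loom explores every interleaving of the two pops, so a duplicated
// slot claim would fail the assertion in some execution.
#[cfg(test)]
mod loom_tests {
    use loom::sync::atomic::{AtomicU64, Ordering};
    use loom::sync::Arc;

    // Minimal compare-exchange pop: claim slot `len - 1` or return None.
    fn pop(state: &AtomicU64) -> Option<u64> {
        let mut current = state.load(Ordering::Acquire);
        loop {
            if current == 0 {
                return None;
            }
            match state.compare_exchange(
                current,
                current - 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(current - 1),
                Err(actual) => current = actual,
            }
        }
    }

    #[test]
    fn concurrent_pops_never_alias() {
        loom::model(|| {
            let state = Arc::new(AtomicU64::new(2)); // two free slots
            let state2 = state.clone();
            let a = loom::thread::spawn(move || pop(&state2));
            let b = pop(&state);
            let a = a.join().unwrap();
            // Two concurrent pops must never claim the same slot.
            assert!(a != b || (a.is_none() && b.is_none()));
        });
    }
}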

Future work

#18577 is still a good end game here IMO. Ultimately (just like @maniwani said would happen), I decided that doing this all at once would be too challenging and would add too much complexity. However, v9 makes "never flush" much, much more approachable for the future. The biggest issues I ran into were that lots of places hold a reference to an entity's Archetype (but the entity now might not have an archetype) and that checking archetypes everywhere might actually be less performant than flushing. Maybe.

We can also potentially speed up a lot of different processes now that alloc can be called without mutable access and free (etc.) can be called without needing to flush first.

Costs

Benchmarks
group                                           main_baseline                           remote_reservation_v9_baseline
-----                                           -------------                           ------------------------------
add_remove/sparse_set                           1.06   625.2±42.08µs        ? ?/sec     1.00   591.8±24.97µs        ? ?/sec
add_remove/table                                1.00   883.9±66.56µs        ? ?/sec     1.06   941.1±34.62µs        ? ?/sec
add_remove_very_big/table                       1.08     37.7±2.55ms        ? ?/sec     1.00     34.9±0.76ms        ? ?/sec
added_archetypes/archetype_count/1000           1.13  688.8±175.58µs        ? ?/sec     1.00  608.6±137.61µs        ? ?/sec
added_archetypes/archetype_count/200            1.13    72.7±18.97µs        ? ?/sec     1.00    64.2±20.78µs        ? ?/sec
added_archetypes/archetype_count/2000           1.11  1086.7±290.33µs        ? ?/sec    1.00  977.2±118.15µs        ? ?/sec
added_archetypes/archetype_count/5000           1.09      2.7±0.29ms        ? ?/sec     1.00      2.5±0.27ms        ? ?/sec
despawn_world/10_entities                       1.00   695.6±13.85ns        ? ?/sec     1.09   760.4±40.01ns        ? ?/sec
despawn_world/1_entities                        1.00   182.0±24.14ns        ? ?/sec     1.56   284.3±50.46ns        ? ?/sec
despawn_world_recursive/10000_entities          1.00  1668.8±95.30µs        ? ?/sec     1.13  1878.0±111.75µs        ? ?/sec
despawn_world_recursive/10_entities             1.00      2.3±0.04µs        ? ?/sec     1.05      2.4±0.09µs        ? ?/sec
despawn_world_recursive/1_entities              1.00   382.2±36.07ns        ? ?/sec     1.41   539.1±60.46ns        ? ?/sec
empty_archetypes/iter/10000                     1.07     12.8±1.61µs        ? ?/sec     1.00     12.0±0.49µs        ? ?/sec
empty_archetypes/par_for_each/100               1.06      9.2±1.04µs        ? ?/sec     1.00      8.7±0.35µs        ? ?/sec
empty_archetypes/par_for_each/1000              1.12     12.8±0.87µs        ? ?/sec     1.00     11.5±0.36µs        ? ?/sec
empty_archetypes/par_for_each/10000             1.19     25.4±0.98µs        ? ?/sec     1.00     21.3±0.41µs        ? ?/sec
empty_archetypes/par_for_each/2000              1.16     13.6±1.17µs        ? ?/sec     1.00     11.7±0.44µs        ? ?/sec
empty_archetypes/par_for_each/500               1.08     11.1±0.70µs        ? ?/sec     1.00     10.3±0.28µs        ? ?/sec
empty_commands/0_entities                       1.00      3.9±0.06ns        ? ?/sec     1.40      5.4±0.06ns        ? ?/sec
entity_hash/entity_set_lookup_miss_gen/10000    1.00     41.6±6.26µs 229.0 MElem/sec    1.05     43.9±5.74µs 217.1 MElem/sec
entity_hash/entity_set_lookup_miss_id/10000     1.25     44.5±5.30µs 214.2 MElem/sec    1.00     35.7±5.03µs 266.8 MElem/sec
event_propagation/four_event_types              1.13   606.5±27.06µs        ? ?/sec     1.00    535.9±5.38µs        ? ?/sec
event_propagation/single_event_type             1.12   870.3±27.20µs        ? ?/sec     1.00   776.4±18.86µs        ? ?/sec
fake_commands/2000_commands                     1.00     12.1±0.08µs        ? ?/sec     1.28     15.4±0.25µs        ? ?/sec
fake_commands/4000_commands                     1.00     24.2±0.26µs        ? ?/sec     1.28     30.9±0.38µs        ? ?/sec
fake_commands/6000_commands                     1.00     36.3±0.50µs        ? ?/sec     1.27     46.2±0.48µs        ? ?/sec
fake_commands/8000_commands                     1.00     48.3±0.15µs        ? ?/sec     1.28     61.6±0.87µs        ? ?/sec
insert_simple/base                              1.29   403.9±79.33µs        ? ?/sec     1.00   312.1±56.57µs        ? ?/sec
insert_simple/unbatched                         2.41  1021.5±234.06µs        ? ?/sec    1.00   423.2±17.00µs        ? ?/sec
iter_fragmented/base                            1.00    346.6±8.58ns        ? ?/sec     1.40    485.0±9.30ns        ? ?/sec
iter_fragmented/foreach                         1.07    141.4±6.76ns        ? ?/sec     1.00    132.4±3.85ns        ? ?/sec
iter_fragmented_sparse/base                     1.19      7.9±0.16ns        ? ?/sec     1.00      6.6±0.09ns        ? ?/sec
iter_simple/foreach_wide                        2.76     46.4±0.46µs        ? ?/sec     1.00     16.8±0.18µs        ? ?/sec
iter_simple/foreach_wide_sparse_set             1.00    80.9±13.81µs        ? ?/sec     1.12    90.3±29.10µs        ? ?/sec
observe/trigger_simple                          1.00    450.3±9.25µs        ? ?/sec     1.08    488.5±9.99µs        ? ?/sec
query_get/50000_entities_table                  1.00    138.8±0.67µs        ? ?/sec     1.05    145.9±1.23µs        ? ?/sec
query_get_many_5/50000_calls_sparse             1.06   607.3±14.47µs        ? ?/sec     1.00   570.5±42.57µs        ? ?/sec
sized_commands_0_bytes/2000_commands            1.00     10.6±1.20µs        ? ?/sec     1.26     13.3±0.18µs        ? ?/sec
sized_commands_0_bytes/4000_commands            1.00     20.6±0.33µs        ? ?/sec     1.29     26.5±0.42µs        ? ?/sec
sized_commands_0_bytes/6000_commands            1.00     30.8±0.26µs        ? ?/sec     1.29     39.9±0.61µs        ? ?/sec
sized_commands_0_bytes/8000_commands            1.00     41.4±0.77µs        ? ?/sec     1.28     53.1±0.65µs        ? ?/sec
sized_commands_12_bytes/2000_commands           1.00     11.6±0.29µs        ? ?/sec     1.23     14.2±0.32µs        ? ?/sec
sized_commands_12_bytes/4000_commands           1.00     22.8±3.30µs        ? ?/sec     1.24     28.3±0.30µs        ? ?/sec
sized_commands_12_bytes/6000_commands           1.00     33.6±0.20µs        ? ?/sec     1.27     42.8±0.63µs        ? ?/sec
sized_commands_12_bytes/8000_commands           1.00     48.8±6.10µs        ? ?/sec     1.22     59.7±0.76µs        ? ?/sec
sized_commands_512_bytes/2000_commands          1.00     46.3±1.28µs        ? ?/sec     1.06     48.9±1.73µs        ? ?/sec
sized_commands_512_bytes/4000_commands          1.00     90.5±2.06µs        ? ?/sec     1.08     97.4±3.33µs        ? ?/sec
spawn_commands/2000_entities                    1.00   155.4±11.61µs        ? ?/sec     1.22   189.5±15.70µs        ? ?/sec
spawn_commands/4000_entities                    1.00   303.7±15.69µs        ? ?/sec     1.22   371.1±18.58µs        ? ?/sec
spawn_commands/6000_entities                    1.00   463.3±31.95µs        ? ?/sec     1.20   554.8±12.55µs        ? ?/sec
spawn_commands/8000_entities                    1.00   619.7±44.57µs        ? ?/sec     1.19   734.9±16.71µs        ? ?/sec
spawn_world/1000_entities                       1.06     41.5±2.97µs        ? ?/sec     1.00     39.1±2.94µs        ? ?/sec
spawn_world/100_entities                        1.20      4.8±2.29µs        ? ?/sec     1.00      4.0±0.69µs        ? ?/sec
spawn_world/10_entities                         1.15   461.9±80.36ns        ? ?/sec     1.00   400.5±26.17ns        ? ?/sec
spawn_world/1_entities                          1.05     41.4±6.35ns        ? ?/sec     1.00     39.4±3.02ns        ? ?/sec
world_get/50000_entities_sparse                 1.00    167.1±4.23µs        ? ?/sec     1.07    178.8±2.44µs        ? ?/sec
world_query_get/50000_entities_sparse_wide      1.00    125.0±0.31µs        ? ?/sec     1.08   135.7±78.30µs        ? ?/sec
world_query_iter/50000_entities_sparse          1.00     38.7±0.08µs        ? ?/sec     1.18     45.6±0.60µs        ? ?/sec

Interpreting benches:

In most places, v9 is on par with or even faster than main. Some notable exceptions are the "sized_commands" and "fake_commands" sections; the regression there is purely due to Entities::flush being slower, and we make up for that elsewhere. These commands don't actually do anything, though, so this is not relevant to real use cases; those benchmarks exist only to stress-test CommandQueue.

The only place where v9 causes a significant, real-world-applicable regression is "spawn_commands", where v9 is roughly 15% slower than main. This is something that can be changed later now that alloc doesn't need mutable access. I expect we can turn this 15% regression into a 15% improvement, given that "spawn_world" is roughly 20% faster on v9 than on main. Users that need really fast spawn commands are already using some form of batch spawning or direct world access anyway.

Other regressions seem to be either minimal, unrealistic, easily corrected in the future, or wrong. I feel confident saying "wrong" since running them back to back can sometimes yield different results. I'm on an M2 Max, so work might be jumping between performance and efficiency cores or something. (I look forward to standardized benchmarking hardware.)

Wins: I was worried that, without "never flush", this would be an overall regression, but I am relieved that that is not the case. Some very common operations, "insert_simple/unbatched" for example, are way faster on this branch than on main. Basically, on main, alloc also adds EntityMeta for the entity immediately, but on this branch, we only do so in set. That seems to improve temporal cache locality and makes this roughly 2.4× faster. "added_archetypes" sees roughly 10% improvements too, and "iter_simple/foreach_wide" is roughly 2.8× faster.

I think in practice, v9 will outperform main for real-world schedules. And I think moving towards "never flush" (even for only a few operations, like Commands::spawn) will improve performance even more.

Co-authored-by: atlv <email@atlasdostal.com>
let next_index = self.indices.next()?;
let (chunk, index, chunk_capacity) = self.buffer.index_in_chunk(next_index);

// SAFETY: Assured by constructor
Contributor

FreeBufferIterator has no constructor/new function, so it might be helpful to add one or change this to reference FreeBuffer::iter. Same for the reference to constructor below.

Contributor Author

I didn't mean a new function; I just meant that whoever constructs it follows the safety comment on the type docs. I think that's a common pattern in the ECS. I could move it to a new function, but then someone could miss the safety comment by constructing it manually. But if you have a better idea here, I'm all ears! I don't like type-level safety comments either.

ElliottjPierce and others added 2 commits May 10, 2025 22:21
Co-authored-by: Christian Hughes <9044780+ItsDoot@users.noreply.github.com>

@Atlas16A Atlas16A left a comment


Looks good to me; it's been reviewed enough that I'm confident it's more or less ready.

@alice-i-cecile alice-i-cecile added the S-Needs-SME label and removed the S-Needs-Review label on May 14, 2025
@alice-i-cecile alice-i-cecile added this to the 0.17 milestone May 14, 2025
Contributor

@maniwani maniwani left a comment


I didn't spot any glaring issues, just some minor nits, but I'm not yet confident I grok the exact mechanisms that enforce the safety / transactional guarantees between free and remote_alloc.

That seems like a crucial part of this PR, so some of my comments are just questions about it.

Comment on lines 314 to 332
fn next(&mut self) -> Option<Self::Item> {
if let Some(found) = self.current.next() {
return Some(found.get_entity());
}

let still_need = self.indices.len() as u32;
let next_index = self.indices.next()?;
let (chunk, index, chunk_capacity) = self.buffer.index_in_chunk(next_index);

// SAFETY: Assured by constructor
let slice = unsafe { chunk.get_slice(index, still_need, chunk_capacity) };
self.indices.start += slice.len() as u32;
self.current = slice.iter();

// SAFETY: Constructor ensures these indices are valid in the buffer; the buffer is not sparse, and we just got the next slice.
// So the only way for the slice to be empty is if the constructor did not uphold safety.
let next = unsafe { self.current.next().debug_checked_unwrap() };
Some(next.get_entity())
}
Contributor

style nit

Would you be willing to add some more comments and/or use more descriptive field and variable names here, so it's a bit more obvious what's going on? It wasn't immediately clear to me that the range tracks the span of slots that remain to be iterated beyond the current chunk.

Contributor Author

Good thought. I renamed the struct fields and gave them doc comments. If you see something else though, I'm open to other renames too.

Contributor

Looks good to me.

Comment on lines +548 to +555
// The goal is the same as `alloc`, so what's the difference?
// `alloc` knows `free` is not being called, but this does not.
// What if we `len.fetch_sub(1)` but then `free` overwrites the entity before we could read it?
// That would mean we would leak an entity and give another entity out twice.
// We get around this by only updating `len` after the read is complete.
// But that means something else could be trying to allocate the same index!
// So we need a `len.compare_exchange` loop to ensure the index is unique.
// Because we keep a generation value in the `FreeCount`, if any of these things happen, we simply try again.
Contributor

I just want to check my understanding.

  • The free buffer's chunks cannot move once allocated.
  • free has a critical section guarded by acquire and release semantics.

So is the idea that remote_alloc is safe because remote_alloc will either see the disabled flag immediately or its compare-exchange will fail when it sees the change (e.g. change in disabled flag or generation value)?

Contributor Author

Exactly. If it fails, spin and try again. Never interrupt another action from remote_alloc; just wait for it to finish.
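
To make the retry loop concrete, here is a hedged sketch, assuming a packed atomic word with the length in the low bits, a disabled flag, and a generation in the high bits. Field widths and the name remote_pop are illustrative, not the PR's exact layout.

use std::sync::atomic::{AtomicU64, Ordering};

const LEN_BITS: u32 = 32;
const LEN_MASK: u64 = (1 << LEN_BITS) - 1;
const DISABLED_BIT: u64 = 1 << LEN_BITS;
const GENERATION_LEAST_BIT: u64 = 1 << (LEN_BITS + 1);

fn remote_pop(state: &AtomicU64) -> Option<u32> {
    loop {
        let current = state.load(Ordering::Acquire);
        if current & DISABLED_BIT != 0 {
            // A `free` is mid-update; wait for it to finish rather than interrupt it.
            std::hint::spin_loop();
            continue;
        }
        let len = (current & LEN_MASK) as u32;
        if len == 0 {
            return None; // free list is empty
        }
        // In the real allocator, the entity at slot `len - 1` is read here,
        // *before* the CAS commits the pop, so a racing `free` can't overwrite
        // it unseen. Decrement both length and generation in one subtraction:
        // every operation then produces a unique state, which rules out ABA.
        let next = current.wrapping_sub(1).wrapping_sub(GENERATION_LEAST_BIT);
        if state
            .compare_exchange(current, next, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            return Some(len - 1); // the slot index we uniquely claimed
        }
        // State changed under us (a free or another pop); retry from the top.
    }
}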

Comment on lines +393 to +394
// Also subtract one from the generation bit.
subtract_length | Self::GENERATION_LEAST_BIT
Contributor

I'm trying to picture how this counter state evolves over time. Is the generation counting down or counting up?

Contributor Author

Good question.

It doesn't matter if it goes up or down as long as it's consistent. Here's an example:

  • Fresh allocator: generation=0, enabled, len=0.
  • Try to pop/alloc: we subtract 1 from the length and (to keep it one op) 1 from the generation. The generation wraps to some big number N: generation=N, enabled, len=-1, which still presents as 0 since the length is never really negative.
  • Push/free: we load the state, add 1 to the presented length (0, even though -1 is stored; the length can't be negative), don't touch the generation, and then store the state. The state is changing anyway, and the only way to change back is to pop, which changes the generation. So we only need to change the len: generation=N, enabled, len=1.
  • Pop/alloc: we subtract 1 from the generation and the length: generation=N-1, enabled, len=0.
  • Remote pop/remote reserve: we acquire-load the state, but the len is 0, so we return None.
  • Push: same as before: generation=N-1, enabled, len=1.
  • Start remote pop A: acquire-load the state, grab the right entity, and prepare an ideal state of generation=N-2, enabled, len=0, but don't store it yet.
  • Now a free starts: load the state, disabling it: generation=N-1, disabled, len=1.
  • Start remote pop B: acquire-load the state, see that it's disabled, and spin.
  • Remote pop A fails its compare-exchange (because the disabled bit is on) and spins.
  • The free finishes as before: generation=N-1, enabled, len=2.
  • Remote pop A tries again, making a new ideal state, and its compare-exchange succeeds: generation=N-2, enabled, len=1.
  • Remote pop B tries again, making an ideal state of generation=N-3, enabled, len=0, but doesn't store it yet.
  • A local pop happens first: generation=N-3, enabled, len=0.
  • Remote pop B's compare-exchange fails; it tries again, sees a 0 len, and returns None.

I know that's a complex example, but hopefully that demonstrates how the state might change over time in some of the edge cases.
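
And here is a matching hedged sketch of the push/free side under the same assumed layout as the remote_pop sketch above (it ignores the negative-length presentation trick from the example, and push_free/write_slot are illustrative names):

use std::sync::atomic::{AtomicU64, Ordering};

const LEN_BITS: u32 = 32;
const LEN_MASK: u64 = (1 << LEN_BITS) - 1;
const DISABLED_BIT: u64 = 1 << LEN_BITS;

fn push_free(state: &AtomicU64, write_slot: impl FnOnce(u32)) {
    // Set the disabled bit so any in-flight remote pop either spins or fails
    // its compare-exchange. Callers guarantee `free` never races with `alloc`
    // or another `free`, so a plain read-modify-write is enough here.
    let before = state.fetch_or(DISABLED_BIT, Ordering::Acquire);
    let len = (before & LEN_MASK) as u32;
    // Write the freed entity into slot `len` while remote pops are locked out.
    write_slot(len);
    // Publish: clear the disabled bit and bump the length, leaving the
    // generation untouched; the only way back to this length is a pop, which
    // changes the generation, so every state stays unique.
    let next = (before & !LEN_MASK) | u64::from(len + 1);
    state.store(next, Ordering::Release);
}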

Contributor

Thanks, I understand now. Pushing increments the length and popping decrements both the length and generation, so every operation results in a unique state.

There's no ABA problem because once we push to the free list, we can't return to the same length without advancing the generation.

Relaxed actually might not be fine here in some situations. Acquire and Release are used everywhere for the state anyway.
andrewzhurov pushed a commit to andrewzhurov/bevy that referenced this pull request May 17, 2025
…9121)

# Objective

This is a followup to bevyengine#18704. There's lots more followup work, but this is the minimum to unblock bevyengine#18670, etc.

This direction has been given the green light by Alice
[here](bevyengine#18704 (comment)).

## Solution

I could have split this over multiple PRs, but I figured skipping
straight here would be easiest for everyone and would unblock things the
quickest.

This removes the now no longer needed `identifier` module and makes
`Entity::generation` go from `NonZeroU32` to `struct
EntityGeneration(u32)`.

## Testing

CI

---------

Co-authored-by: Mark Nokalt <marknokalt@live.com>
@maniwani
Contributor

OK, I understand the high-level design now.

The safety invariants seem fine.

  • The allocator/freelist is chunked so that the chunks have stable memory locations and an Arc<SharedAllocator> is used so that those locations remain valid until all Arc clones are dropped.
  • Callsites uphold the invariant that there are no races between alloc and free, and remote_alloc can't race because it will only commit a change to the freelist if uninterrupted (and not currently disabled by a free).

This looks good to me, but I just wanted to ask one more question for clarification—

An alternative to this was having the world poll a queue for remote allocation requests during each flush (v6). Are the reasons to prefer this PR that it performs better than v6 and that it, unlike v6, never forces remote threads to block?

@ElliottjPierce
Contributor Author

An alternative to this was having the world poll a queue for remote allocation requests during each flush (v6). Are the reasons to prefer this PR that it performs better than v6 and that it, unlike v6, never forces remote threads to block?

Yeah, pretty much. In my experimentation, I found pretty much five ways to do remote reservation:

  1. Keep a concurrent remote "ready" list that remote reservations can pull from, and top it off on each flush. But that means we have a longer flush time, sacrifice N entities to always sit in the queue, and have the potential to either block and await when the ready list runs dry or allocate a brand new index, which is effectively a memory leak. This is v6, and we block.
  2. Keep a 2-way channel where other threads request an entity and those requests are fulfilled. We wouldn't need to fulfill them in flush; it could be another SubApp::update item. Trouble is, remote reservation is now very block/await-heavy. It's not ideal that an asset loader might have to stop or delay reading from disk to await a new frame so it has an entity id. And if we fulfill requests more than once per frame, performance gets worse. This was v5.
  3. Keep a Vec and an atomic len in an Arc. When reallocating, make a new Arc. Hold a RwLock of the most recent Vec's Arc. When the list runs dry, or every once in a while, try to upgrade to a new Vec Arc. We can cache whether there's a new one via an atomic flag. This is effectively a pinned Vec, but harder. It does work though; this was v4.
  4. Fundamentally split the storage of remote and local entities to have different allocators for them, one counting from 0 up, the other from u32::MAX down. This was v1, and ultimately it's a no-go since splitting that storage could have a negative perf impact across the board.
  5. Keep a pinned vec in memory to hold a stack of free ids, and decouple alloc from spawning and reserve from flush. (This PR, v9.)

So this PR is the best of the options I tried IMO. It has the best performance and doesn't block, lock, or await. (It may wait for a free to finish, but that's no more blocking than being in a CPU scheduler.) It improves some performance over main. It opens up a lot of room for future improvements, like not flushing for command spawning, read-only alloc calls instead of the slightly more expensive reserve, and maybe, in the future, never flushing! Also, because it decouples allocation from spawning, we can use this to unblock components as entities. (We don't need to figure out what happens when a component entity is despawned; we just never spawn it, even though we do allocate its Entity id.) This decoupling could be done for any remote reservation scheme, but it was only necessary for v9, and at the time, I didn't realize how useful it might be.

As a history of what I've explored:

  • v1: split meta storage; too slow.
  • v2: separate pending lists for remote vs local; unsound.
  • v3: a failed attempt to fix v2; not practical, and no PR opened.
  • v4: lazily replaced Vec (instead of reallocating); v9 is just better.
  • v5: two-way async remote requests; too slow and async-heavy, and v6 is just better.
  • v6: one-way remote requests (technically two-way, but only to inform how much to refill the buffer by); still a viable alternative, but not as fast, and it means a blocking asset system (kinda disappointing to lose one of the marketing points of our asset system...).
  • v7: remove the whole reserve-and-flush scheme and ditch the empty archetype; very ugly.
  • v8: remove the whole reserve-and-flush scheme but keep the empty archetype; very promising, but too much for one PR (no PR opened).
  • v9: an evolution of v8, leaving the rest of "never flush" as future work to evaluate on a case-by-case basis.

I'm not saying I've explored everything, but v9 is the best so far IMO. That said, if anyone has an idea I haven't tried, I'm all ears.

For reference, flecs seems to be kinda similar to v9: it pages entities, where v9 calls them chunks. I haven't looked too closely, and I'm sure there are differences, but I think this is the right track.

@cart cart moved this to Respond (With Priority) in @cart's attention May 20, 2025
@maniwani
Contributor

I'm not saying I've explored everything, but v9 is the best so far IMO. That said, if anyone has an idea I haven't tried, I'm all ears.

That's alright. I was asking because I didn't follow all of your discussions and design work leading up to this PR, and I wanted to be sure I had the justification right before approving.

@maniwani
Contributor

(My approval doesn't hinge on any of this stuff. I just wanted to address the other parts of your comment separately.)

Also, because it decouples allocation from spawning, we can use this to unblock components as entities. (We don't need to figure out what happens when a component entity is despawned; we just never spawn it, even though we do allocate its Entity id.)

Good to know, but IIRC components-as-entities had only been blocked on deciding whether or not to let !Send types be components (because we also want to store resources as components).

Correct me if I'm wrong but this seems only loosely related. Asset loaders don't have access to the world, so they can't initialize queries or systems (and that shouldn't change in the future because queries and systems will be entities that need to be instantiated).

In any case, handling component entities being despawned shouldn't be a tough problem. Bevy's ECS has immutable components and reactivity, so it'd be easy enough to create some IsComponent component that panics if removed the normal way. That's basically how flecs handles it.

Reflection can be significantly empowered (and simplified) if component registrations are normal entities and you can attach your own components to them, so I don't see it as something we want to avoid.

For reference flecs seems to be kinda similar to v9. It pages entities; v9 calls them chunks. I haven't looked too closely, and I'm sure there are differences, but I think this is the right track.

flecs' entity metadata is closer to how bevy currently is (last I checked anyway). Its equivalent of the meta vector is chunked to save memory and the stability also allows other internal structures to cache pointers to the metadata of specific entities.

(flecs has a make_alive API you can use to spawn a specific entity ID. If you spawn 100 entities and then make_alive index 10000, it'd be a waste to allocate memory for the whole index range in between.)

I don't know if flecs directly supports "remote" reservation of entities or if Sander even considers it necessary. (I don't recall it being a feature but if he's said otherwise then ignore me lol.)

@NthTensor
Contributor

I found pretty much five ways to do remote reservation:

Thanks, this is an amazing writeup!

@ElliottjPierce
Contributor Author

Good to know, but IIRC components-as-entities had only been blocked on deciding whether or not to let !Send types be components (because we also want to store resources as components).

Yeah; I don't mean that remote reservation is a prerequisite for components as entities, just that it lets us start reworking the ComponentId representation before we need to put any data on the spawned entity itself. That lets us plow forward with unblocking fragmenting components etc. while we resolve other questions.

A few of those questions (that I still have anyway) are:

  1. Is panicking the best way to handle a removed/despawned component entity? Why not yank the component from all archetypes and continue?
  2. Can we make the resource singleton entity different from the component info singleton entity? I don't see why not, and that would let us punt !Send for later.
  3. What kind of data do we want to expose via components on the component entity?

(These are all questions to discuss elsewhere, but my point is that this PR will let us start the id/fragmenting part of components as entities while we work out these details.)

Correct me if I'm wrong but this seems only loosely related. Asset loaders don't have access to the world, so they can't initialize queries or systems (and that shouldn't change in the future because queries and systems will be entities that need to be instantiated).

Yeah, this is 100% a separate issue. I just want to do the id/fragmenting portion of components as entities soon. There are already PRs for query-by-value, fragmenting components, etc. that would really benefit from this.

In any case, handling component entities being despawned shouldn't be a tough problem. Bevy's ECS has immutable components and reactivity, so it'd be easy enough to create some IsComponent component that panics if removed the normal way. That's basically how flecs handles it.

That's a solid option. I don't mean to dig up old questions if we've already reached a consensus here. I just don't know what it is. Maybe worth laying out a plan explicitly somewhere. If one exists, I haven't been able to find it.

Reflection can be significantly empowered (and simplified) if component registrations are normal entities and you can attach your own components to them, so I don't see it as something we want to avoid.

Very much looking forward to this!

flecs' entity metadata is closer to how bevy currently is (last I checked anyway). Its equivalent of the meta vector is chunked to save memory and the stability also allows other internal structures to cache pointers to the metadata of specific entities.

Yeah, that makes sense. I only glanced at it, and yeah, it doesn't seem to support remote reservation.

@maniwani
Contributor

maniwani commented May 20, 2025

1) Is panicking the best way to handle a removed/despawned component entity? Why not yank the component from all archetypes and continue?

Eh, I mean, it's not like today's Bevy lets you unregister a component. Configurable cleanup is a feature itself, so I wouldn't tie them together.

2) Can we make the resource singleton entity different from the component info singleton entity? I don't see why not and that would let us punt !Send for later.

I think the matter of !Send data was settled by #18386. Bevy's own first-party plugins are no longer using !Send resources, so I think all that's left is to formally deprecate them (which I'm guessing would happen with or following #17485). We had been punting it for too long already lol.

3) What kind of data do we want to expose via components on the component entity?

I don't have a list, but low-hanging fruit would be components that represent "implements Clone" and such. Then you could write a query that finds all clonable components, all networked components, etc.

If we spawn entities to represent plugins and link them to their components with ChildOf, then you could find out which components come from which plugin.

Things like that.

@ElliottjPierce
Contributor Author

For y'all who haven't been following this, I want to draw attention to this comment from a while ago. It raises a potential performance concern about a benchmarking blind spot. My assessment is that it's fine, but I want it out in the open.

There's also a doc out here that gives some more background if you're interested.

Labels
A-ECS: Entities, components, systems, and events
C-Feature: A new feature, making something new possible
M-Needs-Migration-Guide: A breaking change to Bevy's public API that needs to be noted in a migration guide
S-Needs-SME: Decision or review from an SME is required
X-Controversial: There is active debate or serious implications around merging this PR
Projects
Status: Respond (With Priority)
Development

Successfully merging this pull request may close these issues.

Reserve entities from async