
Remote entity reservation v7 #18611


Conversation

ElliottjPierce (Contributor)

Objective

This is v7, so you get the idea. See also v6 (#18525). I'll fill this in more if we proceed here.

Solution

The full design plan is in #18577. A few things changed (actually for the better) as I wrote this.
Instead of accommodating remote reservation from the free list at all times, it is disabled very briefly during some operations.
Namely, while an entity is being freed or moved out of the empty archetype, remote reservation has to allocate a brand-new entity index instead. This is faster than the alternative. The only downside is that remote reservation may (very rarely) increase the total number of entities when one could have been reused instead. But that would be rare, and it's better than blocking/waiting IMO.

Costs:

  • spawning an empty entity, reusing the freed list = 1 atomic op.
  • spawning an empty entity, not reusing the freed list = 2 atomic ops.
  • remotely spawning an empty entity, reusing the freed list = 2 atomic ops.
  • remotely spawning an empty entity, not reusing the freed list = 3 atomic ops.
  • allocating a non-empty entity, reusing the freed list = 2 atomic ops.
  • allocating a non-empty entity, not reusing the freed list = 3 atomic ops.
  • moving a reused entity out of the empty archetype = 1 atomic op if there's no pending entity, else 3 atomic ops.
  • moving an entity into the empty archetype = 2 atomic ops if there's nothing pending, else 4.
  • freeing a non-empty entity = 3 atomic ops.

Future:

  • freeing an empty entity currently moves its archetype and then frees it (6 ops on average), but we can make a faster impl later.
  • Currently freeing, spawning, allocating, etc. can only handle one entity at a time. On paper we can batch this. Ex: freeing 100 entities should, in the long run, cost no more atomic ops than freeing 1.

Draft

This is very much a draft to facilitate discussion. It does not interact with EntityMeta or the rest of the ECS yet.
Only the atomic aspects of the design have been built.

ElliottjPierce marked this pull request as draft on March 29, 2025.
```rust
let target = head.add(index as usize);

// SAFETY: Ensured by caller.
unsafe { core::ptr::replace(target, Slot::new(entity)) }
```
Contributor

For this to be sound in the Rust memory model for atomic accesses, you need to prove that any other access to this slot either happens-before this one, or this one happens-before the other.

You need to update the SAFETY comments to show that's the case, and I believe that will be impossible if all of the operations are Ordering::Relaxed.

Contributor Author

I just updated safety comments because I do think this is safe. set is never called concurrently and is never called with an index that could be considered valid by a remote reservation. In practice, this could theoretically just be an UnsafeCell.

Given that there's no concurrency or conflicts I don't think this violates the memory access model. I can't prove which sets will be in what order, but I can prove that they will be in a definite order. The set calls will never conflict with other set or get ops. And get is ok to conflict with other get calls because they don't do any modification. So I think safety is upheld and relaxed ordering is fine.

Is that right? You know a lot more about this than me, so I trust your judgement.

The only place I'm iffy on Relaxed is the storing and loading of OwnedBuffer::len, but that is never written concurrently, and we guard against early reads by temporarily disabling the free cursor when we decrease len. A premature read has no effect if we are increasing it (RemoteOwned might miss the chance to reuse a freshly flushed entity, but who cares?).

Contributor

Given that there's no concurrency or conflicts I don't think this violates the memory access model.

The concurrency comes from calling RemoteOwned::try_spawn_empty_from_free on one thread while calling methods like Owned::free_non_empty_buffer on another. (I thought that was the goal here; if I misunderstood then I apologize!)

You aren't doing any synchronization, so the order of operations on one thread doesn't affect the other. The set call in free_non_empty_buffer could be reordered after the final update to free_cursor, such that try_spawn_empty_from_free reads the new value of free_cursor but get returns an old value at that index. And it's worse than that, because get and set do non-atomic memory accesses, so the lack of synchronization is UB.

If you aren't familiar with memory orderings, it might help to read the Memory Ordering chapter of Rust Atomics and Locks. The links from Memory model for atomic accesses to C++20 atomic orderings and the nomicon may also be helpful.

Contributor Author

Hmm. Ok, the problem is making more sense now. I was assuming that cache coherency would keep the reads up to date, but I guess on platforms that don't do write-through for atomics or something, that's not a guarantee. IIUC, that's why you were saying non-atomic, Relaxed, and Acquire/Release are all implemented the same on x86.

(Thanks for your patience with me on this. These details are new to me, and it's a lot to get my head around.)

But I think slapping Acquire/Release on everything would fix it, and that would be free for platforms like x86. Am I getting that right?

Contributor

But I think slapping Acquire/Release on everything would fix it, and that would be free for platforms like x86. Am I getting that right?

Right, that will get you most of the way there! I think you'll find that you also need a store-release after the read in try_spawn_empty_from_free, since you'll need a way to make sure that read happens before the slot is re-used by a free. That's why the other concurrent queue implementations have a "stamp".

It's really easy to miss cases, though! You'll want to be very precise with your safety comments. Ideally you'd use something like loom, which can help find bugs by trying lots of different interleavings of operations. I'm not sure how hard it would be to get that working with Bevy CI.

Contributor Author

Ok that makes sense. I'll do that if we proceed here.

I have good news and bad news. The bad news is I'm working on v8. The good news is this should be much simpler to review and reason about.

I'm going to work on v8 for now, but if we end up going with this, I will polish this up. And thanks for showing me loom. That's really cool!

@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events X-Controversial There is active debate or serious implications around merging this PR M-Needs-Release-Note Work that should be called out in the blog due to impact S-Needs-SME Decision or review from an SME is required labels Mar 30, 2025
Contributor

It looks like your PR has been selected for a highlight in the next release blog post, but you didn't provide a release note.

Please review the instructions for writing release notes, then expand or revise the content in the release notes directory to showcase your changes.

@ElliottjPierce (Contributor Author)

Closing in favor of v9: #18670. Thanks for the reviews here.
