
Remote entity reservation v7 #18611


Conversation

ElliottjPierce (Contributor)

Objective

This is v7, so you get the idea. See also v6 (#18525). I'll fill this in more if we proceed here.

Solution

The full design plan is in #18577. A few things changed (actually for the better) as I wrote this.
Instead of accommodating remote reservation from the free list at all times, it is disabled very briefly during some operations.
Namely, while an entity is being freed or moved out of the empty archetype, remote reservation has to allocate a brand-new entity index instead. This is faster than the alternative. The only downside is that remote reservation may (very rarely) increase the total number of entities when one could have been reused instead. But that would be rare, and it's better than blocking/waiting IMO.

Costs:

  • spawning an empty entity, reusing the freed list = 1 atomic op.
  • spawning an empty entity, not reusing the freed list = 2 atomic ops.
  • remotely spawning an empty entity, reusing the freed list = 2 atomic ops.
  • remotely spawning an empty entity, not reusing the freed list = 3 atomic ops.
  • allocating a non-empty entity, reusing the freed list = 2 atomic ops.
  • allocating a non-empty entity, not reusing the freed list = 3 atomic ops.
  • moving a reused entity out of the empty archetype = 1 atomic op if there's no pending entity, else 3 atomic ops.
  • moving an entity into the empty archetype = 2 atomic ops if there's nothing pending, else 4.
  • freeing a non-empty entity = 3 atomic ops.

Future:

  • freeing an empty entity currently moves its archetype and then frees it (6 ops on average), but we can make a faster impl later.
  • Currently freeing, spawning, allocating, etc. can only handle one entity at a time. On paper we can batch this. Ex: freeing 100 entities should, in the long run, cost no more atomic ops than freeing 1.

Draft

This is very much a draft to facilitate discussion. It does not interact with EntityMeta or the rest of the ECS yet.
Only the atomic aspects of the design have been built.

ElliottjPierce marked this pull request as draft on March 29, 2025.
```rust
let target = head.add(index as usize);

// SAFETY: Ensured by caller.
unsafe { core::ptr::replace(target, Slot::new(entity)) }
```
Contributor

For this to be sound in the Rust memory model for atomic accesses, you need to prove that any other access to this slot either happens-before this one, or this one happens-before the other.

You need to update the SAFETY comments to show that's the case, and I believe that will be impossible if all of the operations are Ordering::Relaxed.

Contributor Author

I just updated safety comments because I do think this is safe. set is never called concurrently and is never called with an index that could be considered valid by a remote reservation. In practice, this could theoretically just be an UnsafeCell.

Given that there's no concurrency or conflicts I don't think this violates the memory access model. I can't prove which sets will be in what order, but I can prove that they will be in a definite order. The set calls will never conflict with other set or get ops. And get is ok to conflict with other get calls because they don't do any modification. So I think safety is upheld and relaxed ordering is fine.

Is that right? You know a lot more about this than me, so I trust your judgement.

The only place I'm iffy on Relaxed is the storing and loading of OwnedBuffer::len, but that is never written concurrently, and we guard against early reads by temporarily disabling the free cursor when we decrease len. A premature read has no effect if we are increasing it (RemoteOwned might miss the chance to reuse a freshly flushed entity, but who cares?).

Contributor

Given that there's no concurrency or conflicts I don't think this violates the memory access model.

The concurrency comes from calling RemoteOwned::try_spawn_empty_from_free on one thread while calling methods like Owned::free_non_empty_buffer on another. (I thought that was the goal here; if I misunderstood then I apologize!)

You aren't doing any synchronization, so the order of operations on one thread doesn't affect the other. The set call in free_non_empty_buffer could be reordered after the final update to free_cursor, such that try_spawn_empty_from_free reads the new value of free_cursor but get returns an old value at that index. And it's worse than that, because get and set do non-atomic memory accesses, so the lack of synchronization is UB.

If you aren't familiar with memory orderings, it might help to read the Memory Ordering chapter of Rust Atomics and Locks. The links from Memory model for atomic accesses to C++20 atomic orderings and the nomicon may also be helpful.

Contributor Author

Hmm. Ok, the problem is making more sense now. I was assuming that cache coherency would keep the reads up to date, but I guess on platforms that don't do write-through for atomics or something, that's not a guarantee. IIUC, that's why you were saying non-atomic, Relaxed, and Acquire/Release are all implemented the same on x86.

(Thanks for your patience with me on this. These details are new to me, and it's a lot to get my head around.)

But I think slapping Acquire/Release on everything would fix it, and that would be free for platforms like x86. Am I getting that right?

Contributor

But I think slapping Acquire/Release on everything would fix it, and that would be free for platforms like x86. Am I getting that right?

Right, that will get you most of the way there! I think you'll find that you also need a store-release after the read in try_spawn_empty_from_free, since you'll need a way to make sure that read happens before the slot is re-used by a free. That's why the other concurrent queue implementations have a "stamp".

It's really easy to miss cases, though! You'll want to be very precise with your safety comments. Ideally you'd use something like loom, which can help find bugs by trying lots of different interleavings of operations. I'm not sure how hard it would be to get that working with Bevy CI.

Contributor Author

Ok that makes sense. I'll do that if we proceed here.

I have good news and bad news. The bad news is I'm working on v8. The good news is this should be much simpler to review and reason about.

I'm going to work on v8 for now, but if we end up going with this, I will polish this up. And thanks for showing me loom. That's really cool!

@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events X-Controversial There is active debate or serious implications around merging this PR M-Needs-Release-Note Work that should be called out in the blog due to impact S-Needs-SME Decision or review from an SME is required labels Mar 30, 2025
Contributor

It looks like your PR has been selected for a highlight in the next release blog post, but you didn't provide a release note.

Please review the instructions for writing release notes, then expand or revise the content in the release notes directory to showcase your changes.

@ElliottjPierce (Contributor Author)

Closing in favor of v9: #18670. Thanks for the reviews here.
