-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Improved Entity Lifecycle: remove flushing, support manual spawning and despawning #19451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved Entity Lifecycle: remove flushing, support manual spawning and despawning #19451
Conversation
it's not exact, but it should be good enough.
crates/bevy_ecs/src/bundle/insert.rs
Outdated
| swapped_entity.index(), | ||
| unsafe { entities.get_spawned(swapped_entity).debug_checked_unwrap() }; | ||
| entities.update_existing_location( | ||
| swapped_entity.row(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do indeed think this should be index and not row. Entities is not a Table functionally. It is an array. Yes every array in a certain light could be considered to be a single column table. But that is not how we approach that in the context of Bevy or Rust. Even in the context of Bevy ECS specifically I find it mismatched, as Table is a specific thing, and Entities is not that thing. I'd prefer to fix this everywhere in this PR, rather than spread the row terminology further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have lots to say here. If you want to unilaterally declare "no rows", I understand and will do that, but before I do that, I just want to explain myself a little bit here, because I much prefer row to index.
You mentioned that using row names makes it hard to untie it from Table storage. That's fair. I don't have a good answer for that.
But, in the very common "ecs as a spreadsheet" example, this makes a lot of sense. For people learning the ecs way, I think this name is much more approachable than index. Index naming ties the high level concept to the implementation of the Entities collection.
And that implementation is not unlikely to change. For example, with entity paging, this "index" becomes a key into a 2-layer array map. You mentioned elsewhere, I assume for the sake of argument, that Entity might become a UUID, and that should be possible without needing to change names. In that case, the EntityRow would be the key in a hashmap, not an index in an array.
I guess what I'm saying is that using the index name ties the name to its implementation details, an index in an array, rather than the high-level concept, a row in a spreadsheet. And that seems contrary to your motivations behind other naming suggestions, most of which I agree with.
Again, I'll happily rename everything to index. I just don't want that to come back to bite us latter if we do entity paging, for example, and people start wondering "Why is this called index? It's not an index into anything.".
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remain unconvinced. The "ECS is just tables" statement is perhaps a helpful lie, but it is not the truth of the matter. Bevy ECS is partially tables and partially other things, and that will continue to be true. Given that it is partially real tables, pretending everything is tables makes the internals unnecessarily confusing, and it makes communicating how the system actually works to developers more challenging. That being said, even if we were to choose to tell that lie, I'll assert that this naming scheme is doing it wrong.
First: what is a "row" in a table? It is a specific entry in the table. A "row identifier" is a unique name for the row that can be stored (ex: the "first" row might be row A, row 0, or row 1, depending on the naming scheme). In our case, our naming scheme is "array indices". If we are storing the "row identifier", we are storing the array index of the row. If I have a field or type called row, I expect it to be the "row identifier".
Calling it Entity::row, makes no sense because it is not the row identifier. It is a unique index that identifies the entity, not the row it is stored in. The row an entity is stored in can be constantly changing, while the entity's index remains unchanged. That index points to a location in an array that stores arbitrary information about that entity, which includes the archetype it is stored in, the table, etc. There is a world where that entity metadata stores multiple table rows, if we decide to support grouping some sets of components into separate tables (for performance reasons ... Sander and I have discussed this at various points and Flecs might actually already support it).
Even in the "ECS is just tables" / "ECS is just a database" world, Entity::index would be the "primary key" stored as a column in a row, not the "row":
Table
| Entity | ComponentA | ComponentB |
|---|---|---|
| 4 | A("hello") | B("world") |
| 9 | A("foo") | B("bar") |
Sparse Set
| Entity | ComponentC |
|---|---|
| 9 | C("I'm sparsely stored!") |
Where that primary key then has an acceleration structure to find it's locations (filling exactly the same role that Entities fills)
The index name ties the name to its implementation details
Again, I'll happily rename everything to index. I just don't want that to come back to bite us latter if we do entity paging, for example, and people start wondering "Why is this called index? It's not an index into anything.".
I agree that index is an implementation detail. If someone is relying on Entity::index, they are relying on a Bevy ECS implementation detail. If we change that to include paging, or to use a UUID, the consumer of index in many cases will need to contend with that. A contract (and perhaps naming) change is warranted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to construct vs spawn_at, I still very much disagree here but am also ok with you're decision.
For one, I think you're conflating tables + sparse sets as implementation and ecs as a database with "think of an ecs as a spreadsheet." The row in the spreadsheet has nothing to do with how the spreadsheet is represented in memory or how it is implemented. The "row in a spreadsheet" is purely allegorical to help users. I think it is the most common mental framework for thinking of an ecs (again, not in implementation but in concept). At least that's how I was introduced to it and still think about ecs philosophy. I think that's very common, and is the typical mental framework that users are coming from. The rest of the ecs is just figuring out how to make reading the spreadsheet and finding the components of a row in the spreadsheet faster than it has any right to be.
I'm also still concerned about tying the names to implementation details instead of higher level concepts, especially when you've argued for the exact opposite for other names. (I don't mean to attack you here. But I am confused.) Those names were more user-focused. Maybe that's the difference? IDK. Maybe I'm not understanding, but what's confusing me a little is that I'm not sure what criteria you're using to determine implementation-centered names vs concept-centered names. And that makes it much harder (for me) to follow your logic.
Anyway, I don't think I'm going to convince you, and that's ok.
If I understand correctly, you want this name in particular to reflect how Entities is implemented. And that means renaming to index for now and then later, with entity paging, maybe renaming again to "key" or "entry" or something.
Assuming that's correct, I'll finish this up either latter tonight or tomorrow. But do let me know otherwise. And I'd also still like some help (just for future reference and my own learning) understanding the criteria you're using to determine implementation-centered names vs concept-centered names. I don't have a good mental picture for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also still concerned about tying the names to implementation details instead of higher level concepts,
I'd also still like some help (just for future reference and my own learning) understanding the criteria you're using to determine implementation-centered names vs concept-centered names
The distinction is that Entity and Entity::index are at different levels of abstraction, and abstractions have to end somewhere. From my perspective, Entity should be an opaque unique primary key, which people can use as such without worrying about whats going on inside. The internals are an implementation detail, and therefore benefit from being functionally descriptive. People can reach in to look at what is inside, but they are just "internals", and generally should not be depended in user code.
This is of course an art and not a science. We could decide to abstract over the internals of Entity, with the goal to make them stable. But I don't really see the point of doing that. The cost of functional clarity and additional layers of abstraction isn't worth the stability, at that level, as we already have a higher level abstraction providing that stability.
I still very much disagree here but am also ok with you're decision.
Cool lets move forward with the "index' terminology, and cut the "ecs as spreadsheets" section for now. I'm not fully against using the "ECS is like a spreadsheet" angle as a teaching tool, but I still strongly object to the Entity == Row angle, and I really don't want to block this PR on that conversation.
crates/bevy_ecs/src/world/mod.rs
Outdated
| } | ||
| // The caller wants the entity to be left despawned and in the allocator. In this case, we can just skip the despawning part. | ||
| Err(EntityMutableFetchError::NotSpawned(EntityNotSpawnedError::RowNotSpawned(_))) => { | ||
| self.allocator.free(entity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the entity is not spawned, that might be because it was already despawned elsewhere right? This would double free(), allowing it to be claimed more than once (bad!). Additionally, multiple calls to this would result in multiple free() calls. I think we should only pair free() directly with actual successful despawn operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments to make this more clear, but I'm pretty sure this is actually correct.
If the entity was previously despawned, its generation would be incremented, so the error variant would be Invalid, not NotSpawned. The only way this can happen is if despawn_no_free is was called, and then this is called to say "despawn it", as in, make it not spawned and freed. In this case, it's already not spawned, but it isn't freed, so we just free it. This will never double free from someone despawning the same entity twice.
The only way this could behave weird is if there is some nasty id aliasing, and something despawns 0vu32::MAX making it 0v0 and then something tries to despawn the original 0v0, which would trigger a double free. I don't think that's a big issue though because id aliasing is really rare, already can trigger lots of bugs, and is heavily documented with ways to avoid it altogether.
So if you want to change the behavior so that if despawn_no_free is called, the id can only be manually freed, I'm happy to do that. But I don't think this implementation is wrong, unless you want to change what it does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if you want to change the behavior so that if despawn_no_free is called, the id can only be manually freed, I'm happy to do that
Yeah I think that is the move / that is the correct mental model. despawn is both a "despawn" operation and a "free" operation. The caller is asking to do both. We cannot despawn, so that should result in an error, just like it does in the other case.
# Objective This is a tiny clean up to #19451 that removes some now completely unneeded public constants. I mean, they aren't wrong, but there's no point to their existence. ## Solution Removed `ArchetypeRow::INVALID` and `ArchetypeId::INVALID` and extend migration guide.
Objective
This is the next step for #19430 and is also convinient for #18670.
For context, the way entities work on main is as a "allocate and use" system. Entity ids are allocated, and given a location. The location can then be changed, etc. Entities that are free have an invalid location. To allocate an entity, one must also set its location. This introduced the need for pending entities, where an entity would be reserved, pending, and at some point flushed. Pending and free entities have an invalid location, and others are assumed to have a valid one.
This paradigm has a number of downsides: First, the entities metadata table is inseparable from the allocator, which makes remote reservation challenging. Second, the
Worldmust be flushed, even to do simple things, like allocate a temporary entity id. Third, users have little control over entity ids, only interacting with conceptual entities. This made things likeEntities::alloc_atclunky and slow, leading to its removal, despite some users still having valid need of it.So the goal of this PR is to:
Entitiesfrom entity allocation to make room for other allocators and resolvealloc_atissues.reserveandflushpatterns toallocandconstructpatterns.It is possible to break this up into multiple prs, as I originally intended, but doing so would require lots of temporary scaffolding that would both hurt performance and make things harder to review.
Solution
This solution builds on #19433, which changed the representation of invalid entity locations from a constant to
None.There's quite a few steps to this, each somewhat controversial:
Entities with no location
This pr introduces the idea of entity rows both with and without locations. This corresponds to entities that are constructed (the row has a location) and not constructed (the row has no location). When a row is free or pending, it is not constructed. When a row is outside the range of the meta list, it still exists; it's just not constructed.
This extends to conceptual entities; conceptual entities may now be in one of 3 states: empty (constructed; no components), normal (constructed; 1 or more components), or null (not constructed). This extends to entity pointers (
EntityWorldMut, etc): These now can point to "null"/not constructed entities. Depending on the privilege of the pointer, these can also construct or destruct the entity.This also changes how
Entityids relate to conceptual entities. AnEntitynow exists if its generation matches that of its row. AnEntitythat has the right generation for its row will claim to exist, even if it is not constructed. This means, for example, anEntitymanually constructed with a large index and generation of 0 will exist if it has not been allocated yet.Entitiesis separate from the allocatorThis pr separates entity allocation from
Entities.Entitiesis now only focused on tracking entity metadata, etc. The newEntitiesAllocatoronWorldmanages all allocations. This forcesEntitiesto not rely on allocator state to determine if entities exist, etc, which is convinient for remote reservation and needed for custom allocators. It also paves the way for allocators not housed within theWorld, makes some unsafe code easier since the allocator and metadata live under different pointers, etc.This separation requires thinking about interactions with
Entitiesin a new way. Previously, theEntitiesset the rules for what entities are valid and what entities are not. Now, it has no way of knowing. Instead, interaction withEntitiesare more like declaring some information for it to track than changing some information it was already tracking. To reflect this,sethas been split up intodeclareandupdate.Constructing and destructing
As mentioned, entities that have no location (not constructed) can be constructed at any time. This takes on exactly the same meaning as the previous
spawn_non_existent. It creates/declares a location instead of updating an old one. As an example, this makes spawning an entity now literately just allocate a new id and construct it immediately.Conversely, entities that are constructed may be destructed. This removes all components and despawns related entities, just like
despawn. The only difference is that destructing does not free the entity id for reuse. Between constructing and destructing, all needs foralloc_atare resolved. If you want to keep the id for custom reuse, just destruct instead of despawn! Despawn, now just destructs the entity and frees it.Destructing a not constructed entity will do nothing. Constructing an already constructed entity will panic. This is to guard against users constructing a manually formed
Entitythat the allocator could later hand out. However, public construction methods have proper error handling for this. Despawning a not constructed entity just frees its id.No more flushing
All places that once needed to reserve and flush entity ids now allocate and construct them instead. This improves performance and simplifies things.
Flow chart
(Thanks @ItsDoot)
Testing
Showcase
Here's an example of constructing and destructing
Future Work
Entitydoesn't always correspond to a conceptual entity.EntityWorldMut. There is (and was) a lot of assuming the entity is constructed there (was assuming it was not despawned).Performance
Benchmarks
This roughly doubles command spawning speed! Despawning also sees a 20-30% improvement. Dummy commands improve by 10-50% (due to not needing an entity flush). Other benchmarks seem to be noise and are negligible. It looks to me like a massive performance win!