From 3721bb1c71ee59a1a526eeaf32a65e1515b24806 Mon Sep 17 00:00:00 2001
From: Robert Swain
Date: Wed, 27 Sep 2023 10:28:28 +0200
Subject: [PATCH] Use EntityHashMap for render world entity storage for better performance (#9903)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Objective

- Improve rendering performance, particularly by avoiding the large system command costs of using the ECS in the way that the render world does.

## Solution

- Define `EntityHasher` that calculates a hash from the `Entity.to_bits()` by `i | (i.wrapping_mul(0x517cc1b727220a95) << 32)`. `0x517cc1b727220a95` is approximately `u64::MAX / π`, a constant that works well for hashing. Thanks to @SkiFire13 for the suggestion and to @nicopap for alternative suggestions and discussion. This approach comes from `rustc-hash` (a.k.a. `FxHasher`) with some tweaks for the case of hashing an `Entity`. `FxHasher` and `SeaHasher` were also tested but were significantly slower.
- Define `EntityHashMap` type that uses the `EntityHasher`
- Use `EntityHashMap` for render world entity storage, including:
  - `RenderMaterialInstances` - contains the `AssetId` of the material associated with the entity. Also for 2D.
  - `RenderMeshInstances` - contains mesh transforms, flags and properties about mesh entities. Also for 2D.
  - `SkinIndices` and `MorphIndices` - contain the skin and morph index for an entity, respectively
  - `ExtractedSprites`
  - `ExtractedUiNodes`

## Benchmarks

All benchmarks have been conducted on an M1 Max connected to AC power. The tests are run for 1500 frames. The 1000th frame is captured for comparison to check for visual regressions. There were none.
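
For reference, the hashing scheme described under Solution can be sketched roughly as follows. This is a minimal illustration rather than the code added by this patch: it uses `std::collections::HashMap` instead of whichever map type the patch builds on, and the `to_bits` helper, the `entity_hash` function and the `MULTIPLIER` constant name are hypothetical. The helper assumes that `Entity::to_bits()` packs the generation into the upper 32 bits and the index into the lower 32 bits.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Multiplier quoted in the Solution section (approximately `u64::MAX / π`).
const MULTIPLIER: u64 = 0x517c_c1b7_2722_0a95;

/// Hypothetical stand-in for `Entity::to_bits()`.
/// Assumption: generation in the upper 32 bits, index in the lower 32 bits.
pub fn to_bits(generation: u32, index: u32) -> u64 {
    ((generation as u64) << 32) | index as u64
}

/// The formula from the Solution section. The lower 32 bits of the input
/// pass through unchanged; the upper 32 bits are OR-ed with the low half of
/// the wrapping product.
pub fn entity_hash(i: u64) -> u64 {
    i | (i.wrapping_mul(MULTIPLIER) << 32)
}

/// A hasher that only accepts a single `u64`.
#[derive(Default)]
pub struct EntityHasher {
    hash: u64,
}

impl Hasher for EntityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        // Arbitrary byte slices are deliberately unsupported in this sketch.
        panic!("EntityHasher can only hash u64 values");
    }

    fn write_u64(&mut self, i: u64) {
        self.hash = entity_hash(i);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// A hash map keyed through `EntityHasher` (std `HashMap` used for brevity).
pub type EntityHashMap<K, V> = HashMap<K, V, BuildHasherDefault<EntityHasher>>;
```

With this sketch, a map keyed by the `u64` bits can be created with `EntityHashMap::<u64, V>::default()`. A key type whose `Hash` impl forwards to `u64::hash`, as the `Entity` impl in this patch does via `self.to_bits().hash(state)`, ends up in `write_u64` and never hits the panicking `write` path.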

### 2D Meshes

`bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d`

#### `--ordered-z`

This test spawns the 2D meshes with z incrementing back to front, which is the ideal allocation order: it matches the sorted render order, so lookups have a high cache hit rate.

-39.1% median frame time.

#### Random

This test spawns the 2D meshes with random z. This not only makes the batching and transparent 2D pass lookups get a lot of cache misses, it also currently means that the meshes are almost certain to not be batchable.

-7.2% median frame time.

### 3D Meshes

`many_cubes --benchmark`

-7.7% median frame time.

### Sprites

**NOTE: On `main` sprites are using `SparseSet`!**

`bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite`

#### `--ordered-z`

This test spawns the sprites with z incrementing back to front, which is the ideal allocation order: it matches the sorted render order, so lookups have a high cache hit rate.

+13.0% median frame time.

#### Random

This test spawns the sprites with random z. This makes the batching and transparent 2D pass lookups get a lot of cache misses.

+0.6% median frame time.

### UI

**NOTE: On `main` UI is using `SparseSet`!**

`many_buttons`

+15.1% median frame time.

## Alternatives

- Cart originally suggested trying out `SparseSet` and indeed that is slightly faster under ideal conditions. However, `PassHashMap` has better worst case performance when data is randomly distributed, rather than in sorted render order, and does not have the worst case memory usage of `SparseSet`'s dense `Vec`, which maps from the `Entity` index to the sparse index into `Vec`. This dense `Vec` has to be as large as the largest Entity index used with the `SparseSet`.
- I also tested `PassHashMap`, intending to use `Entity.index()` as the key, but this proved to sometimes be slower and mostly no different.
- The only outstanding approach that has not been implemented and tested is to _not_ clear the render world of its entities each frame. That has its own problems, though they could perhaps be solved.
  - Performance-wise, if the entities and their component data were not cleared, then they would incur table moves on spawn but should not thereafter; only their component data would be overwritten. Ideally we would have a neat way of either updating data in-place via `&mut T` queries, or inserting components if not present. This would likely be quite cumbersome to have to remember to do everywhere, but perhaps it only needs to be done in the more performance-sensitive systems.
  - The main problem to solve, however, is that we want to maintain a mapping between main world entities and render world entities, run the render app and world in parallel with the main app and world for pipelined rendering, and at the same time spawn entities in the render world in such a way that those Entity ids do not collide with those spawned in the main world. This is potentially quite solvable, but could well be a lot of ECS work to do it in a way that makes sense.

---

## Changelog

- Changed: Component data for entities to be drawn is no longer stored on entities in the render world. Instead, data is stored in an `EntityHashMap` in various resources. This brings significant performance benefits due to the way the render app clears entities every frame. Resources of most interest are `RenderMeshInstances` and `RenderMaterialInstances`, and their 2D counterparts.

## Migration Guide

Previously the render app extracted mesh entities and their component data from the main world and stored them as entities and components in the render world.
Now they are extracted into essentially `EntityHashMap<Entity, T>` where `T` are structs containing an appropriate group of data. This means that while extract set systems will continue to run extract queries against the main world, they will store their data in hash maps. Also, systems in later sets will either need to look up entities in the available resources such as `RenderMeshInstances`, or maintain their own `EntityHashMap<Entity, T>` for their own data.

Before:

```rust
// NOTE: `CustomFilter` is a placeholder; the original filter type parameter
// is not recoverable from this text.
fn queue_custom(
    material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<CustomFilter>>,
) {
    // ...
    for (entity, mesh_transforms, mesh_handle) in &material_meshes {
        // ...
    }
}
```

After:

```rust
// NOTE: `CustomFilter` is a placeholder; the original filter type parameter
// is not recoverable from this text.
fn queue_custom(
    render_mesh_instances: Res<RenderMeshInstances>,
    instance_entities: Query<Entity, With<CustomFilter>>,
) {
    // ...
    for entity in &instance_entities {
        let Some(mesh_instance) = render_mesh_instances.get(&entity) else {
            continue;
        };
        // The mesh handle in `AssetId<Mesh>` form, and the `MeshTransforms` can now
        // be found in `mesh_instance` which is a `RenderMeshInstance`
        // ...
    }
}
```

---------

Co-authored-by: robtfm <50659922+robtfm@users.noreply.github.com>
---
 crates/bevy_ecs/src/entity/mod.rs | 11 +-
 crates/bevy_pbr/src/material.rs | 95 +++++-----
 crates/bevy_pbr/src/prepass/mod.rs | 25 +--
 crates/bevy_pbr/src/render/light.rs | 21 ++-
 crates/bevy_pbr/src/render/mesh.rs | 220 +++++++++++++++-------
 crates/bevy_pbr/src/render/morph.rs | 31 ++-
 crates/bevy_pbr/src/render/skin.rs | 30 ++-
 crates/bevy_pbr/src/wireframe.rs | 65 ++++---
 crates/bevy_render/src/batching/mod.rs | 30 +--
 crates/bevy_sprite/src/mesh2d/material.rs | 94 ++++-----
 crates/bevy_sprite/src/mesh2d/mesh.rs | 154 +++++++++------
 crates/bevy_sprite/src/render/mod.rs | 7 +-
 crates/bevy_ui/src/render/mod.rs | 7 +-
 crates/bevy_utils/src/lib.rs | 52 +++++
 examples/shader/shader_instancing.rs | 59 +++---
 examples/stress_tests/bevymark.rs | 3 +-
 examples/stress_tests/many_cubes.rs | 4 +-
 17 files changed, 584 insertions(+), 324 deletions(-)

diff --git a/crates/bevy_ecs/src/entity/mod.rs
b/crates/bevy_ecs/src/entity/mod.rs index 3a49c99aedcf6..677477680f10c 100644 --- a/crates/bevy_ecs/src/entity/mod.rs +++ b/crates/bevy_ecs/src/entity/mod.rs @@ -44,7 +44,7 @@ use crate::{ storage::{SparseSetIndex, TableId, TableRow}, }; use serde::{Deserialize, Serialize}; -use std::{convert::TryFrom, fmt, mem, sync::atomic::Ordering}; +use std::{convert::TryFrom, fmt, hash::Hash, mem, sync::atomic::Ordering}; #[cfg(target_has_atomic = "64")] use std::sync::atomic::AtomicI64 as AtomicIdCursor; @@ -115,12 +115,19 @@ type IdCursor = isize; /// [`EntityCommands`]: crate::system::EntityCommands /// [`Query::get`]: crate::system::Query::get /// [`World`]: crate::world::World -#[derive(Clone, Copy, Eq, Hash, Ord, PartialEq, PartialOrd)] +#[derive(Clone, Copy, Eq, Ord, PartialEq, PartialOrd)] pub struct Entity { generation: u32, index: u32, } +impl Hash for Entity { + #[inline] + fn hash(&self, state: &mut H) { + self.to_bits().hash(state); + } +} + pub(crate) enum AllocAtWithoutReplacement { Exists(EntityLocation), DidNotExist, diff --git a/crates/bevy_pbr/src/material.rs b/crates/bevy_pbr/src/material.rs index d9c835abcaffa..c3287af0d9cbf 100644 --- a/crates/bevy_pbr/src/material.rs +++ b/crates/bevy_pbr/src/material.rs @@ -1,6 +1,6 @@ use crate::{ render, AlphaMode, DrawMesh, DrawPrepass, EnvironmentMapLight, MeshPipeline, MeshPipelineKey, - MeshTransforms, PrepassPipelinePlugin, PrepassPlugin, ScreenSpaceAmbientOcclusionSettings, + PrepassPipelinePlugin, PrepassPlugin, RenderMeshInstances, ScreenSpaceAmbientOcclusionSettings, SetMeshBindGroup, SetMeshViewBindGroup, Shadow, }; use bevy_app::{App, Plugin}; @@ -14,10 +14,7 @@ use bevy_core_pipeline::{ use bevy_derive::{Deref, DerefMut}; use bevy_ecs::{ prelude::*, - system::{ - lifetimeless::{Read, SRes}, - SystemParamItem, - }, + system::{lifetimeless::SRes, SystemParamItem}, }; use bevy_render::{ mesh::{Mesh, MeshVertexBufferLayout}, @@ -37,7 +34,7 @@ use bevy_render::{ view::{ExtractedView, Msaa, ViewVisibility, 
VisibleEntities}, Extract, ExtractSchedule, Render, RenderApp, RenderSet, }; -use bevy_utils::{tracing::error, HashMap, HashSet}; +use bevy_utils::{tracing::error, EntityHashMap, HashMap, HashSet}; use std::hash::Hash; use std::marker::PhantomData; @@ -190,6 +187,7 @@ where .add_render_command::>() .init_resource::>() .init_resource::>() + .init_resource::>() .init_resource::>>() .add_systems( ExtractSchedule, @@ -226,26 +224,6 @@ where } } -fn extract_material_meshes( - mut commands: Commands, - mut previous_len: Local, - query: Extract)>>, -) { - let mut values = Vec::with_capacity(*previous_len); - for (entity, view_visibility, material) in &query { - if view_visibility.get() { - // NOTE: MaterialBindGroupId is inserted here to avoid a table move. Upcoming changes - // to use SparseSet for render world entity storage will do this automatically. - values.push(( - entity, - (material.clone_weak(), MaterialBindGroupId::default()), - )); - } - } - *previous_len = values.len(); - commands.insert_or_spawn_batch(values); -} - /// A key uniquely identifying a specialized [`MaterialPipeline`]. pub struct MaterialPipelineKey { pub mesh_key: MeshPipelineKey, @@ -368,24 +346,53 @@ type DrawMaterial = ( /// Sets the bind group for a given [`Material`] at the configured `I` index. pub struct SetMaterialBindGroup(PhantomData); impl RenderCommand

for SetMaterialBindGroup { - type Param = SRes>; + type Param = (SRes>, SRes>); type ViewWorldQuery = (); - type ItemWorldQuery = Read>; + type ItemWorldQuery = (); #[inline] fn render<'w>( - _item: &P, + item: &P, _view: (), - material_handle: &'_ Handle, - materials: SystemParamItem<'w, '_, Self::Param>, + _item_query: (), + (materials, material_instances): SystemParamItem<'w, '_, Self::Param>, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { - let material = materials.into_inner().get(&material_handle.id()).unwrap(); + let materials = materials.into_inner(); + let material_instances = material_instances.into_inner(); + + let Some(material_asset_id) = material_instances.get(&item.entity()) else { + return RenderCommandResult::Failure; + }; + let Some(material) = materials.get(material_asset_id) else { + return RenderCommandResult::Failure; + }; pass.set_bind_group(I, &material.bind_group, &[]); RenderCommandResult::Success } } +#[derive(Resource, Deref, DerefMut)] +pub struct RenderMaterialInstances(EntityHashMap>); + +impl Default for RenderMaterialInstances { + fn default() -> Self { + Self(Default::default()) + } +} + +fn extract_material_meshes( + mut material_instances: ResMut>, + query: Extract)>>, +) { + material_instances.clear(); + for (entity, view_visibility, handle) in &query { + if view_visibility.get() { + material_instances.insert(entity, handle.id()); + } + } +} + const fn alpha_mode_pipeline_key(alpha_mode: AlphaMode) -> MeshPipelineKey { match alpha_mode { // Premultiplied and Add share the same pipeline key @@ -424,12 +431,8 @@ pub fn queue_material_meshes( msaa: Res, render_meshes: Res>, render_materials: Res>, - mut material_meshes: Query<( - &Handle, - &mut MaterialBindGroupId, - &Handle, - &MeshTransforms, - )>, + mut render_mesh_instances: ResMut, + render_material_instances: Res>, images: Res>, mut views: Query<( &ExtractedView, @@ -493,15 +496,16 @@ pub fn queue_material_meshes( } let rangefinder = view.rangefinder3d(); for 
visible_entity in &visible_entities.entities { - let Ok((material_handle, mut material_bind_group_id, mesh_handle, mesh_transforms)) = - material_meshes.get_mut(*visible_entity) - else { + let Some(material_asset_id) = render_material_instances.get(visible_entity) else { + continue; + }; + let Some(mesh_instance) = render_mesh_instances.get_mut(visible_entity) else { continue; }; - let Some(mesh) = render_meshes.get(mesh_handle) else { + let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else { continue; }; - let Some(material) = render_materials.get(&material_handle.id()) else { + let Some(material) = render_materials.get(material_asset_id) else { continue; }; let mut mesh_key = view_key; @@ -530,9 +534,10 @@ pub fn queue_material_meshes( } }; - *material_bind_group_id = material.get_bind_group_id(); + mesh_instance.material_bind_group_id = material.get_bind_group_id(); - let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation) + let distance = rangefinder + .distance_translation(&mesh_instance.transforms.transform.translation) + material.properties.depth_bias; match material.properties.alpha_mode { AlphaMode::Opaque => { diff --git a/crates/bevy_pbr/src/prepass/mod.rs b/crates/bevy_pbr/src/prepass/mod.rs index 9169d1083fde1..372757f935148 100644 --- a/crates/bevy_pbr/src/prepass/mod.rs +++ b/crates/bevy_pbr/src/prepass/mod.rs @@ -47,7 +47,8 @@ use bevy_utils::tracing::error; use crate::{ prepare_materials, setup_morph_and_skinning_defs, AlphaMode, DrawMesh, Material, MaterialPipeline, MaterialPipelineKey, MeshLayouts, MeshPipeline, MeshPipelineKey, - MeshTransforms, RenderMaterials, SetMaterialBindGroup, SetMeshBindGroup, + RenderMaterialInstances, RenderMaterials, RenderMeshInstances, SetMaterialBindGroup, + SetMeshBindGroup, }; use std::{hash::Hash, marker::PhantomData}; @@ -758,8 +759,9 @@ pub fn queue_prepass_material_meshes( pipeline_cache: Res, msaa: Res, render_meshes: Res>, + render_mesh_instances: Res, 
render_materials: Res>, - material_meshes: Query<(&Handle, &Handle, &MeshTransforms)>, + render_material_instances: Res>, mut views: Query<( &ExtractedView, &VisibleEntities, @@ -804,16 +806,16 @@ pub fn queue_prepass_material_meshes( let rangefinder = view.rangefinder3d(); for visible_entity in &visible_entities.entities { - let Ok((material_handle, mesh_handle, mesh_transforms)) = - material_meshes.get(*visible_entity) - else { + let Some(material_asset_id) = render_material_instances.get(visible_entity) else { continue; }; - - let (Some(material), Some(mesh)) = ( - render_materials.get(&material_handle.id()), - render_meshes.get(mesh_handle), - ) else { + let Some(mesh_instance) = render_mesh_instances.get(visible_entity) else { + continue; + }; + let Some(material) = render_materials.get(material_asset_id) else { + continue; + }; + let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else { continue; }; @@ -849,7 +851,8 @@ pub fn queue_prepass_material_meshes( } }; - let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation) + let distance = rangefinder + .distance_translation(&mesh_instance.transforms.transform.translation) + material.properties.depth_bias; match alpha_mode { AlphaMode::Opaque => { diff --git a/crates/bevy_pbr/src/render/light.rs b/crates/bevy_pbr/src/render/light.rs index c476afc76e1f6..feec375ddc3d3 100644 --- a/crates/bevy_pbr/src/render/light.rs +++ b/crates/bevy_pbr/src/render/light.rs @@ -3,10 +3,9 @@ use crate::{ CascadeShadowConfig, Cascades, CascadesVisibleEntities, Clusters, CubemapVisibleEntities, DirectionalLight, DirectionalLightShadowMap, DrawPrepass, EnvironmentMapLight, GlobalVisiblePointLights, Material, MaterialPipelineKey, MeshPipeline, MeshPipelineKey, - NotShadowCaster, PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterials, SpotLight, - VisiblePointLights, + PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterialInstances, RenderMaterials, + RenderMeshInstances, 
SpotLight, VisiblePointLights, }; -use bevy_asset::Handle; use bevy_core_pipeline::core_3d::Transparent3d; use bevy_ecs::prelude::*; use bevy_math::{Mat4, UVec3, UVec4, Vec2, Vec3, Vec3Swizzles, Vec4, Vec4Swizzles}; @@ -1553,9 +1552,10 @@ pub fn prepare_clusters( pub fn queue_shadows( shadow_draw_functions: Res>, prepass_pipeline: Res>, - casting_meshes: Query<(&Handle, &Handle), Without>, render_meshes: Res>, + render_mesh_instances: Res, render_materials: Res>, + render_material_instances: Res>, mut pipelines: ResMut>>, pipeline_cache: Res, view_lights: Query<(Entity, &ViewLightEntities)>, @@ -1598,15 +1598,22 @@ pub fn queue_shadows( // NOTE: Lights with shadow mapping disabled will have no visible entities // so no meshes will be queued for entity in visible_entities.iter().copied() { - let Ok((mesh_handle, material_handle)) = casting_meshes.get(entity) else { + let Some(mesh_instance) = render_mesh_instances.get(&entity) else { continue; }; - let Some(mesh) = render_meshes.get(mesh_handle) else { + if !mesh_instance.shadow_caster { + continue; + } + let Some(material_asset_id) = render_material_instances.get(&entity) else { continue; }; - let Some(material) = render_materials.get(&material_handle.id()) else { + let Some(material) = render_materials.get(material_asset_id) else { continue; }; + let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else { + continue; + }; + let mut mesh_key = MeshPipelineKey::from_primitive_topology(mesh.primitive_topology) | MeshPipelineKey::DEPTH_PREPASS; diff --git a/crates/bevy_pbr/src/render/mesh.rs b/crates/bevy_pbr/src/render/mesh.rs index 992ae4cf3fa7d..583475bf9a2e2 100644 --- a/crates/bevy_pbr/src/render/mesh.rs +++ b/crates/bevy_pbr/src/render/mesh.rs @@ -5,7 +5,7 @@ use crate::{ ViewClusterBindings, ViewFogUniformOffset, ViewLightsUniformOffset, ViewShadowBindings, CLUSTERED_FORWARD_STORAGE_BUFFER_COUNT, MAX_CASCADES_PER_LIGHT, MAX_DIRECTIONAL_LIGHTS, }; -use bevy_app::Plugin; +use bevy_app::{Plugin, 
PostUpdate}; use bevy_asset::{load_internal_asset, AssetId, Handle}; use bevy_core_pipeline::{ core_3d::{AlphaMask3d, Opaque3d, Transparent3d}, @@ -14,6 +14,7 @@ use bevy_core_pipeline::{ get_lut_bind_group_layout_entries, get_lut_bindings, Tonemapping, TonemappingLuts, }, }; +use bevy_derive::{Deref, DerefMut}; use bevy_ecs::{ prelude::*, query::{QueryItem, ROQueryItem}, @@ -21,7 +22,10 @@ use bevy_ecs::{ }; use bevy_math::{Affine3, Vec2, Vec4}; use bevy_render::{ - batching::{batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData}, + batching::{ + batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData, + NoAutomaticBatching, + }, globals::{GlobalsBuffer, GlobalsUniform}, mesh::{ GpuBufferInfo, InnerMeshVertexBufferLayout, Mesh, MeshVertexBufferLayout, @@ -40,14 +44,18 @@ use bevy_render::{ Extract, ExtractSchedule, Render, RenderApp, RenderSet, }; use bevy_transform::components::GlobalTransform; -use bevy_utils::{tracing::error, HashMap, Hashed}; +use bevy_utils::{tracing::error, EntityHashMap, HashMap, Hashed}; use crate::render::{ - morph::{extract_morphs, prepare_morphs, MorphIndex, MorphUniform}, - skin::{extract_skins, prepare_skins, SkinIndex, SkinUniform}, + morph::{ + extract_morphs, no_automatic_morph_batching, prepare_morphs, MorphIndices, MorphUniform, + }, + skin::{extract_skins, no_automatic_skin_batching, prepare_skins, SkinUniform}, MeshLayouts, }; +use super::skin::SkinIndices; + #[derive(Default)] pub struct MeshRenderPlugin; @@ -102,11 +110,19 @@ impl Plugin for MeshRenderPlugin { load_internal_asset!(app, SKINNING_HANDLE, "skinning.wgsl", Shader::from_wgsl); load_internal_asset!(app, MORPH_HANDLE, "morph.wgsl", Shader::from_wgsl); + app.add_systems( + PostUpdate, + (no_automatic_skin_batching, no_automatic_morph_batching), + ); + if let Ok(render_app) = app.get_sub_app_mut(RenderApp) { render_app + .init_resource::() .init_resource::() .init_resource::() + .init_resource::() .init_resource::() + 
.init_resource::() .add_systems( ExtractSchedule, (extract_meshes, extract_skins, extract_morphs), @@ -212,10 +228,24 @@ bitflags::bitflags! { } } +pub struct RenderMeshInstance { + pub transforms: MeshTransforms, + pub mesh_asset_id: AssetId, + pub material_bind_group_id: MaterialBindGroupId, + pub shadow_caster: bool, + pub automatic_batching: bool, +} + +#[derive(Default, Resource, Deref, DerefMut)] +pub struct RenderMeshInstances(EntityHashMap); + +#[derive(Component)] +pub struct Mesh3d; + pub fn extract_meshes( mut commands: Commands, - mut prev_caster_commands_len: Local, - mut prev_not_caster_commands_len: Local, + mut previous_len: Local, + mut render_mesh_instances: ResMut, meshes_query: Extract< Query<( Entity, @@ -225,15 +255,25 @@ pub fn extract_meshes( &Handle, Option>, Option>, + Has, )>, >, ) { - let mut caster_commands = Vec::with_capacity(*prev_caster_commands_len); - let mut not_caster_commands = Vec::with_capacity(*prev_not_caster_commands_len); + render_mesh_instances.clear(); + let mut entities = Vec::with_capacity(*previous_len); + let visible_meshes = meshes_query.iter().filter(|(_, vis, ..)| vis.get()); - for (entity, _, transform, previous_transform, handle, not_receiver, not_caster) in - visible_meshes + for ( + entity, + _, + transform, + previous_transform, + handle, + not_receiver, + not_caster, + no_automatic_batching, + ) in visible_meshes { let transform = transform.affine(); let previous_transform = previous_transform.map(|t| t.0).unwrap_or(transform); @@ -250,16 +290,22 @@ pub fn extract_meshes( previous_transform: (&previous_transform).into(), flags: flags.bits(), }; - if not_caster.is_some() { - not_caster_commands.push((entity, (handle.clone_weak(), transforms, NotShadowCaster))); - } else { - caster_commands.push((entity, (handle.clone_weak(), transforms))); - } + // FIXME: Remove this - it is just a workaround to enable rendering to work as + // render commands require an entity to exist at the moment. 
+ entities.push((entity, Mesh3d)); + render_mesh_instances.insert( + entity, + RenderMeshInstance { + mesh_asset_id: handle.id(), + transforms, + shadow_caster: not_caster.is_none(), + material_bind_group_id: MaterialBindGroupId::default(), + automatic_batching: !no_automatic_batching, + }, + ); } - *prev_caster_commands_len = caster_commands.len(); - *prev_not_caster_commands_len = not_caster_commands.len(); - commands.insert_or_spawn_batch(caster_commands); - commands.insert_or_spawn_batch(not_caster_commands); + *previous_len = entities.len(); + commands.insert_or_spawn_batch(entities); } #[derive(Resource, Clone)] @@ -545,22 +591,26 @@ impl MeshPipeline { } impl GetBatchData for MeshPipeline { - type Query = ( - Option<&'static MaterialBindGroupId>, - &'static Handle, - &'static MeshTransforms, - ); - type CompareData = (Option, AssetId); + type Param = SRes; + type Query = Entity; + type QueryFilter = With; + type CompareData = (MaterialBindGroupId, AssetId); type BufferData = MeshUniform; - fn get_buffer_data(&(.., mesh_transforms): &QueryItem) -> Self::BufferData { - mesh_transforms.into() - } - - fn get_compare_data( - &(material_bind_group_id, mesh_handle, ..): &QueryItem, - ) -> Self::CompareData { - (material_bind_group_id.copied(), mesh_handle.id()) + fn get_batch_data( + mesh_instances: &SystemParamItem, + entity: &QueryItem, + ) -> (Self::BufferData, Option) { + let mesh_instance = mesh_instances + .get(entity) + .expect("Failed to find render mesh instance"); + ( + (&mesh_instance.transforms).into(), + mesh_instance.automatic_batching.then_some(( + mesh_instance.material_bind_group_id, + mesh_instance.mesh_asset_id, + )), + ) } } @@ -932,12 +982,12 @@ impl MeshBindGroups { /// Get the `BindGroup` for `GpuMesh` with given `handle_id`. 
pub fn get( &self, - handle_id: AssetId, + asset_id: AssetId, is_skinned: bool, morph: bool, ) -> Option<&BindGroup> { match (is_skinned, morph) { - (_, true) => self.morph_targets.get(&handle_id), + (_, true) => self.morph_targets.get(&asset_id), (true, false) => self.skinned.as_ref(), (false, false) => self.model_only.as_ref(), } @@ -1176,27 +1226,44 @@ impl RenderCommand

for SetMeshViewBindGroup pub struct SetMeshBindGroup; impl RenderCommand

for SetMeshBindGroup { - type Param = SRes; - type ViewWorldQuery = (); - type ItemWorldQuery = ( - Read>, - Option>, - Option>, + type Param = ( + SRes, + SRes, + SRes, + SRes, ); + type ViewWorldQuery = (); + type ItemWorldQuery = (); #[inline] fn render<'w>( item: &P, _view: (), - (mesh, skin_index, morph_index): ROQueryItem, - bind_groups: SystemParamItem<'w, '_, Self::Param>, + _item_query: (), + (bind_groups, mesh_instances, skin_indices, morph_indices): SystemParamItem< + 'w, + '_, + Self::Param, + >, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { let bind_groups = bind_groups.into_inner(); + let mesh_instances = mesh_instances.into_inner(); + let skin_indices = skin_indices.into_inner(); + let morph_indices = morph_indices.into_inner(); + + let entity = &item.entity(); + + let Some(mesh) = mesh_instances.get(entity) else { + return RenderCommandResult::Success; + }; + let skin_index = skin_indices.get(entity); + let morph_index = morph_indices.get(entity); + let is_skinned = skin_index.is_some(); let is_morphed = morph_index.is_some(); - let Some(bind_group) = bind_groups.get(mesh.id(), is_skinned, is_morphed) else { + let Some(bind_group) = bind_groups.get(mesh.mesh_asset_id, is_skinned, is_morphed) else { error!( "The MeshBindGroups resource wasn't set in the render phase. \ It should be set by the queue_mesh_bind_group system.\n\ @@ -1227,43 +1294,50 @@ impl RenderCommand

for SetMeshBindGroup { pub struct DrawMesh; impl RenderCommand

for DrawMesh { - type Param = SRes>; + type Param = (SRes>, SRes); type ViewWorldQuery = (); - type ItemWorldQuery = Read>; + type ItemWorldQuery = (); #[inline] fn render<'w>( item: &P, _view: (), - mesh_handle: ROQueryItem<'_, Self::ItemWorldQuery>, - meshes: SystemParamItem<'w, '_, Self::Param>, + _item_query: (), + (meshes, mesh_instances): SystemParamItem<'w, '_, Self::Param>, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { - if let Some(gpu_mesh) = meshes.into_inner().get(mesh_handle) { - let batch_range = item.batch_range(); - pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..)); - #[cfg(all(feature = "webgl", target_arch = "wasm32"))] - pass.set_push_constants( - ShaderStages::VERTEX, - 0, - &(batch_range.start as i32).to_le_bytes(), - ); - match &gpu_mesh.buffer_info { - GpuBufferInfo::Indexed { - buffer, - index_format, - count, - } => { - pass.set_index_buffer(buffer.slice(..), 0, *index_format); - pass.draw_indexed(0..*count, 0, batch_range.clone()); - } - GpuBufferInfo::NonIndexed => { - pass.draw(0..gpu_mesh.vertex_count, batch_range.clone()); - } + let meshes = meshes.into_inner(); + let mesh_instances = mesh_instances.into_inner(); + + let Some(mesh_instance) = mesh_instances.get(&item.entity()) else { + return RenderCommandResult::Failure; + }; + let Some(gpu_mesh) = meshes.get(mesh_instance.mesh_asset_id) else { + return RenderCommandResult::Failure; + }; + + pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..)); + + let batch_range = item.batch_range(); + #[cfg(all(feature = "webgl", target_arch = "wasm32"))] + pass.set_push_constants( + ShaderStages::VERTEX, + 0, + &(batch_range.start as i32).to_le_bytes(), + ); + match &gpu_mesh.buffer_info { + GpuBufferInfo::Indexed { + buffer, + index_format, + count, + } => { + pass.set_index_buffer(buffer.slice(..), 0, *index_format); + pass.draw_indexed(0..*count, 0, batch_range.clone()); + } + GpuBufferInfo::NonIndexed => { + pass.draw(0..gpu_mesh.vertex_count, 
batch_range.clone()); } - RenderCommandResult::Success - } else { - RenderCommandResult::Failure } + RenderCommandResult::Success } } diff --git a/crates/bevy_pbr/src/render/morph.rs b/crates/bevy_pbr/src/render/morph.rs index b39064c7f34ba..61dfef75d5280 100644 --- a/crates/bevy_pbr/src/render/morph.rs +++ b/crates/bevy_pbr/src/render/morph.rs @@ -1,5 +1,6 @@ use std::{iter, mem}; +use bevy_derive::{Deref, DerefMut}; use bevy_ecs::prelude::*; use bevy_render::{ batching::NoAutomaticBatching, @@ -9,16 +10,22 @@ use bevy_render::{ view::ViewVisibility, Extract, }; +use bevy_utils::EntityHashMap; use bytemuck::Pod; #[derive(Component)] pub struct MorphIndex { pub(super) index: u32, } + +#[derive(Default, Resource, Deref, DerefMut)] +pub struct MorphIndices(EntityHashMap); + #[derive(Resource)] pub struct MorphUniform { pub buffer: BufferVec, } + impl Default for MorphUniform { fn default() -> Self { Self { @@ -43,6 +50,7 @@ pub fn prepare_morphs( const fn can_align(step: usize, target: usize) -> bool { step % target == 0 || target % step == 0 } + const WGPU_MIN_ALIGN: usize = 256; /// Align a [`BufferVec`] to `N` bytes by padding the end with `T::default()` values. @@ -72,15 +80,13 @@ fn add_to_alignment(buffer: &mut BufferVec) { // Notes on implementation: see comment on top of the extract_skins system in skin module. // This works similarly, but for `f32` instead of `Mat4` pub fn extract_morphs( - mut commands: Commands, - mut previous_len: Local, + mut morph_indices: ResMut, mut uniform: ResMut, query: Extract>, ) { + morph_indices.clear(); uniform.buffer.clear(); - let mut values = Vec::with_capacity(*previous_len); - for (entity, view_visibility, morph_weights) in &query { if !view_visibility.get() { continue; @@ -92,10 +98,17 @@ pub fn extract_morphs( add_to_alignment::(&mut uniform.buffer); let index = (start * mem::size_of::()) as u32; - // NOTE: Because morph targets require per-morph target texture bindings, they cannot - // currently be batched. 
- values.push((entity, (MorphIndex { index }, NoAutomaticBatching))); + morph_indices.insert(entity, MorphIndex { index }); + } +} + +// NOTE: Because morph targets require per-morph target texture bindings, they cannot +// currently be batched. +pub fn no_automatic_morph_batching( + mut commands: Commands, + query: Query, Without)>, +) { + for entity in &query { + commands.entity(entity).insert(NoAutomaticBatching); } - *previous_len = values.len(); - commands.insert_or_spawn_batch(values); } diff --git a/crates/bevy_pbr/src/render/skin.rs b/crates/bevy_pbr/src/render/skin.rs index 871f504d3ebe2..bfb12fd794427 100644 --- a/crates/bevy_pbr/src/render/skin.rs +++ b/crates/bevy_pbr/src/render/skin.rs @@ -1,4 +1,5 @@ use bevy_asset::Assets; +use bevy_derive::{Deref, DerefMut}; use bevy_ecs::prelude::*; use bevy_math::Mat4; use bevy_render::{ @@ -10,6 +11,7 @@ use bevy_render::{ Extract, }; use bevy_transform::prelude::GlobalTransform; +use bevy_utils::EntityHashMap; /// Maximum number of joints supported for skinned meshes. pub const MAX_JOINTS: usize = 256; @@ -18,6 +20,7 @@ pub const MAX_JOINTS: usize = 256; pub struct SkinIndex { pub index: u32, } + impl SkinIndex { /// Index to be in address space based on [`SkinUniform`] size. const fn new(start: usize) -> Self { @@ -27,11 +30,15 @@ impl SkinIndex { } } +#[derive(Default, Resource, Deref, DerefMut)] +pub struct SkinIndices(EntityHashMap); + // Notes on implementation: see comment on top of the `extract_skins` system. #[derive(Resource)] pub struct SkinUniform { pub buffer: BufferVec, } + impl Default for SkinUniform { fn default() -> Self { Self { @@ -81,16 +88,14 @@ pub fn prepare_skins( // which normally only support fixed size arrays. You just have to make sure // in the shader that you only read the values that are valid for that binding. 
pub fn extract_skins( - mut commands: Commands, - mut previous_len: Local, + mut skin_indices: ResMut, mut uniform: ResMut, query: Extract>, inverse_bindposes: Extract>>, joints: Extract>, ) { uniform.buffer.clear(); - - let mut values = Vec::with_capacity(*previous_len); + skin_indices.clear(); let mut last_start = 0; // PERF: This can be expensive, can we move this to prepare? @@ -124,16 +129,23 @@ pub fn extract_skins( while buffer.len() % 4 != 0 { buffer.push(Mat4::ZERO); } - // NOTE: The skinned joints uniform buffer has to be bound at a dynamic offset per - // entity and so cannot currently be batched. - values.push((entity, (SkinIndex::new(start), NoAutomaticBatching))); + + skin_indices.insert(entity, SkinIndex::new(start)); } // Pad out the buffer to ensure that there's enough space for bindings while uniform.buffer.len() - last_start < MAX_JOINTS { uniform.buffer.push(Mat4::ZERO); } +} - *previous_len = values.len(); - commands.insert_or_spawn_batch(values); +// NOTE: The skinned joints uniform buffer has to be bound at a dynamic offset per +// entity and so cannot currently be batched. 
+pub fn no_automatic_skin_batching( + mut commands: Commands, + query: Query, Without)>, +) { + for entity in &query { + commands.entity(entity).insert(NoAutomaticBatching); + } } diff --git a/crates/bevy_pbr/src/wireframe.rs b/crates/bevy_pbr/src/wireframe.rs index b1be7a2ef5ccb..9566afffb9242 100644 --- a/crates/bevy_pbr/src/wireframe.rs +++ b/crates/bevy_pbr/src/wireframe.rs @@ -1,13 +1,15 @@ -use crate::{DrawMesh, MeshPipelineKey, SetMeshBindGroup, SetMeshViewBindGroup}; -use crate::{MeshPipeline, MeshTransforms}; +use crate::MeshPipeline; +use crate::{ + DrawMesh, MeshPipelineKey, RenderMeshInstance, RenderMeshInstances, SetMeshBindGroup, + SetMeshViewBindGroup, +}; use bevy_app::Plugin; use bevy_asset::{load_internal_asset, Handle}; use bevy_core_pipeline::core_3d::Opaque3d; +use bevy_derive::{Deref, DerefMut}; use bevy_ecs::{prelude::*, reflect::ReflectComponent}; use bevy_reflect::std_traits::ReflectDefault; use bevy_reflect::Reflect; -use bevy_render::extract_component::{ExtractComponent, ExtractComponentPlugin}; -use bevy_render::Render; use bevy_render::{ extract_resource::{ExtractResource, ExtractResourcePlugin}, mesh::{Mesh, MeshVertexBufferLayout}, @@ -20,7 +22,9 @@ use bevy_render::{ view::{ExtractedView, Msaa, VisibleEntities}, RenderApp, RenderSet, }; +use bevy_render::{Extract, ExtractSchedule, Render}; use bevy_utils::tracing::error; +use bevy_utils::EntityHashSet; pub const WIREFRAME_SHADER_HANDLE: Handle = Handle::weak_from_u128(192598014480025766); @@ -39,15 +43,14 @@ impl Plugin for WireframePlugin { app.register_type::() .register_type::() .init_resource::() - .add_plugins(( - ExtractResourcePlugin::::default(), - ExtractComponentPlugin::::default(), - )); + .add_plugins((ExtractResourcePlugin::::default(),)); if let Ok(render_app) = app.get_sub_app_mut(RenderApp) { render_app .add_render_command::() .init_resource::>() + .init_resource::() + .add_systems(ExtractSchedule, extract_wireframes) .add_systems(Render, 
queue_wireframes.in_set(RenderSet::QueueMeshes)); } } @@ -60,7 +63,7 @@ impl Plugin for WireframePlugin { } /// Controls whether an entity should rendered in wireframe-mode if the [`WireframePlugin`] is enabled -#[derive(Component, Debug, Clone, Default, ExtractComponent, Reflect)] +#[derive(Component, Debug, Clone, Default, Reflect)] #[reflect(Component, Default)] pub struct Wireframe; @@ -71,6 +74,17 @@ pub struct WireframeConfig { pub global: bool, } +#[derive(Resource, Default, Deref, DerefMut)] +pub struct Wireframes(EntityHashSet); + +fn extract_wireframes( + mut wireframes: ResMut, + query: Extract>>, +) { + wireframes.clear(); + wireframes.extend(&query); +} + #[derive(Resource, Clone)] pub struct WireframePipeline { mesh_pipeline: MeshPipeline, @@ -110,15 +124,13 @@ impl SpecializedMeshPipeline for WireframePipeline { fn queue_wireframes( opaque_3d_draw_functions: Res>, render_meshes: Res>, + render_mesh_instances: Res, + wireframes: Res, wireframe_config: Res, wireframe_pipeline: Res, mut pipelines: ResMut>, pipeline_cache: Res, msaa: Res, - mut material_meshes: ParamSet<( - Query<(Entity, &Handle, &MeshTransforms)>, - Query<(Entity, &Handle, &MeshTransforms), With>, - )>, mut views: Query<(&ExtractedView, &VisibleEntities, &mut RenderPhase)>, ) { let draw_custom = opaque_3d_draw_functions.read().id::(); @@ -127,10 +139,10 @@ fn queue_wireframes( let rangefinder = view.rangefinder3d(); let view_key = msaa_key | MeshPipelineKey::from_hdr(view.hdr); - let add_render_phase = |phase_item: (Entity, &Handle, &MeshTransforms)| { - let (entity, mesh_handle, mesh_transforms) = phase_item; + let add_render_phase = |phase_item: (Entity, &RenderMeshInstance)| { + let (entity, mesh_instance) = phase_item; - let Some(mesh) = render_meshes.get(mesh_handle) else { + let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else { return; }; let mut key = @@ -151,25 +163,36 @@ fn queue_wireframes( entity, pipeline: pipeline_id, draw_function: draw_custom, - 
distance: rangefinder.distance_translation(&mesh_transforms.transform.translation), + distance: rangefinder + .distance_translation(&mesh_instance.transforms.transform.translation), batch_range: 0..1, dynamic_offset: None, }); }; if wireframe_config.global { - let query = material_meshes.p0(); visible_entities .entities .iter() - .filter_map(|visible_entity| query.get(*visible_entity).ok()) + .filter_map(|visible_entity| { + render_mesh_instances + .get(visible_entity) + .map(|mesh_instance| (*visible_entity, mesh_instance)) + }) .for_each(add_render_phase); } else { - let query = material_meshes.p1(); visible_entities .entities .iter() - .filter_map(|visible_entity| query.get(*visible_entity).ok()) + .filter_map(|visible_entity| { + if wireframes.contains(visible_entity) { + render_mesh_instances + .get(visible_entity) + .map(|mesh_instance| (*visible_entity, mesh_instance)) + } else { + None + } + }) .for_each(add_render_phase); } } diff --git a/crates/bevy_render/src/batching/mod.rs b/crates/bevy_render/src/batching/mod.rs index 715402b2b4b16..16633b77b8a4f 100644 --- a/crates/bevy_render/src/batching/mod.rs +++ b/crates/bevy_render/src/batching/mod.rs @@ -1,8 +1,8 @@ use bevy_ecs::{ component::Component, prelude::Res, - query::{Has, QueryItem, ReadOnlyWorldQuery}, - system::{Query, ResMut}, + query::{QueryItem, ReadOnlyWorldQuery}, + system::{Query, ResMut, StaticSystemParam, SystemParam, SystemParamItem}, }; use bevy_utils::nonmax::NonMaxU32; @@ -56,7 +56,9 @@ impl BatchMeta { /// A trait to support getting data used for batching draw commands via phase /// items. pub trait GetBatchData { + type Param: SystemParam + 'static; type Query: ReadOnlyWorldQuery; + type QueryFilter: ReadOnlyWorldQuery; /// Data used for comparison between phase items. If the pipeline id, draw /// function id, per-instance data buffer dynamic offset and this data /// matches, the draws can be batched. 
@@ -65,10 +67,13 @@ pub trait GetBatchData { /// containing these data for all instances. type BufferData: GpuArrayBufferable + Sync + Send + 'static; /// Get the per-instance data to be inserted into the [`GpuArrayBuffer`]. - fn get_buffer_data(query_item: &QueryItem) -> Self::BufferData; - /// Get the data used for comparison when deciding whether draws can be - /// batched. - fn get_compare_data(query_item: &QueryItem) -> Self::CompareData; + /// If the instance can be batched, also return the data used for + /// comparison when deciding whether draws can be batched, else return None + /// for the `CompareData`. + fn get_batch_data( + param: &SystemParamItem, + query_item: &QueryItem, + ) -> (Self::BufferData, Option); } /// Batch the items in a render phase. This means comparing metadata needed to draw each phase item @@ -76,24 +81,23 @@ pub trait GetBatchData { pub fn batch_and_prepare_render_phase( gpu_array_buffer: ResMut>, mut views: Query<&mut RenderPhase>, - query: Query<(Has, F::Query)>, + query: Query, + param: StaticSystemParam, ) { let gpu_array_buffer = gpu_array_buffer.into_inner(); + let system_param_item = param.into_inner(); let mut process_item = |item: &mut I| { - let (no_auto_batching, batch_query_item) = query.get(item.entity()).ok()?; + let batch_query_item = query.get(item.entity()).ok()?; - let buffer_data = F::get_buffer_data(&batch_query_item); + let (buffer_data, compare_data) = F::get_batch_data(&system_param_item, &batch_query_item); let buffer_index = gpu_array_buffer.push(buffer_data); let index = buffer_index.index.get(); *item.batch_range_mut() = index..index + 1; *item.dynamic_offset_mut() = buffer_index.dynamic_offset; - (!no_auto_batching).then(|| { - let compare_data = F::get_compare_data(&batch_query_item); - BatchMeta::new(item, compare_data) - }) + compare_data.map(|compare_data| BatchMeta::new(item, compare_data)) }; for mut phase in &mut views { diff --git a/crates/bevy_sprite/src/mesh2d/material.rs 
b/crates/bevy_sprite/src/mesh2d/material.rs index 4b496c7242ac4..74f6073066b31 100644 --- a/crates/bevy_sprite/src/mesh2d/material.rs +++ b/crates/bevy_sprite/src/mesh2d/material.rs @@ -7,11 +7,7 @@ use bevy_core_pipeline::{ use bevy_derive::{Deref, DerefMut}; use bevy_ecs::{ prelude::*, - query::ROQueryItem, - system::{ - lifetimeless::{Read, SRes}, - SystemParamItem, - }, + system::{lifetimeless::SRes, SystemParamItem}, }; use bevy_log::error; use bevy_render::{ @@ -33,12 +29,12 @@ use bevy_render::{ Extract, ExtractSchedule, Render, RenderApp, RenderSet, }; use bevy_transform::components::{GlobalTransform, Transform}; -use bevy_utils::{FloatOrd, HashMap, HashSet}; +use bevy_utils::{EntityHashMap, FloatOrd, HashMap, HashSet}; use std::hash::Hash; use std::marker::PhantomData; use crate::{ - DrawMesh2d, Mesh2dHandle, Mesh2dPipeline, Mesh2dPipelineKey, Mesh2dTransforms, + DrawMesh2d, Mesh2dHandle, Mesh2dPipeline, Mesh2dPipelineKey, RenderMesh2dInstances, SetMesh2dBindGroup, SetMesh2dViewBindGroup, }; @@ -150,6 +146,7 @@ where .add_render_command::>() .init_resource::>() .init_resource::>() + .init_resource::>() .init_resource::>>() .add_systems( ExtractSchedule, @@ -176,24 +173,25 @@ where } } +#[derive(Resource, Deref, DerefMut)] +pub struct RenderMaterial2dInstances(EntityHashMap>); + +impl Default for RenderMaterial2dInstances { + fn default() -> Self { + Self(Default::default()) + } +} + fn extract_material_meshes_2d( - mut commands: Commands, - mut previous_len: Local, + mut material_instances: ResMut>, query: Extract)>>, ) { - let mut values = Vec::with_capacity(*previous_len); - for (entity, view_visibility, material) in &query { + material_instances.clear(); + for (entity, view_visibility, handle) in &query { if view_visibility.get() { - // NOTE: Material2dBindGroupId is inserted here to avoid a table move. Upcoming changes - // to use SparseSet for render world entity storage will do this automatically. 
- values.push(( - entity, - (material.clone_weak(), Material2dBindGroupId::default()), - )); + material_instances.insert(entity, handle.id()); } } - *previous_len = values.len(); - commands.insert_or_spawn_batch(values); } /// Render pipeline data for a given [`Material2d`] @@ -322,19 +320,29 @@ pub struct SetMaterial2dBindGroup(PhantomData) impl RenderCommand

for SetMaterial2dBindGroup { - type Param = SRes>; + type Param = ( + SRes>, + SRes>, + ); type ViewWorldQuery = (); - type ItemWorldQuery = Read>; + type ItemWorldQuery = (); #[inline] fn render<'w>( - _item: &P, + item: &P, _view: (), - material2d_handle: ROQueryItem<'_, Self::ItemWorldQuery>, - materials: SystemParamItem<'w, '_, Self::Param>, + _item_query: (), + (materials, material_instances): SystemParamItem<'w, '_, Self::Param>, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { - let material2d = materials.into_inner().get(&material2d_handle.id()).unwrap(); + let materials = materials.into_inner(); + let material_instances = material_instances.into_inner(); + let Some(material_instance) = material_instances.get(&item.entity()) else { + return RenderCommandResult::Failure; + }; + let Some(material2d) = materials.get(material_instance) else { + return RenderCommandResult::Failure; + }; pass.set_bind_group(I, &material2d.bind_group, &[]); RenderCommandResult::Success } @@ -364,12 +372,8 @@ pub fn queue_material2d_meshes( msaa: Res, render_meshes: Res>, render_materials: Res>, - mut material2d_meshes: Query<( - &Handle, - &mut Material2dBindGroupId, - &Mesh2dHandle, - &Mesh2dTransforms, - )>, + mut render_mesh_instances: ResMut, + render_material_instances: Res>, mut views: Query<( &ExtractedView, &VisibleEntities, @@ -380,7 +384,7 @@ pub fn queue_material2d_meshes( ) where M::Data: PartialEq + Eq + Hash + Clone, { - if material2d_meshes.is_empty() { + if render_material_instances.is_empty() { return; } @@ -400,19 +404,16 @@ pub fn queue_material2d_meshes( } } for visible_entity in &visible_entities.entities { - let Ok(( - material2d_handle, - mut material2d_bind_group_id, - mesh2d_handle, - mesh2d_uniform, - )) = material2d_meshes.get_mut(*visible_entity) - else { + let Some(material_asset_id) = render_material_instances.get(visible_entity) else { continue; }; - let Some(material2d) = render_materials.get(&material2d_handle.id()) else { + let 
Some(mesh_instance) = render_mesh_instances.get_mut(visible_entity) else { continue; }; - let Some(mesh) = render_meshes.get(&mesh2d_handle.0) else { + let Some(material2d) = render_materials.get(material_asset_id) else { + continue; + }; + let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else { continue; }; let mesh_key = @@ -436,8 +437,9 @@ pub fn queue_material2d_meshes( } }; - *material2d_bind_group_id = material2d.get_bind_group_id(); - let mesh_z = mesh2d_uniform.transform.translation.z; + mesh_instance.material_bind_group_id = material2d.get_bind_group_id(); + + let mesh_z = mesh_instance.transforms.transform.translation.z; transparent_phase.add(Transparent2d { entity: *visible_entity, draw_function: draw_transparent_pbr, @@ -580,7 +582,7 @@ pub fn prepare_materials_2d( render_materials.remove(&removed); } - for (handle, material) in std::mem::take(&mut extracted_assets.extracted) { + for (asset_id, material) in std::mem::take(&mut extracted_assets.extracted) { match prepare_material2d( &material, &render_device, @@ -589,10 +591,10 @@ pub fn prepare_materials_2d( &pipeline, ) { Ok(prepared_asset) => { - render_materials.insert(handle, prepared_asset); + render_materials.insert(asset_id, prepared_asset); } Err(AsBindGroupError::RetryNextUpdate) => { - prepare_next_frame.assets.push((handle, material)); + prepare_next_frame.assets.push((asset_id, material)); } } } diff --git a/crates/bevy_sprite/src/mesh2d/mesh.rs b/crates/bevy_sprite/src/mesh2d/mesh.rs index 2717acd394d4e..1639b7df97f8d 100644 --- a/crates/bevy_sprite/src/mesh2d/mesh.rs +++ b/crates/bevy_sprite/src/mesh2d/mesh.rs @@ -2,6 +2,7 @@ use bevy_app::Plugin; use bevy_asset::{load_internal_asset, AssetId, Handle}; use bevy_core_pipeline::core_2d::Transparent2d; +use bevy_derive::{Deref, DerefMut}; use bevy_ecs::{ prelude::*, query::{QueryItem, ROQueryItem}, @@ -10,7 +11,10 @@ use bevy_ecs::{ use bevy_math::{Affine3, Vec2, Vec4}; use bevy_reflect::Reflect; use bevy_render::{ - 
batching::{batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData}, + batching::{ + batch_and_prepare_render_phase, write_batched_instance_buffer, GetBatchData, + NoAutomaticBatching, + }, globals::{GlobalsBuffer, GlobalsUniform}, mesh::{GpuBufferInfo, Mesh, MeshVertexBufferLayout}, render_asset::RenderAssets, @@ -26,6 +30,7 @@ use bevy_render::{ Extract, ExtractSchedule, Render, RenderApp, RenderSet, }; use bevy_transform::components::GlobalTransform; +use bevy_utils::EntityHashMap; use crate::Material2dBindGroupId; @@ -89,6 +94,7 @@ impl Plugin for Mesh2dRenderPlugin { if let Ok(render_app) = app.get_sub_app_mut(RenderApp) { render_app + .init_resource::() .init_resource::>() .add_systems(ExtractSchedule, extract_mesh2d) .add_systems( @@ -178,29 +184,58 @@ bitflags::bitflags! { } } +pub struct RenderMesh2dInstance { + pub transforms: Mesh2dTransforms, + pub mesh_asset_id: AssetId, + pub material_bind_group_id: Material2dBindGroupId, + pub automatic_batching: bool, +} + +#[derive(Default, Resource, Deref, DerefMut)] +pub struct RenderMesh2dInstances(EntityHashMap); + +#[derive(Component)] +pub struct Mesh2d; + pub fn extract_mesh2d( mut commands: Commands, mut previous_len: Local, - query: Extract>, + mut render_mesh_instances: ResMut, + query: Extract< + Query<( + Entity, + &ViewVisibility, + &GlobalTransform, + &Mesh2dHandle, + Has, + )>, + >, ) { - let mut values = Vec::with_capacity(*previous_len); - for (entity, view_visibility, transform, handle) in &query { + render_mesh_instances.clear(); + let mut entities = Vec::with_capacity(*previous_len); + + for (entity, view_visibility, transform, handle, no_automatic_batching) in &query { if !view_visibility.get() { continue; } - values.push(( + // FIXME: Remove this - it is just a workaround to enable rendering to work as + // render commands require an entity to exist at the moment. 
+ entities.push((entity, Mesh2d)); + render_mesh_instances.insert( entity, - ( - Mesh2dHandle(handle.0.clone_weak()), - Mesh2dTransforms { + RenderMesh2dInstance { + transforms: Mesh2dTransforms { transform: (&transform.affine()).into(), flags: MeshFlags::empty().bits(), }, - ), - )); + mesh_asset_id: handle.0.id(), + material_bind_group_id: Material2dBindGroupId::default(), + automatic_batching: !no_automatic_batching, + }, + ); } - *previous_len = values.len(); - commands.insert_or_spawn_batch(values); + *previous_len = entities.len(); + commands.insert_or_spawn_batch(entities); } #[derive(Resource, Clone)] @@ -325,22 +360,26 @@ impl Mesh2dPipeline { } impl GetBatchData for Mesh2dPipeline { - type Query = ( - Option<&'static Material2dBindGroupId>, - &'static Mesh2dHandle, - &'static Mesh2dTransforms, - ); - type CompareData = (Option, AssetId); + type Param = SRes; + type Query = Entity; + type QueryFilter = With; + type CompareData = (Material2dBindGroupId, AssetId); type BufferData = Mesh2dUniform; - fn get_buffer_data(&(.., mesh_transforms): &QueryItem) -> Self::BufferData { - mesh_transforms.into() - } - - fn get_compare_data( - &(material_bind_group_id, mesh_handle, ..): &QueryItem, - ) -> Self::CompareData { - (material_bind_group_id.copied(), mesh_handle.0.id()) + fn get_batch_data( + mesh_instances: &SystemParamItem, + entity: &QueryItem, + ) -> (Self::BufferData, Option) { + let mesh_instance = mesh_instances + .get(entity) + .expect("Failed to find render mesh2d instance"); + ( + (&mesh_instance.transforms).into(), + mesh_instance.automatic_batching.then_some(( + mesh_instance.material_bind_group_id, + mesh_instance.mesh_asset_id, + )), + ) } } @@ -653,43 +692,52 @@ impl RenderCommand

for SetMesh2dBindGroup { pub struct DrawMesh2d; impl RenderCommand

for DrawMesh2d { - type Param = SRes>; + type Param = (SRes>, SRes); type ViewWorldQuery = (); - type ItemWorldQuery = Read; + type ItemWorldQuery = (); #[inline] fn render<'w>( item: &P, _view: (), - mesh_handle: ROQueryItem<'w, Self::ItemWorldQuery>, - meshes: SystemParamItem<'w, '_, Self::Param>, + _item_query: (), + (meshes, render_mesh2d_instances): SystemParamItem<'w, '_, Self::Param>, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { + let meshes = meshes.into_inner(); + let render_mesh2d_instances = render_mesh2d_instances.into_inner(); + + let Some(RenderMesh2dInstance { mesh_asset_id, .. }) = + render_mesh2d_instances.get(&item.entity()) + else { + return RenderCommandResult::Failure; + }; + let Some(gpu_mesh) = meshes.get(*mesh_asset_id) else { + return RenderCommandResult::Failure; + }; + + pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..)); + let batch_range = item.batch_range(); - if let Some(gpu_mesh) = meshes.into_inner().get(&mesh_handle.0) { - pass.set_vertex_buffer(0, gpu_mesh.vertex_buffer.slice(..)); - #[cfg(all(feature = "webgl", target_arch = "wasm32"))] - pass.set_push_constants( - ShaderStages::VERTEX, - 0, - &(batch_range.start as i32).to_le_bytes(), - ); - match &gpu_mesh.buffer_info { - GpuBufferInfo::Indexed { - buffer, - index_format, - count, - } => { - pass.set_index_buffer(buffer.slice(..), 0, *index_format); - pass.draw_indexed(0..*count, 0, batch_range.clone()); - } - GpuBufferInfo::NonIndexed => { - pass.draw(0..gpu_mesh.vertex_count, batch_range.clone()); - } + #[cfg(all(feature = "webgl", target_arch = "wasm32"))] + pass.set_push_constants( + ShaderStages::VERTEX, + 0, + &(batch_range.start as i32).to_le_bytes(), + ); + match &gpu_mesh.buffer_info { + GpuBufferInfo::Indexed { + buffer, + index_format, + count, + } => { + pass.set_index_buffer(buffer.slice(..), 0, *index_format); + pass.draw_indexed(0..*count, 0, batch_range.clone()); + } + GpuBufferInfo::NonIndexed => { + pass.draw(0..gpu_mesh.vertex_count, 
batch_range.clone()); } - RenderCommandResult::Success - } else { - RenderCommandResult::Failure } + RenderCommandResult::Success } } diff --git a/crates/bevy_sprite/src/render/mod.rs b/crates/bevy_sprite/src/render/mod.rs index 2d5343a867adc..c7d79dcc088e8 100644 --- a/crates/bevy_sprite/src/render/mod.rs +++ b/crates/bevy_sprite/src/render/mod.rs @@ -11,7 +11,6 @@ use bevy_core_pipeline::{ }; use bevy_ecs::{ prelude::*, - storage::SparseSet, system::{lifetimeless::*, SystemParamItem, SystemState}, }; use bevy_math::{Affine3A, Quat, Rect, Vec2, Vec4}; @@ -34,7 +33,7 @@ use bevy_render::{ Extract, }; use bevy_transform::components::GlobalTransform; -use bevy_utils::{FloatOrd, HashMap}; +use bevy_utils::{EntityHashMap, FloatOrd, HashMap}; use bytemuck::{Pod, Zeroable}; use fixedbitset::FixedBitSet; @@ -330,7 +329,7 @@ pub struct ExtractedSprite { #[derive(Resource, Default)] pub struct ExtractedSprites { - pub sprites: SparseSet, + pub sprites: EntityHashMap, } #[derive(Resource, Default)] @@ -641,7 +640,7 @@ pub fn prepare_sprites( // Compatible items share the same entity. for item_index in 0..transparent_phase.items.len() { let item = &transparent_phase.items[item_index]; - let Some(extracted_sprite) = extracted_sprites.sprites.get(item.entity) else { + let Some(extracted_sprite) = extracted_sprites.sprites.get(&item.entity) else { // If there is a phase item that is not a sprite, then we must start a new // batch to draw the other phase item(s) and to respect draw order. 
This can be // done by invalidating the batch_image_handle diff --git a/crates/bevy_ui/src/render/mod.rs b/crates/bevy_ui/src/render/mod.rs index d146e53beb79d..f962298b9b1d9 100644 --- a/crates/bevy_ui/src/render/mod.rs +++ b/crates/bevy_ui/src/render/mod.rs @@ -2,7 +2,6 @@ mod pipeline; mod render_pass; use bevy_core_pipeline::{core_2d::Camera2d, core_3d::Camera3d}; -use bevy_ecs::storage::SparseSet; use bevy_hierarchy::Parent; use bevy_render::render_phase::PhaseItem; use bevy_render::view::ViewVisibility; @@ -36,7 +35,7 @@ use bevy_sprite::{SpriteAssetEvents, TextureAtlas}; #[cfg(feature = "bevy_text")] use bevy_text::{PositionedGlyph, Text, TextLayoutInfo}; use bevy_transform::components::GlobalTransform; -use bevy_utils::{FloatOrd, HashMap}; +use bevy_utils::{EntityHashMap, FloatOrd, HashMap}; use bytemuck::{Pod, Zeroable}; use std::ops::Range; @@ -164,7 +163,7 @@ pub struct ExtractedUiNode { #[derive(Resource, Default)] pub struct ExtractedUiNodes { - pub uinodes: SparseSet, + pub uinodes: EntityHashMap, } pub fn extract_atlas_uinodes( @@ -733,7 +732,7 @@ pub fn prepare_uinodes( for item_index in 0..ui_phase.items.len() { let item = &mut ui_phase.items[item_index]; - if let Some(extracted_uinode) = extracted_uinodes.uinodes.get(item.entity) { + if let Some(extracted_uinode) = extracted_uinodes.uinodes.get(&item.entity) { let mut existing_batch = batches .last_mut() .filter(|_| batch_image_handle == extracted_uinode.image); diff --git a/crates/bevy_utils/src/lib.rs b/crates/bevy_utils/src/lib.rs index 52f33f31d7dc6..85e70e60bc541 100644 --- a/crates/bevy_utils/src/lib.rs +++ b/crates/bevy_utils/src/lib.rs @@ -250,6 +250,58 @@ impl PreHashMapExt for PreHashMap Self::Hasher { + EntityHasher::default() + } +} + +/// A very fast hash that is only designed to work on generational indices +/// like `Entity`. It will panic if attempting to hash a type containing +/// non-u64 fields. 
+#[derive(Debug, Default)]
+pub struct EntityHasher {
+    hash: u64,
+}
+
+// This value comes from rustc-hash (also known as FxHasher) which in turn got
+// it from Firefox. It is something like `u64::MAX / N` for an N that gives a
+// value close to π and works well for distributing bits for hashing when used
+// with a wrapping multiplication.
+const FRAC_U64MAX_PI: u64 = 0x517cc1b727220a95;
+
+impl Hasher for EntityHasher {
+    fn write(&mut self, _bytes: &[u8]) {
+        panic!("can only hash u64 using EntityHasher");
+    }
+
+    #[inline]
+    fn write_u64(&mut self, i: u64) {
+        // Apparently hashbrown's hashmap uses the upper 7 bits for some SIMD
+        // optimisation that uses those bits for binning. This hash function
+        // was faster than i | (i << (64 - 7)) in the worst cases, and was
+        // faster than PassHasher for all cases tested.
+        self.hash = i | (i.wrapping_mul(FRAC_U64MAX_PI) << 32);
+    }
+
+    #[inline]
+    fn finish(&self) -> u64 {
+        self.hash
+    }
+}
+
+/// A [`HashMap`] pre-configured to use [`EntityHash`] hashing.
+pub type EntityHashMap<K, V> = hashbrown::HashMap<K, V, EntityHash>;
+
+/// A [`HashSet`] pre-configured to use [`EntityHash`] hashing.
+pub type EntityHashSet<T> = hashbrown::HashSet<T, EntityHash>;
+
 /// A type which calls a function when dropped.
 /// This can be used to ensure that cleanup code is run even in case of a panic.
/// diff --git a/examples/shader/shader_instancing.rs b/examples/shader/shader_instancing.rs index d5e751ae0fa1d..fc18bb63c73b7 100644 --- a/examples/shader/shader_instancing.rs +++ b/examples/shader/shader_instancing.rs @@ -6,7 +6,9 @@ use bevy::{ query::QueryItem, system::{lifetimeless::*, SystemParamItem}, }, - pbr::{MeshPipeline, MeshPipelineKey, MeshTransforms, SetMeshBindGroup, SetMeshViewBindGroup}, + pbr::{ + MeshPipeline, MeshPipelineKey, RenderMeshInstances, SetMeshBindGroup, SetMeshViewBindGroup, + }, prelude::*, render::{ extract_component::{ExtractComponent, ExtractComponentPlugin}, @@ -113,7 +115,8 @@ fn queue_custom( mut pipelines: ResMut>, pipeline_cache: Res, meshes: Res>, - material_meshes: Query<(Entity, &MeshTransforms, &Handle), With>, + render_mesh_instances: Res, + material_meshes: Query>, mut views: Query<(&ExtractedView, &mut RenderPhase)>, ) { let draw_custom = transparent_3d_draw_functions.read().id::(); @@ -123,23 +126,26 @@ fn queue_custom( for (view, mut transparent_phase) in &mut views { let view_key = msaa_key | MeshPipelineKey::from_hdr(view.hdr); let rangefinder = view.rangefinder3d(); - for (entity, mesh_transforms, mesh_handle) in &material_meshes { - if let Some(mesh) = meshes.get(mesh_handle) { - let key = - view_key | MeshPipelineKey::from_primitive_topology(mesh.primitive_topology); - let pipeline = pipelines - .specialize(&pipeline_cache, &custom_pipeline, key, &mesh.layout) - .unwrap(); - transparent_phase.add(Transparent3d { - entity, - pipeline, - draw_function: draw_custom, - distance: rangefinder - .distance_translation(&mesh_transforms.transform.translation), - batch_range: 0..1, - dynamic_offset: None, - }); - } + for entity in &material_meshes { + let Some(mesh_instance) = render_mesh_instances.get(&entity) else { + continue; + }; + let Some(mesh) = meshes.get(mesh_instance.mesh_asset_id) else { + continue; + }; + let key = view_key | MeshPipelineKey::from_primitive_topology(mesh.primitive_topology); + let pipeline = 
pipelines + .specialize(&pipeline_cache, &custom_pipeline, key, &mesh.layout) + .unwrap(); + transparent_phase.add(Transparent3d { + entity, + pipeline, + draw_function: draw_custom, + distance: rangefinder + .distance_translation(&mesh_instance.transforms.transform.translation), + batch_range: 0..1, + dynamic_offset: None, + }); } } } @@ -238,19 +244,22 @@ type DrawCustom = ( pub struct DrawMeshInstanced; impl RenderCommand

for DrawMeshInstanced { - type Param = SRes>; + type Param = (SRes>, SRes); type ViewWorldQuery = (); - type ItemWorldQuery = (Read>, Read); + type ItemWorldQuery = Read; #[inline] fn render<'w>( - _item: &P, + item: &P, _view: (), - (mesh_handle, instance_buffer): (&'w Handle, &'w InstanceBuffer), - meshes: SystemParamItem<'w, '_, Self::Param>, + instance_buffer: &'w InstanceBuffer, + (meshes, render_mesh_instances): SystemParamItem<'w, '_, Self::Param>, pass: &mut TrackedRenderPass<'w>, ) -> RenderCommandResult { - let gpu_mesh = match meshes.into_inner().get(mesh_handle) { + let Some(mesh_instance) = render_mesh_instances.get(&item.entity()) else { + return RenderCommandResult::Failure; + }; + let gpu_mesh = match meshes.into_inner().get(mesh_instance.mesh_asset_id) { Some(gpu_mesh) => gpu_mesh, None => return RenderCommandResult::Failure, }; diff --git a/examples/stress_tests/bevymark.rs b/examples/stress_tests/bevymark.rs index cd8c1ad77f3c2..34112ebd7e0c9 100644 --- a/examples/stress_tests/bevymark.rs +++ b/examples/stress_tests/bevymark.rs @@ -97,7 +97,8 @@ fn main() { DefaultPlugins.set(WindowPlugin { primary_window: Some(Window { title: "BevyMark".into(), - resolution: (800., 600.).into(), + resolution: WindowResolution::new(1920.0, 1080.0) + .with_scale_factor_override(1.0), present_mode: PresentMode::AutoNoVsync, ..default() }), diff --git a/examples/stress_tests/many_cubes.rs b/examples/stress_tests/many_cubes.rs index 218757e471c79..740b9bbcc7659 100644 --- a/examples/stress_tests/many_cubes.rs +++ b/examples/stress_tests/many_cubes.rs @@ -16,7 +16,7 @@ use bevy::{ math::{DVec2, DVec3}, prelude::*, render::render_resource::{Extent3d, TextureDimension, TextureFormat}, - window::{PresentMode, WindowPlugin}, + window::{PresentMode, WindowPlugin, WindowResolution}, }; use rand::{rngs::StdRng, seq::SliceRandom, Rng, SeedableRng}; @@ -70,6 +70,8 @@ fn main() { DefaultPlugins.set(WindowPlugin { primary_window: Some(Window { present_mode: 
PresentMode::AutoNoVsync, + resolution: WindowResolution::new(1920.0, 1080.0) + .with_scale_factor_override(1.0), ..default() }), ..default()
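
Reviewer note: the `EntityHasher` added to `crates/bevy_utils/src/lib.rs` in this patch can be tried outside Bevy. The following is a minimal sketch and not part of the patch. It assumes `std::collections::HashMap` in place of `hashbrown`, and `FakeEntity`, `FakeEntityHashMap` and `hash_bits` are hypothetical names introduced only for illustration. The hasher body and the constant are copied from the diff above.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

// Multiplier copied from the patch (it originates in rustc-hash).
const FRAC_U64MAX_PI: u64 = 0x517cc1b727220a95;

/// Same shape as the patch's `EntityHasher`: only `write_u64` is supported.
#[derive(Debug, Default)]
pub struct EntityHasher {
    hash: u64,
}

impl Hasher for EntityHasher {
    fn write(&mut self, _bytes: &[u8]) {
        panic!("can only hash u64 using EntityHasher");
    }

    #[inline]
    fn write_u64(&mut self, i: u64) {
        // Keep the input in the low bits, put mixed bits in the upper half.
        self.hash = i | (i.wrapping_mul(FRAC_U64MAX_PI) << 32);
    }

    #[inline]
    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Build-hasher that hands out `EntityHasher` instances.
#[derive(Clone, Default)]
pub struct EntityHash;

impl BuildHasher for EntityHash {
    type Hasher = EntityHasher;

    fn build_hasher(&self) -> Self::Hasher {
        EntityHasher::default()
    }
}

/// Hypothetical stand-in for `Entity`: an index plus a generation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FakeEntity {
    pub index: u32,
    pub generation: u32,
}

impl FakeEntity {
    /// Pack generation (high 32 bits) and index (low 32 bits) into one u64.
    pub fn to_bits(self) -> u64 {
        ((self.generation as u64) << 32) | (self.index as u64)
    }
}

// Hash manually so that exactly one `write_u64` call reaches the hasher;
// a derived `Hash` would call `write_u32` twice and hit the panic above.
impl std::hash::Hash for FakeEntity {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.to_bits());
    }
}

/// Hypothetical alias mirroring `EntityHashMap`, backed by std's `HashMap`.
pub type FakeEntityHashMap<V> = HashMap<FakeEntity, V, EntityHash>;

/// Helper for inspecting the hash of a raw u64.
pub fn hash_bits(bits: u64) -> u64 {
    let mut hasher = EntityHash.build_hasher();
    hasher.write_u64(bits);
    hasher.finish()
}
```

With these definitions, `hash_bits(1)` evaluates to `0x27220a9500000001`: the low 32 bits are the input unchanged and the upper 32 bits come from the wrapping multiplication. A `FakeEntityHashMap<&str>` created with `HashMap::default()` then behaves like any other map keyed by `FakeEntity`.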