Vertex and attribute compression #81138

Merged
1 commit merged into godotengine:master on Oct 5, 2023

Conversation

clayjohn (Member) commented on Aug 29, 2023

This scheme is backwards compatible, so old meshes will continue to work.

Users can get a performance boost by reimporting assets.

The rationale behind this PR can be found here: https://files.godot.foundation/s/259N9YQEPcBZwMx

In short, the main reason behind making this PR is bandwidth reduction (passing less data around the GPU per draw call). Bandwidth reduction is valuable in two cases:

  1. If the scene is bandwidth limited, reducing bandwidth can improve performance.
  2. Reducing bandwidth reduces power consumption on mobile and power-constrained devices.

From our perspective, reducing bandwidth also has the added benefit that we can now confirm that we are not generally bandwidth bound (i.e. we aren't bandwidth bound in all scenes), which means we can focus our next performance improvements elsewhere (VGPRs / instructions).

Usage notes

After merging this PR all meshes will automatically start using a slightly different vertex attribute layout:

P = vertex position
N = normal
T = tangent

// Before this PR
PNTPNTPNTPNT

// After this PR
PPPPNTNTNTNT

This is the recommended layout for mobile devices, as it means that shadow passes and tiling passes only ever have to read vertex positions, and it doesn't hurt performance on desktop.
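
As a rough illustration of the difference (hypothetical C++ structs for illustration only; the engine's real vertex data lives in packed byte buffers described by a format, not structs like these):

// Before: one interleaved stream. A depth-only pass that needs just the
// position still pulls normals and tangents through the cache.
struct InterleavedVertex {
    float position[3];
    float normal[3];
    float tangent[4];
};

// After: positions live in their own tightly packed stream, so shadow and
// tiling passes can bind and read only this buffer.
struct PositionStream {
    float position[3];
};

struct AttributeStream {
    float normal[3];
    float tangent[4];
};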

The renderer will automatically use this layout even if the mesh doesn't. To accommodate that, I have added a mesh format version number so we can perform automatic upgrades between versions. This avoids the issue we had last summer when integrating octahedral compression. Three things to note about this though:

  1. When running scenes with the old mesh format, users will get warnings for each mesh that needs to be upgraded.
  2. Meshes need to be saved in the new format to stop triggering the warning. In some cases this means users will have to reimport the mesh; otherwise, simply re-saving the mesh will write the new format.
  3. Once the new format is used, users can't go back to an earlier version without re-importing their mesh.
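
As a loose sketch of the version gate described above (every name and value below is made up for illustration; this is not Godot's actual API):

#include <cstdint>
#include <cstdio>

// Hypothetical constants and types, for illustration only.
constexpr uint32_t CURRENT_MESH_FORMAT_VERSION = 2;

struct Surface {
    uint32_t format_version = 1; // version this surface was saved with
    // ... vertex and attribute buffers would live here ...
};

// Stand-in for the real upgrade path, which would rewrite the vertex
// buffers into the new layout when an old surface is loaded.
void upgrade_vertex_layout(Surface &surface) {
    surface.format_version = CURRENT_MESH_FORMAT_VERSION;
}

void load_surface(Surface &surface) {
    if (surface.format_version < CURRENT_MESH_FORMAT_VERSION) {
        upgrade_vertex_layout(surface);
        std::printf("Mesh uses an older format version; re-save or re-import it to remove this warning.\n");
    }
}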

In addition to the vertex layout change, we also introduced optional compression for vertices, normals, tangents, and UVs. The importers will automatically detect whether a mesh is a candidate for compression and apply the appropriate flags. UVs can be compressed as long as they are all within the 0-1 range; while this is technically lossy compression, there won't be a visible difference on textures smaller than 65536 pixels, so it is safe to apply. For vertices we do two things:

  1. Scale the vertices into the 0-1 range using the surface's AABB (this allows us to store the meshes in RGB16 instead of RGB32); a sketch of this quantization follows the list.
  2. Store the normal (vec3) + tangent (vec4) as an axis (vec3) + angle (float). The axis replaces the normal, and the tangent (encoded as the angle) fits into the A channel of the vertex.
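
A rough sketch of the quantization in point 1 (illustrative C++ with made-up names, not the engine's actual code):

#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical type; the engine stores the surface AABB alongside the mesh.
struct SurfaceAABB {
    float min[3];
    float size[3];
};

// Remap a position into the surface AABB and store it as three unsigned,
// normalized 16-bit values (RGB16) instead of three 32-bit floats.
void quantize_position(const float p[3], const SurfaceAABB &aabb, uint16_t out[3]) {
    for (int i = 0; i < 3; i++) {
        float t = aabb.size[i] > 0.0f ? (p[i] - aabb.min[i]) / aabb.size[i] : 0.0f;
        t = std::clamp(t, 0.0f, 1.0f);
        out[i] = (uint16_t)std::lround(t * 65535.0f);
    }
    // Decoding is the inverse: position = min + (value / 65535.0) * size.
}

// UVs already in the 0-1 range can be stored the same way (two unorm16
// values), which is why the loss is not visible at the texture sizes
// mentioned above. The normal/tangent pair from point 2 is packed
// separately, as the axis and angle of the tangent-frame rotation.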

I implemented this compression scheme in a way that does not require additional shader variants, so there will be no effect on shader compile time. However, this means that there are more calculations in the shader, and VGPR usage may be affected on some compilers.

When the compression flags are not used, the vertex format remains the same as it was before this PR.

Performance

So far I have tested on my laptop using an integrated GPU and an NVIDIA 3050m. There is no difference when using the Forward+ renderer, a slight improvement when using the Compatibility renderer, and about a 10% improvement when using the Mobile renderer. The mobile improvement is likely due to the fact that I had to refactor the mobile renderer to pass the necessary data.

On the TPS demo on integrated graphics, I am seeing an improvement from ~17 FPS to ~20 FPS (Forward+).
Using the Synty Vikings demo with the mobile renderer, frame time improves from 40 ms to 37 ms.

Running the TPS demo on mobile (Pixel 7 and Pixel 1) I see no performance difference (likely because both are instruction bound, not bandwidth bound).

More performance testing is needed before merging, but so far, in all cases the performance is either exactly the same or slightly better.

Todos:

  1. Add additional optimization by de-interleaving vertex positions (this won't be backwards compatible, so we need to add version identifiers to mesh data).
  2. Add a setting to force-disable optimization on an imported mesh. Generally, the auto-detection works quite well, but it relies on the mesh having both proper normals and tangents; if the mesh has improper tangents, the result is also bad. In the end this setting wasn't needed: the issue came from a scene with non-continuous UVs, so the generated tangents were bogus. The solution was to allow GLTFs to respect the "ensure tangents" setting; previously, tangents were always generated if not supplied.
  3. More performance testing with heavier scenes on mobile devices (especially scenes with more shadows).

clayjohn changed the title from "Vertex and attribute compression working." to "Vertex and attribute compression" on Aug 30, 2023
Outdated review threads (resolved): drivers/gles3/shaders/skeleton.glsl, scene/resources/importer_mesh.cpp (two threads), servers/rendering_server.cpp
clayjohn (Member, Author) commented on Sep 1, 2023

Pushed an update to address comments and CI issues. Should be ready for substantive review now.

clayjohn marked this pull request as ready for review on September 1, 2023
clayjohn requested review from a team as code owners on September 1, 2023
clayjohn requested a review from a team as a code owner on September 5, 2023
clayjohn (Member, Author) commented on Sep 5, 2023

Thanks for the reviews, AThousandShips!

Just pushed another version with docs and fixes for CI. This PR should be ready to go now. I don't expect to find any performance regressions in my final testing.

One catch: I am away on holidays from September 7 to 19, so if we merge this anytime soon I won't be around to jump on bugs. We may want to merge this the week of the 18th so that I can be available to fix any bugs that arise. While I have done my best to test extensively, it is inevitable that bugs will appear from such a large change to a fundamental system.

akien-mga (Member) commented:

Changing gltf_document.cpp caused it to be recompiled, and a new MSVC warning was found (not caused by this PR but could be fixed here too to pass CI):

Error: modules\gltf\gltf_document.cpp(2958): error C2220: the following warning is treated as an error
Warning: modules\gltf\gltf_document.cpp(2958): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)

I think this kind of error can be fixed by changing 1 << l to 1U << l.
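
For reference, a minimal reproduction of that warning pattern (the code below is illustrative, not the actual gltf_document.cpp line): the shift is evaluated at 32 bits and only then widened, so one way to avoid the truncation is to shift a 64-bit operand.

#include <cstdint>

uint64_t bit_for(int l) {
    // uint64_t mask = 1 << l;  // triggers MSVC C4334: 32-bit shift result widened to 64 bits
    uint64_t mask = 1ULL << l;  // shift performed on a 64-bit operand, no truncation
    return mask;
}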

clayjohn force-pushed the attribute-compression branch 3 times, most recently from d5cdf86 to dca3376 on September 6, 2023
fire (Member) commented on Sep 6, 2023

I’ll take a glance.

clayjohn force-pushed the attribute-compression branch 4 times, most recently from 5984030 to d16ca71 on September 6, 2023
clayjohn requested a review from a team as a code owner on September 6, 2023
raulsntos (Member) commented:

> Sure, but this is exclusively a C# -> C# thing. I can only see it being a problem if you have a .NET DLL using another .NET DLL. This seems a little weird to me; still, you can fix it by simply rebuilding both things. For the C# -> C++ side of things nothing changes, since it should be as if this change does not exist for C#.

Yes, but you may not be able to rebuild everything. If the user is consuming a library from NuGet, then they can't rebuild that library; they can only wait for the library to be updated. Updating the library will make it incompatible with older versions of Godot, so the author may not want to do that.

> We did not need to add a compatibility method.

Maybe not for C++, but C# still needs it. If the compat method isn't registered, then we need to add a C# method manually to Compat.cs, and that's more work for contributors (most may not be familiar with C#).

YuriSizov (Contributor) commented on Sep 29, 2023

> Maybe not for C++, but C# still needs it. If the compat method isn't registered, then we need to add a C# method manually to Compat.cs, and that's more work for contributors (most may not be familiar with C#).

I think this may be acceptable for now, since we don't want to block this PR on an infrastructural issue. So if we can work around it with a Compat.cs method, I think this is what we should do, and hope to resolve this later.

@raulsntos Could you give us a diff of what a Compat.cs patch may look like?

raulsntos (Member) commented:

This diff adds compat methods for ImporterMesh::add_surface and SurfaceTool::commit. The other methods can't have compat overloads because the only thing that changes is the return type, so we'll break compat there.

diff --git a/modules/mono/glue/GodotSharp/GodotSharp/Compat.cs b/modules/mono/glue/GodotSharp/GodotSharp/Compat.cs
index 48b47b166a8..aa03c3f97f1 100644
--- a/modules/mono/glue/GodotSharp/GodotSharp/Compat.cs
+++ b/modules/mono/glue/GodotSharp/GodotSharp/Compat.cs
@@ -44,6 +44,16 @@ partial class Geometry3D
     }
 }
 
+partial class ImporterMesh
+{
+    /// <inheritdoc cref="AddSurface(Mesh.PrimitiveType, Godot.Collections.Array, Godot.Collections.Array{Godot.Collections.Array}, Godot.Collections.Dictionary, Material, string, ulong)"/>
+    [EditorBrowsable(EditorBrowsableState.Never)]
+    public void AddSurface(Mesh.PrimitiveType primitive, Godot.Collections.Array arrays, Godot.Collections.Array<Godot.Collections.Array> blendShapes, Godot.Collections.Dictionary lods, Material material, string name, uint flags)
+    {
+        AddSurface(primitive, arrays, blendShapes, lods, material, name, (ulong)flags);
+    }
+}
+
 partial class MeshInstance3D
 {
     /// <inheritdoc cref="CreateMultipleConvexCollisions(MeshConvexDecompositionSettings)"/>
@@ -106,6 +116,13 @@ partial class SurfaceTool
     {
         AddTriangleFan(vertices, uvs, colors, uv2S, normals, new Godot.Collections.Array<Plane>(tangents));
     }
+
+    /// <inheritdoc cref="Commit(ArrayMesh, ulong)"/>
+    [EditorBrowsable(EditorBrowsableState.Never)]
+    public ArrayMesh Commit(ArrayMesh existing, uint flags)
+    {
+        return Commit(existing, (ulong)flags);
+    }
 }
 
 partial class Tree

clayjohn requested a review from a team as a code owner on October 4, 2023
clayjohn force-pushed the attribute-compression branch 5 times, most recently from a77075b to 0534b28 on October 5, 2023
clayjohn (Member, Author) commented on Oct 5, 2023

Updated to:

  1. combine compression options into one
  2. rebase on master
  3. remove compatibility methods where the only change is uint32_t->uint64_t
  4. add C# compatibility methods

As far as I know, this should be good once I squash the commits. To keep things easy to review (and avoid accidentally deleting necessary compat methods), I have left this in multiple commits.

@raulsntos can you confirm that the C# compat looks acceptable now?

@dsnopek Can you confirm again that the GDExtension compatibility is still fine?

dsnopek (Contributor) left a comment:

Looking at just the GDExtension compatibility bits, this looks good to me!

raulsntos (Member) left a comment:

The C# compat looks good to me.

…mat.

This allows Godot to automatically compress meshes to save a lot of bandwidth.

In general, this requires no interaction from the user and should result in
no noticeable quality loss.

This scheme is not backwards compatible, so we have provided an upgrade
mechanism and a mesh versioning mechanism.

Existing meshes can still be used as a result, but users can get a
performance boost by reimporting assets.
clayjohn (Member, Author) commented on Oct 5, 2023

This should be ready to go now. I have tested extensively including:

  1. TPS Demo
  2. Large Synty demo scene
  3. Various GLTFs, including those with blend shapes, skeletons, etc.
  4. Bistro scene

I tested with all three renderers, and in the current state I haven't caught any issues to be aware of.

One big thing to make users aware of is that we are switching to a new mesh format version. This means that when they run existing scenes, they will get a warning that their mesh has been upgraded to a new version, and they will then need to re-save or re-import the mesh.

akien-mga merged commit f02695c into godotengine:master on Oct 5, 2023
15 checks passed

akien-mga (Member) commented:
Thanks!
