
optimize the clear head method from SetData to a CS kernel #11


Merged
merged 3 commits into from
Sep 18, 2022


SnowWindSaveYou
Collaborator

This improves performance in most scenarios, but there is still a limitation:
it can't work properly on a very large screen (e.g. a 4K HUD at 3840*2160) because the buffer element count exceeds the maximum number of thread groups.
That can be fixed by splitting the work into several dispatches, each with an index offset.
[Images: original, optimized]
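
The splitting idea mentioned above can be sketched on the CPU side. This is only an illustration, not the code in this PR: `plan_dispatches` and the thread group size of 32 are hypothetical, and the cap of 65535 thread groups per dispatch dimension is assumed from the Direct3D 11 limit.

```python
# Assumption: each Dispatch dimension accepts at most 65535 thread groups
# (the Direct3D 11 limit). Other graphics APIs may use a different cap.
MAX_GROUPS_PER_DIM = 65535


def plan_dispatches(element_count, threads_per_group, max_groups=MAX_GROUPS_PER_DIM):
    """Return (group_count, index_offset) pairs that together cover every element.

    Hypothetical helper: each pair describes one one-dimensional dispatch, and
    index_offset is what the kernel would add to its thread id.
    """
    # Integer ceiling division: number of groups needed for all elements.
    total_groups = (element_count + threads_per_group - 1) // threads_per_group
    plan = []
    first_group = 0
    while first_group < total_groups:
        group_count = min(max_groups, total_groups - first_group)
        plan.append((group_count, first_group * threads_per_group))
        first_group += group_count
    return plan


# 4K example from the comment above, with an assumed 32 threads per group.
plan = plan_dispatches(3840 * 2160, threads_per_group=32)
for group_count, index_offset in plan:
    print(group_count, index_offset)
```

With these assumed numbers the 8,294,400 elements need 259,200 groups, so a single one-dimensional dispatch would exceed the cap and the work is split into four dispatches.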

@happy-turtle
Owner

Thank you! Wow, that really is a huge performance boost.
Can we maybe avoid the maximum screen size limit by using a two-dimensional (or later even three-dimensional) number of threads, and then offsetting by the second dimension? I will try to explain in code comments.
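
A rough sketch of the two-dimensional idea, simulated in Python rather than HLSL. The 8x8 tile size and the helper names `groups_2d`, `flat_index`, and `covered_indices` are assumptions for illustration; the merged shader may be laid out differently.

```python
def groups_2d(width, height, tile=8):
    """Thread groups needed per dimension for an assumed tile x tile group size."""
    return ((width + tile - 1) // tile, (height + tile - 1) // tile)


def flat_index(x, y, width):
    """Map a 2D thread id to a buffer index: offset by the second dimension."""
    return y * width + x


def covered_indices(width, height, tile=8):
    """Simulate every launched thread and collect the buffer indices it touches."""
    groups_x, groups_y = groups_2d(width, height, tile)
    touched = set()
    for y in range(groups_y * tile):
        for x in range(groups_x * tile):
            # Bounds check: threads in the padding of the last groups do nothing.
            if x < width and y < height:
                touched.add(flat_index(x, y, width))
    return touched


print(groups_2d(3840, 2160))  # group counts for the 4K case
```

For the 4K case this gives 480 by 270 groups, so neither dimension comes close to a per-dimension cap of 65535 and no large one-dimensional counts are needed.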

@SnowWindSaveYou
Collaborator Author


I fixed it by using a for loop over the buffer size in the compute shader.
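
One way such a loop could work is a strided loop, where a fixed number of threads each clear several elements. The Python below only simulates that pattern; `clear_with_strided_loop` and the thread count are hypothetical, and the actual kernel in the commit may be structured differently.

```python
def clear_with_strided_loop(buffer, total_threads, clear_value=0):
    """Simulate a fixed-size dispatch where each thread clears several elements.

    Thread t handles indices t, t + total_threads, t + 2 * total_threads, ...
    so the dispatch size no longer has to grow with the buffer size.
    """
    buffer_size = len(buffer)
    # The outer loop stands in for threads that would run in parallel on the GPU.
    for thread_id in range(total_threads):
        for i in range(thread_id, buffer_size, total_threads):
            buffer[i] = clear_value


heads = [7] * 1000          # stale head pointers from the previous frame
clear_with_strided_loop(heads, total_threads=64)
```

The trade-off is that each thread does more work, but the number of thread groups stays constant regardless of screen resolution.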

@@ -16,7 +16,7 @@ RWByteAddressBuffer StartOffsetBuffer : register(u2);

 void createFragmentEntry(float4 col, float3 pos, uint uCoverage) {
     //Retrieve current Pixel count and increase counter
-    uint uPixelCount = FLBuffer.IncrementCounter();
+    uint uPixelCount = FLBuffer.IncrementCounter()+1;
Collaborator Author


I leave the first entry empty for safer debugging; otherwise it's easy to create an infinite loop and crash my computer.
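
One reading of this comment is that slot 0 can then serve as an end-of-list marker for the per-pixel linked lists. The sketch below simulates that in Python under stated assumptions: the head buffer is cleared to 0, and the class and field names are hypothetical rather than taken from the shader.

```python
NULL_INDEX = 0  # assumption: a cleared head pointer of 0 means "empty list"


class FragmentLists:
    """Hypothetical CPU model of per-pixel fragment linked lists."""

    def __init__(self, pixel_count, capacity):
        self.heads = [NULL_INDEX] * pixel_count
        # One extra slot so that index 0 is never used for a real fragment.
        self.next = [NULL_INDEX] * (capacity + 1)
        self.color = [None] * (capacity + 1)
        self.counter = 0

    def add(self, pixel, color):
        # Mirrors IncrementCounter() + 1: the counter's old value plus one.
        index = self.counter + 1
        self.counter += 1
        self.color[index] = color
        self.next[index] = self.heads[pixel]
        self.heads[pixel] = index

    def fragments(self, pixel):
        out = []
        index = self.heads[pixel]
        # Traversal stops at slot 0, which no real fragment can occupy.
        while index != NULL_INDEX:
            out.append(self.color[index])
            index = self.next[index]
        return out


lists = FragmentLists(pixel_count=4, capacity=8)
lists.add(2, "red")
lists.add(2, "blue")
lists.add(0, "green")
```

If the first fragment were stored at index 0 instead, a head pointer of 0 would be ambiguous between "empty" and "first fragment", which is the kind of situation that can make list traversal misbehave.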

@happy-turtle
Owner

happy-turtle commented Sep 15, 2022

Ah nice, I would still like to try a two-dimensional number of threads though. It would remove those big numbers. I can look at it at the next opportunity.

@happy-turtle happy-turtle merged commit a590df9 into happy-turtle:main Sep 18, 2022
@happy-turtle
Owner

Thanks a lot for this! The whole process is a lot more performant now. I added two-dimensional thread groups and did some cleanup to align more with the other shaders.
