-
-
Notifications
You must be signed in to change notification settings - Fork 21.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve glBufferSubData usage where safe #33527
Improve glBufferSubData usage where safe #33527
Conversation
842134a
to
1253a33
Compare
@@ -3099,7 +3099,7 @@ void RasterizerStorageGLES3::_update_material(Material *material) { | |||
} | |||
|
|||
glBindBuffer(GL_UNIFORM_BUFFER, material->ubo_id); | |||
glBufferSubData(GL_UNIFORM_BUFFER, 0, material->ubo_size, local_ubo); | |||
glBufferData(GL_UNIFORM_BUFFER, material->ubo_size, local_ubo, GL_STATIC_DRAW); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this is the only call using GL_STATIC_DRAW
instead of GL_DYNAMIC_DRAW
, is this on purpose or an oversight?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On purpose. The buffer was allocated a few lines above with GL_STATIC_DRAW. We match the access type at the time of creation. This is only a hint anyway.
This is a strange optimization, but if it works faster I guess its fine. Does it make sense only on GL or would desktop also take advantage of it too? |
@reduz the benefit only applies to iOS and some Android devices. On desktop there is no known benefit. I prefer to keep it in the ifdefs because of the risk of this interacting weird with bad drivers. |
Let's give it a spin in 3.2 beta 2 today and see whether users report any regression on GLES/WebGL. |
Thanks! |
Great. Thanks to you -- this makes a huge difference on mobile. On desktop, the improvement is barely noticeable so I share @clayjohn conservatism - I had to VSync off to go beyong the 60fps mark and I get around 900-950 fps in both cases (Win 10, OpenGL ES 2.0 Renderer: Quadro K2100M/PCIe/SSE2). |
I may be a couple of decades out of date on this, but if you are getting pipeline stalls as a result of trying to write to data that is still being used, an alternative is to create 3 buffers and use them in round robin fashion. That said, if you mark a buffer as dynamic you would assume it was doing this behind the scenes, but maybe not on some drivers. https://www.bfilipek.com/2015/01/persistent-mapped-buffers-in-opengl.html |
@lawnjelly You are absolutely right. Good pointer - thanks. I think we got 3 problems here in decreasing priority:
|
It sounds as though glBufferData is avoiding the stall because it is effectively saying 'I don't care about the old contents of the buffer, allocate me a completely new buffer'. So internally it is likely using the same amount of memory internally as if you explicitly allocated these yourself, so it may not be 'saving memory' versus 2 / 3 buffers as much as expected. Ok, you in theory you can save a bit by only allocating what is needed, but on the other hand this is traded off against the risk of potentially thrashing memory around to achieve this variable size allocation (or an allocation failure worse case!). So deciding on a maximum (or at least only changing it infrequently) would probably be my preference, as it is constant very small time. This is the usual divergence between general app memory allocation schemes and game allocation schemes. The optimal size to use sounds doable, via the methods you suggest. |
Agreed again. We are definitely not 'saving memory' by replacing |
Fixes: #23956
Special thanks to @oeleo1 for testing out and identifying the best candidates.
Tagged as ios, but this may benefit other platforms as well as certain drivers handle glBufferData better than glBufferSubData.
There may be other places where this optimization may be appropriate, however, these are the places @oeleo1 identified as being safest (minus a couple that I removed). For more information about the rationale see this excellent comment #23956 (comment)
This PR now also has the changes from #23956 (comment). This uses a glBufferData() orphaning trick to vastly speed up the usage of glBufferSubData on mobile and webgl