
Disable GPU threaded optimizations option (GL debug) #106589

Draft
wants to merge 1 commit into
base: master

lawnjelly

@lawnjelly lawnjelly commented May 19, 2025

We can disable indirectly by enabling verbose GL debug output.

Forward port of #106556

PR on hold for now as it requires testing and refinement with a user that has hardware which exhibits the problem in order to proceed.

Discussion

A long-standing issue on Windows with Nvidia drivers has been the threaded optimization (T.O.) driver setting, which seems to cause significant stuttering for some users.

Godot 4 tries the approach of using the Nvidia SDK to turn off the optimization directly in the driver (#71472), however that has caused a number of regressions (e.g. #85111).

Instead, users have reported that enabling OpenGL debug logging seems to fix the issue, presumably because it forces the driver to disable the feature. This is not ideal for exports, because Godot does significant processing to print these debug logs.

Here we add a similar feature which turns on debug logging, but provides a pass-through so that no processing takes place on the Godot side, minimizing the cost of the technique.
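As a rough illustration of the pass-through idea (not the code in this PR), a debug callback that ignores every message could look like the sketch below. `glEnable(GL_DEBUG_OUTPUT)` and `glDebugMessageCallback` are the standard KHR_debug / GL 4.3 entry points; the `GLDebugFunctions` struct, the function names and the local type stand-ins are hypothetical and only there so the sketch compiles without GL headers. Whether a debug context or synchronous output is also needed to make the driver drop T.O. is not confirmed here.

```cpp
// Minimal stand-ins for GL types and enums, so the sketch compiles without GL headers.
// In real code these come from the GL loader headers.
using GLenum = unsigned int;
using GLuint = unsigned int;
using GLsizei = int;
using GLchar = char;
constexpr GLenum GL_DEBUG_OUTPUT = 0x92E0;

// Same parameter list as GLDEBUGPROC. On Windows the real type also carries the
// APIENTRY calling convention, which is omitted here for brevity.
using GLDebugProc = void (*)(GLenum source, GLenum type, GLuint id, GLenum severity,
        GLsizei length, const GLchar *message, const void *user_param);

// Hypothetical holder for the two entry points, as a loader would resolve them.
struct GLDebugFunctions {
    void (*enable)(GLenum cap); // glEnable
    void (*debug_message_callback)(GLDebugProc callback, const void *user_param); // glDebugMessageCallback
};

// Pass-through callback: receives the driver's messages and discards them,
// so no formatting or printing happens on the engine side.
static void debug_message_passthrough(GLenum, GLenum, GLuint, GLenum, GLsizei, const GLchar *, const void *) {
}

// Registers the no-op callback, then switches debug output on.
inline void enable_passthrough_debug_output(const GLDebugFunctions &gl) {
    gl.debug_message_callback(debug_message_passthrough, nullptr);
    gl.enable(GL_DEBUG_OUTPUT);
}
```

The only engine-side cost in this sketch is the empty callback itself; any checking the driver does internally is outside the engine's control.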

Switching on and interactions with Nvidia specific code

This will probably take some discussion and bikeshedding as there are a number of ways of doing this, but after some testing of alternatives I'm currently most in favour of this approach:

  • Use the existing nvidia_disable_threaded_optimization as a master switch
  • Add a new tune_nvidia_driver setting

This latter setting determines two things:

  1. Whether the Nvidia driver is accessed at all (to change gsync & T.O.)
  2. Whether T.O. is disabled (using the driver directly, or indirectly via debug logging, determined by (1))
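To make the combinations easier to follow, here is a small sketch of how the two settings could map to behaviour. It is an illustration of the description above, not the PR's code: the struct and function names are hypothetical, and the case where the master switch is off but `tune_nvidia_driver` is on (driver still accessed for gsync, T.O. left alone) is an assumption drawn from point (1).

```cpp
// Hypothetical illustration only; names are not taken from the PR's code.
struct NvidiaPlan {
    bool access_driver;          // Touch the Nvidia driver at all (gsync, and possibly T.O.).
    bool disable_to_in_driver;   // Turn T.O. off directly through the driver.
    bool disable_to_via_logging; // Turn T.O. off indirectly via GL debug logging.
};

// disable_threaded_opt: value of nvidia_disable_threaded_optimization (master switch).
// tune_driver: value of the proposed tune_nvidia_driver setting.
constexpr NvidiaPlan plan_from_settings(bool disable_threaded_opt, bool tune_driver) {
    return NvidiaPlan{
        tune_driver,
        disable_threaded_opt && tune_driver,
        disable_threaded_opt && !tune_driver,
    };
}
```

With the master switch on and `tune_nvidia_driver` off, this yields the debug logging route and never touches the driver, which is the case aimed at avoiding the driver regressions.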

In the course of developing these PRs I also tried alternative approaches, like having an independent switch for nvidia disable and debug log disable. The problem here is that the nvidia code also controls gsync, and thus regressions with the driver occur if it is accessed at all (as I understand it). So the obvious options are 3 settings (disable nvidia, gsync nvidia and disable logging, with descriptions of the combinations possible) or these two presented in this PR. So far these two seem like the best compromise (albeit being a little tricky to explain).

Notes

  • Only disables for video adapters with "nvidia" in the name, and on Windows only.
  • I can't fully test this, as I don't have Windows or an Nvidia GPU, so it would be good if a user with the problem can test the artifacts from this PR.
  • We might not need the full complement of debug logging in order to trick the driver into disabling T.O. It might be possible to call a subset of the commands, or toggle the logging on then off. Unfortunately I can't test this as I don't have the required hardware, but I would welcome testing / modifications / follow-up PRs if we can minimize the logging requirements further.
  • verbose_logging takes precedence, but by definition if that is on, it will also disable threaded optimization. So probably no need to mention it in the docs.
  • Added a command line switch --allow-gdriver-to to override the project setting just in case an end user wants to play a game with multithread optimizations enabled. This also works for the nvidia driver method.
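The gating in the notes above could be sketched as follows. This is a simplified illustration, not the PR's implementation: the struct and function names are hypothetical, and matching the adapter name case-insensitively is an assumption (the note only says the name contains "nvidia").

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Checks for "nvidia" in the adapter name. Case-insensitive matching is an assumption.
inline bool adapter_is_nvidia(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name.find("nvidia") != std::string::npos;
}

// Hypothetical bundle of the inputs mentioned in the notes.
struct ThreadedOptInputs {
    bool on_windows;          // The workaround only applies on Windows.
    std::string adapter_name; // Name reported for the video adapter.
    bool project_setting;     // nvidia_disable_threaded_optimization
    bool cli_allow_to;        // True when --allow-gdriver-to was passed.
};

// Returns true when T.O. should be disabled for this run.
inline bool should_disable_threaded_opt(const ThreadedOptInputs &in) {
    if (in.cli_allow_to) {
        return false; // The end user override wins over the project setting.
    }
    if (!in.on_windows || !adapter_is_nvidia(in.adapter_name)) {
        return false;
    }
    return in.project_setting;
}
```

For example, an adapter named "NVIDIA GeForce GTX 1060" on Windows with the project setting on would be disabled, unless the game was started with `--allow-gdriver-to`.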

@lawnjelly lawnjelly added this to the 4.5 milestone May 19, 2025
@lawnjelly lawnjelly requested review from a team as code owners May 19, 2025 08:41
@lawnjelly lawnjelly force-pushed the disable_gpu_threaded_opt4 branch 3 times, most recently from 7ab7b4c to 40cceaf Compare May 19, 2025 10:39
@lawnjelly lawnjelly marked this pull request as draft May 19, 2025 11:28
@lawnjelly lawnjelly force-pushed the disable_gpu_threaded_opt4 branch from 40cceaf to 1601355 Compare May 19, 2025 11:55
@lawnjelly lawnjelly marked this pull request as ready for review May 19, 2025 13:05
@@ -2883,6 +2883,12 @@
<member name="rendering/gl_compatibility/nvidia_disable_threaded_optimization" type="bool" setter="" getter="" default="true">
If [code]true[/code], disables the threaded optimization feature from the NVIDIA drivers, which are known to cause stuttering in most OpenGL applications.
[b]Note:[/b] This setting only works on Windows, as threaded optimization is disabled by default on other platforms.
[b]Note:[/b] If [member rendering/gl_compatibility/tune_nvidia_driver] is set to [code]true[/code] the feature will be disabled directly in the driver.

This essentially means this setting doesn't have an effect if rendering/gl_compatibility/tune_nvidia_driver is true, no? What does it entail for the user?


@lawnjelly lawnjelly May 19, 2025

No, that's what this PR is for. This PR can disable threaded optimization indirectly via OpenGL logging.

As I understand it, threaded optimization is a feature that may be:

  • Off
  • Auto
  • On

With settings per app set in the driver settings.

We offer options in Godot primarily to try to disable T.O. if it is on (as it can cause stutter).
We have two ways of doing this:

  • By accessing the Nvidia driver
  • By turning on OpenGL logging (which has the side effect of disabling)

It is kind of difficult to get your head around (and explain in the docs). Alternatively we could be more concise and offer up less info, and leave it to users to investigate how this works. 🤔

@lawnjelly lawnjelly force-pushed the disable_gpu_threaded_opt4 branch 4 times, most recently from d26d1e5 to 5b658b8 Compare May 19, 2025 16:02
@lawnjelly lawnjelly force-pushed the disable_gpu_threaded_opt4 branch from 5b658b8 to 7c72557 Compare May 19, 2025 16:10
@KeyboardDanni

Even if the debug output isn't getting piped anywhere, there could be overhead from running all the debug checks on the driver side. We should probably do a performance comparison between the driver profile method and this PR to see if there are any performance regressions.

I know it's the Compatibility renderer, but many of us have good reasons to use it as the main renderer instead of a fallback. And GL draw call overhead is bad enough as it is without debug checks.

@lawnjelly

> Even if the debug output isn't getting piped anywhere, there could be overhead from running all the debug checks on the driver side. We should probably do a performance comparison between the driver profile method and this PR to see if there are any performance regressions.

Sure, that's why it's optional.
The point here is not performance (versus the driver), it's to avoid the regressions involved with trying to access the driver.

Essentially when I started porting the driver access code to 3.x, @akien-mga was pretty skeptical as afaik it's turned out to be a minefield for regressions. Hence why I'm keen to lead with the debug logging approach on 3.x. We already have users switching on verbose_logging to try and make releases in 3.x, so this is clearly a problem that needs solving, and I'm just aiming to make it more efficient.

For 4.x, this PR is just a forward port as a courtesy. It's up to you guys whether you want to struggle with the driver access approach or try out this simpler version.

@KeyboardDanni

To be clear, I'm also not a fan of the driver profile approach because it's basically polluting the user's machine with lots of different app profiles. I also wasn't aware that folks were shipping with the debug logging workaround, so if this works for devs already, I think that's pretty cool.

It's just kind of a weird way to tackle the problem, even if we don't have much choice. So I think it's worth looking to see if this method could cause regressions as well.

@lawnjelly lawnjelly marked this pull request as draft May 21, 2025 16:34