Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update gfx_sdl2.cpp to fix macOS Metal Stutter/Slowdown. #806

Closed
wants to merge 1 commit into from

Conversation

ReddestDream
Copy link
Contributor

@ReddestDream ReddestDream commented Jan 29, 2025

Fixes stutter/inability to lock 60 consistently on macOS using Metal. May also provide performance benefits under OpenGL and Linux. This is the weird 57-58 fps issue noticed on Metal graphics on Macs for a long while.

Thanks to @Spodi, @Malkierian, ProxySaw, and Fenrir for their help on Discord getting this figured out! :D

Fixes stutter/inability to lock 60 consistently on macOS using Metal. May also provide performance benefits under OpenGL and Linux.
@ReddestDream ReddestDream changed the title Update gfx_sdl2.cpp to fix macOS Metal Stutter. Update gfx_sdl2.cpp to fix macOS Metal Stutter/Slowdown. Jan 29, 2025
@larsy1995
Copy link
Contributor

I can confirm that this one works. Tested it a fair bit, game runs at a smooth fps except during loading/compilation stutters which is expected as of now.

@ReddestDream
Copy link
Contributor Author

ReddestDream commented Jan 29, 2025

Yeah. This does not fix shader compile stutters unfortunately. This is just for the main slowdown/stutter issue.

@larsy1995
Copy link
Contributor

Also, fyi, you've @ a random person instead of me, which is fair since you didn't know my GitHub account.

Copy link
Collaborator

@briaguya-ai briaguya-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this changes behavior not just on mac but also on linux, which leads into my main question of: where has this been tested?

linking a testing PR on a port would be great.

@@ -615,10 +615,8 @@ static inline void sync_framerate_with_timer() {

const int64_t next = previous_time + 10 * FRAME_INTERVAL_US_NUMERATOR / FRAME_INTERVAL_US_DENOMINATOR;
int64_t left = next - t;
#ifdef _WIN32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just trying to track down this change to see if it makes sense to have this on all platforms

#ifdef _WIN32
    // We want to exit a bit early, so we can busy-wait the rest to never miss the deadline
    left -= 15000UL;
#endif

was added in #330

blame prior to that was d600f10#diff-d1ca6db8c1db9fb811a90932b7e7e9d4d4bbea0b345cd67c4c73ea5c25a54628

-    t = SDL_GetPerformanceCounter();
+    t = qpc_to_100ns(SDL_GetPerformanceCounter());

-    const int64_t next = qpc_to_100ns(previous_time) + 10 * FRAME_INTERVAL_US_NUMERATOR / FRAME_INTERVAL_US_DENOMINATOR;
-    const int64_t left = next - qpc_to_100ns(t);
+    const int64_t next = previous_time + 10 * FRAME_INTERVAL_US_NUMERATOR / FRAME_INTERVAL_US_DENOMINATOR;
+    const int64_t left = next - t;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only tested on an M1 Air, macOS 15.2 so far. Will test on Steam Deck soon.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why am I on my school account suddenly?…

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also test it on Windows if that is preferable?

@ReddestDream
Copy link
Contributor Author

it looks like this changes behavior not just on mac but also on linux, which leads into my main question of: where has this been tested?

It's been tested on Apple Silicon Macs. My personal M1 MBA and that of another person, Fenrir, on Discord (who is also larsy1995).

I'm hoping to test on Linux later, but if anyone can help me test, it would be appreciated. My understanding is that sched_yield() is the correct POSIX equivalent for YieldProcessor, so that should work there as well.

left -= 15000UL;

Regarding this, @Spodi suggested this was also necessary to go along with the first part. I agree it also needs testing outside of just Apple Silicon Macs.

@larsy1995
Copy link
Contributor

Here’s the build with those changes based on my Game Mode PR:
https://github.com/larsy1995/Shipwright/actions/runs/13037943737

@ReddestDream
Copy link
Contributor Author

Here’s the build with those changes based on my Game Mode PR: https://github.com/larsy1995/Shipwright/actions/runs/13037943737

If anyone can help test this build, particularly on Linux and Intel Macs, it would be greatly appreciated. There is no expected behavior change on Windows, but confirmation is always appreciated. Thanks!

Copy link
Collaborator

@briaguya-ai briaguya-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it might make sense to use mac ifdefs in here to isolate the changes, and then after that lands opening another PR removing the ifdefs and giving people on linux a chance to test it out

t = qpc_to_100ns(SDL_GetPerformanceCounter());
}
#endif
t = qpc_to_100ns(SDL_GetPerformanceCounter());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems strange to have this extra

t = qpc_to_100ns(SDL_GetPerformanceCounter());

in here.

it seems it was always an odd "extra" call for windows, and it makes sense that it existed for other platforms because all of the other ones were inside the windows ifdef

@ReddestDream
Copy link
Contributor Author

Okay. Let me clean up and redo. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants