Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch insertq #635

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Patch insertq #635

wants to merge 6 commits into from

Conversation

OFFTKP
Copy link
Contributor

@OFFTKP OFFTKP commented Aug 28, 2024

No description provided.

@georgemoralis
Copy link
Collaborator

Surely need to add a check for cpu's (AMD) that has SSE4.2a

@georgemoralis
Copy link
Collaborator

one game that uses it , is shadowman remaster

@OFFTKP
Copy link
Contributor Author

OFFTKP commented Aug 28, 2024

Reviewable

@@ -108,9 +118,7 @@ static Xbyak::Reg AllocateScratchRegister(
UNREACHABLE_MSG("Out of scratch registers!");
}

#ifdef __APPLE__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was very intentionally ifdef'd out on non-Apple as it relies on their conventions for TLS in the GS segment register.

Copy link
Contributor Author

@OFFTKP OFFTKP Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm in that case I could push/pop instead of using the Allocate functions if that has no side effects, or use some different storage

Copy link
Contributor

@squidbus squidbus Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my experiements writing this system you cannot reliably use the stack or any registers without saving them somewhere first. The stack because some games have functions that use stack area below the stack pointer, so you don't have any way to know for sure whether you're clobbering data or not.

My solution was to save registers to TLS since it needs to be per-thread storage (and I have some code that saves the stack to TLS and allows arbitrary use of a replacement stack to save registers), but it still relies on the Apple TLS convention as you can do a simple write to gs + offset without any scratch registers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, if there's a way to implement the save on other OSes without any scratch space that would resolve the issue. The rest of the code shouldn't have any OS-specific issues.

@OFFTKP OFFTKP force-pushed the insertq branch 3 times, most recently from e77e045 to cde1621 Compare September 1, 2024 13:21
src/core/cpu_patches.cpp Show resolved Hide resolved
src/core/cpu_patches.cpp Outdated Show resolved Hide resolved
@OFFTKP OFFTKP mentioned this pull request Sep 16, 2024
@OFFTKP OFFTKP force-pushed the insertq branch 2 times, most recently from 13c17b9 to d46b944 Compare September 23, 2024 16:35
@OFFTKP OFFTKP force-pushed the insertq branch 2 times, most recently from e196849 to 62a28b2 Compare September 23, 2024 17:13
@OFFTKP
Copy link
Contributor Author

OFFTKP commented Sep 23, 2024

Changed some stuff so it works with the illegal instruction handler, feel free to review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants