[lldb][AArch64] Fix Apple M4 on Linux #135563
This architecture implements SSVE but does not implement SVE.
@llvm/pr-subscribers-lldb
Author: Marcel Laverdet (laverdet)
Changes
This architecture implements SSVE but does not implement SVE.
More information is included in #121693
cc: @DavidSpickett
Full diff: https://github.com/llvm/llvm-project/pull/135563.diff
1 Files Affected:
diff --git a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
index 884c7d4b9e359..f540a160c901a 100644
--- a/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
+++ b/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
@@ -107,19 +107,19 @@ NativeRegisterContextLinux::CreateHostNativeRegisterContextLinux(
     if (NativeProcessLinux::PtraceWrapper(PTRACE_GETREGSET,
                                           native_thread.GetID(), &regset,
                                           &ioVec, sizeof(sve_header))
-            .Success()) {
+            .Success())
       opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskSVE);
-      // We may also have the Scalable Matrix Extension (SME) which adds a
-      // streaming SVE mode.
-      ioVec.iov_len = sizeof(sve_header);
-      regset = NT_ARM_SSVE;
-      if (NativeProcessLinux::PtraceWrapper(PTRACE_GETREGSET,
-                                            native_thread.GetID(), &regset,
-                                            &ioVec, sizeof(sve_header))
-              .Success())
-        opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskSSVE);
-    }
+    // We may also have the Scalable Matrix Extension (SME) which adds
+    // a streaming SVE mode. Note that SVE and SSVE may implemented
+    // independently, which is true on Apple's M4 architecture.
+    ioVec.iov_len = sizeof(sve_header);
+    regset = NT_ARM_SSVE;
+    if (NativeProcessLinux::PtraceWrapper(PTRACE_GETREGSET,
+                                          native_thread.GetID(), &regset,
+                                          &ioVec, sizeof(sve_header))
+            .Success())
+      opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskSSVE);
     sve::user_za_header za_header;
     ioVec.iov_base = &za_header;
|
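For readers skimming the diff, the behavioural change can be modelled in isolation. This is only a sketch, not the lldb code: the two booleans stand in for the results of the PTRACE_GETREGSET probes for NT_ARM_SVE and NT_ARM_SSVE, and the mask names and both function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the optional register set mask bits.
enum RegsetMask : uint32_t {
  kRegsetMaskSVE = 1u << 0,
  kRegsetMaskSSVE = 1u << 1,
};

// Old shape: the SSVE probe result is only consulted when the SVE probe
// succeeded, so an SSVE-only core ends up with neither bit set.
inline uint32_t DetectNested(bool sve_probe_ok, bool ssve_probe_ok) {
  uint32_t mask = 0;
  if (sve_probe_ok) {
    mask |= kRegsetMaskSVE;
    if (ssve_probe_ok)
      mask |= kRegsetMaskSSVE;
  }
  return mask;
}

// New shape: the two probe results are handled independently.
inline uint32_t DetectIndependent(bool sve_probe_ok, bool ssve_probe_ok) {
  uint32_t mask = 0;
  if (sve_probe_ok)
    mask |= kRegsetMaskSVE;
  if (ssve_probe_ok)
    mask |= kRegsetMaskSSVE;
  return mask;
}
```

For the SSVE-only case (SVE probe fails, SSVE probe succeeds), `DetectNested(false, true)` yields an empty mask while `DetectIndependent(false, true)` yields only the SSVE bit. When both probes succeed the two functions agree.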
Closing this since Docker on macOS simply disabled SME, SVE, etc. I believe this is still an issue in theory for users with Linux installed directly on M4 hardware but if such a user exists I haven't heard of them. |
As the patch notes, Apple's M4 has the SME register & instructions, plus Streaming SVE Mode and the SVE register set, but most of the SVE instructions are not supported. I believe the SVE registers (z0-31, p0-15) are also only available when the core is in Streaming SVE Mode. I guess the main concern would be someone treating "this core has SVE registers" (true) as if it meant "this core can run SVE API tests" (most likely false). But as far as the patch goes, it looks good to me. While Docker might not virtualize SME, the Darwin kernel does support it, and Linux running in a VM will have access to these hardware resources on an M4 system. |
I think I can test this on Arm's Foundation Model, I will do that and get back to you. I have not checked it before now. |
Wouldn't this still be an issue if running virtualized Linux via some other app than Docker, e.g. VMWare, UTM, Parallels etc, not requiring fully native Linux? (AFAIK upstream Asahi Linux doesn't yet support M3/M4.) |
FWIW, it would be super convenient if QEMU could be set up to emulate this precise configuration. You don't happen to have connections to someone who could be prodded into implementing it? :-) |
@jannau helped figure out a bit more on this; it's probably not Docker itself that took any action on the matter, but an updated kernel probably did. See https://lore.kernel.org/linux-arm-kernel/20250103142635.1759674-1-maz@kernel.org/ - which is backported down to 6.6 now. This patch makes sure that the kernel doesn't enable SVE2 (and other SVE related features) unless the main SVE feature is enabled. Therefore, this situation should only be an issue with older kernels, so perhaps it is not something that regular user mode applications should need to worry about (unless specifically wanting to run with older kernels). That doesn't explain why SME is no longer enabled though, but that may be due to https://lore.kernel.org/qemu-devel/20250315061801.622606-21-mjt@tls.msk.ru/. (The remaining open question is whether Windows, virtualized on similar HW, has a similar condition for their SVE feature flags.) |
I have got Arm's Foundation Model to boot with SME only, these are the options if you are used to using shrinkwrap to run the model already:
(if you are not used to that, I'll write something up when I get some time) The cpuinfo is:
It reports Also, SME is disabled in kernel config for unrelated reasons, so I had to re-enable that:
I will find out what QEMU can do / plans to do. Assuming I reproduce the failures this PR aims to fix, I'll write up an issue with how to reproduce it without an actual M4 and we can mark this as the fix for it. |
I will test this PR with an older kernel as well. Would be nice if it works there too. |
At least for Linaro, there are no plans to implement this. QEMU doesn't try to be a completely general model so I suspect until there is a common CPU that does this, or some system standard that requires it, it would not be a priority. Someone else could try hacking it in and see if it's feasible to contribute such a mode. I have no estimate of how much work it would be. In the meantime there's Arm's Foundation Model, though it only does whole system emulation. |
Sorry it's taken me ages to get to this, but I have finally tested it on Arm's Foundation Model and raised #138717 to document that. Initial impression is that this stops lldb-server crashing but there are issues debugging from there. We can consider merging this as a strict improvement over crashing on startup 🤣 But give me some time to try more examples and figure out the scale of changes needed to properly support this. |
From what I've seen, this is a decent start but there are further issues to be dealt with. Details on #138717. I have to work on some other SME changes first, so it will be a few weeks until I can do anything for this. @laverdet if you want to pursue this yourself in the meantime, feel free to do so. In which case you will find https://lldb.llvm.org/resources/debugging.html# useful, and you can try setting up the Foundation Model to test SVE+SME if you want, but since I'll want to test the changes myself anyway, easier to leave that to me. |
I think there are kernel issues that need to be fixed before all the LLDB features can work. So don't waste your own time on this right now, I will coordinate with Arm's kernel team to get this working. |
Did you manage to test things with an older kernel, at least on the level of what hwcaps are presented - to confirm you'd get the inconsistent hwcaps in that case (sve2 enabled, sve1 disabled)? |
I checked cpuinfo and hwcaps for 6.5 (doesn't have the fix) and 6.15 (does, and it was what I was using anyway). Same machine configuration, SME only.
The cpuinfo reports SME and SVE2, no SVE but some SVE sub features like svebf16 are there (though they might be part of SVE2).
With the fix it now reports SME and no SVE features at all. Decoding the HWCAPS I saw this difference:
Some of this is the kernel gaining new feature support. The relevant bits are that 6.15 removes HWCAP2_SVE2, HWCAP2_SVEI8MM, HWCAP2_SVEBF16 and HWCAP2_SVE_EBF16. And side note: a kernel developer told me that you can simulate this in qemu if you tell the kernel to hide the SVE feature using some sort of boot parameter. I haven't found out what yet, but the effect is equivalent for this purpose. |
So if you have code that wants to use SVE2 and may run on an SME only device with this older kernel, I think you could make it check for HWCAP_SVE and HWCAP2_SVE2. As the Architecture manual says:
Meaning that from userspace:
If you are making changes like this, I can double check with Arm's kernel team that your approach is what they expect software to do. I think what I've suggested would work, but not sure that the kernel authors want us doing it that way. |
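The userspace check suggested in the comment above could look roughly like this. It is a sketch under assumptions, not something confirmed by Arm's kernel team: the helper name `CanUseSve2` is made up, and the bit positions are copied from the Linux arm64 uapi header `<asm/hwcap.h>` so that the snippet also compiles on non-AArch64 hosts.

```cpp
#include <cstdint>

// Bit positions as defined in the Linux arm64 uapi header <asm/hwcap.h>.
constexpr uint64_t kHwcapSve = 1ULL << 22;  // HWCAP_SVE, reported in AT_HWCAP
constexpr uint64_t kHwcap2Sve2 = 1ULL << 1; // HWCAP2_SVE2, reported in AT_HWCAP2

// Hypothetical helper: only treat SVE2 as usable when the base SVE hwcap is
// also reported. On an SME-only machine running a kernel without the fix,
// HWCAP2_SVE2 may be set while HWCAP_SVE is clear.
constexpr bool CanUseSve2(uint64_t hwcap, uint64_t hwcap2) {
  return (hwcap & kHwcapSve) != 0 && (hwcap2 & kHwcap2Sve2) != 0;
}
```

On AArch64 Linux the two arguments would come from `getauxval(AT_HWCAP)` and `getauxval(AT_HWCAP2)` in `<sys/auxv.h>`, and code built there can use the `HWCAP_SVE` and `HWCAP2_SVE2` macros directly instead of the local constants.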
Thanks! So for me it’s mainly whether I should pursue changes like https://code.videolan.org/videolan/dav1d/-/merge_requests/1787 in dav1d, ffmpeg and x264. (Individual functions only check the sve2 flag internally, e.g. https://code.videolan.org/videolan/dav1d/-/blob/1.5.1/src/arm/mc.h?ref_type=tags#L101. Changing all such occasions to check both sve and sve2 flags internally would be brittle. Therefore I could maybe do what that merge request does, to account for this in the internal setting up of flags.) But as this is only a historical issue with older kernels, I’m leaning towards just skipping it - at least until some user reports actually hitting it. |
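To illustrate the "account for this in the internal setting up of flags" idea from the comment above, here is a minimal sketch. The flag enum and function name are invented for illustration and are not taken from dav1d, ffmpeg or x264; the linked merge request may do this differently.

```cpp
#include <cstdint>

// Hypothetical internal CPU feature flags, not taken from any real project.
enum CpuFlag : uint32_t {
  kCpuFlagNeon = 1u << 0,
  kCpuFlagSve = 1u << 1,
  kCpuFlagSve2 = 1u << 2,
};

// Normalize once at startup: drop SVE2 when base SVE is missing, so that
// individual functions can keep testing only the SVE2 flag.
inline uint32_t NormalizeCpuFlags(uint32_t flags) {
  if ((flags & kCpuFlagSve) == 0)
    flags &= ~static_cast<uint32_t>(kCpuFlagSve2);
  return flags;
}
```

The design choice here is that the dependency between the flags is enforced in one place, rather than at every call site that checks for SVE2.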
Sounds sensible. If docker has bundled the fixed kernel, then that's most of the M4 users covered and anyone else can update their kernel. |
@laverdet I've been told we need kernel changes to handle parts of this. Those are planned, and I will work on the lldb side once they are available. In the meantime, this patch does prevent lldb crashing but I'm not comfortable merging it when other features won't work. If we get to the next release and we don't have a complete solution, we can consider whether to commit this as a temporary workaround. If you do use lldb with this patch, I've no doubt you'll find other problems, so please add them to #138717 so I know to check them with the new changes. |