-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return f32 and f64 in XMM0 instead of FP0 on i686 #115919
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @thomcc (or someone else) soon. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
library/std/src/f32/tests.rs
Outdated
// Ignore test on x87 floating point, these platforms do not guarantee NaN | ||
// payloads are preserved and flush denormals to zero, failing the tests. | ||
#[cfg(not(target_arch = "x86"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tests still should be ignored on pre-SSE x86 targets. We have a bunch of i586 tier2 targets and others might be running tests for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Running the i586 tests also required changes to rustc_ty_utils which I'm not happy with. I could not figure out how to properly detect SSE2 support - target.features is empty on i586-unknown-linux-gnu at that point.
2212887
to
19acbf4
Compare
@rustbot label +A-abi +A-floating-point +O-x86 -T-libs |
Can I ask that we not make floating-point ABI even more complicated on x86? Not to mention my headache trying to catalog all of the different x86 float ABIs. |
@rust-lang/opsem and probably @rust-lang/lang definitely needs to look at this for the reason in #116344. When compiling for i586, local |
Given that we are also discussing making SSE/SSE2 mandatory for i686 targets with SSE2 as part of their baseline features, I think we should forge ahead, because this actually makes it more like x86-64's ABI. |
compiler/rustc_ty_utils/src/abi.rs
Outdated
@@ -371,6 +371,8 @@ fn fn_abi_new_uncached<'tcx>( | |||
let target = &cx.tcx.sess.target; | |||
let target_env_gnu_like = matches!(&target.env[..], "gnu" | "musl" | "uclibc"); | |||
let win_x64_gnu = target.os == "windows" && target.arch == "x86_64" && target.env == "gnu"; | |||
let x86_sse2 = | |||
target.arch == "x86" && (target.features.contains("sse2") || target.cpu == "pentium4"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect you'd want to query the sse2
target feature rather than checking for the pentium4
cpu, since e.g. penryn
also supports sse2
but isn't the pentium4
. https://rust.godbolt.org/z/j7oPE6Wrf
also, what if there's a soft-float target for an OS kernel where sse2
would be disabled even if the cpu is pentium4
due to the kernel not saving SSE registers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GuentherVIII If you handle the aforementioned edge-cases appropriately, I believe we can get this approved. We want to only do this if the target is a 32-bit x86 CPU that has "baseline" features including SSE2, and preferably only with the function also satisfying the appropriate target feature (which we wish to make "always" but for now we can afford to check twice for correctness). This obviously includes Pentium 4 CPUs with no other target feature settings, but it won't include any that, say, are softfloat by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, in other words, only if on x86-32 and SSE2
is implied by the target structs/json ignoring any -C target-features
or -C target-cpu
or other overrides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does rust currently ask LLVM for the target features implied by the target cpu? I could only find the code that creates the list for #[cfg(target_feature)]
, which takes command-line options into account.
I've changed the implementation to check that list instead of target.features and target.cpu. This has the advantage that the standard library can activate the NAN-returning test precisely when it passes, and deactivating SSE with -C target-features still works.
There's also a second commit introducing yet another target option to make this configurable in the target specification and enabling it in all "i686" targets except i686_unknown_uefi, which disables SSE. The standard
I do not think we should make the ABI dependant on #[target_feature]
- unlike -C target-feature, the attribute cannot disable SSE. If someone disables SSE with -C target-feature and re-enables it for select functions with #[target_feature]
, this shouldn't result in incompatible function ABIs with the same function type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the issue you'll run into with -C target-feature=-sse
is that functions in your binary won't match the ABI of other code (such as std
) with SSE2 enabled in the default target features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's talk about forbidding going below the baseline. If you want less-than-baseline features you need a different target.
I think it just means we want to have both of these in the set of baseline features. But it means we should maybe block this PR on having the concept of "baseline features" rolled out so people get at least a warning when they disable a feature that should not be disabled? |
r? compiler |
…tion i686 already uses SSE2 to do calculations with f32 and f64, but the C calling convention uses the x87 stack to return values. The Rust calling convention does not need to do this, and LLVM makes it easy to use XMM0 instead, which saves move instructions and fixes problems with NaN values.
19acbf4
to
2b153e7
Compare
These commits modify compiler targets. |
This comment has been minimized.
This comment has been minimized.
Activating SSE2 with -C target-feature doesn't change the Rust ABI anymore, while deactivating it still does.
2b153e7
to
b7dfd48
Compare
r? compiler |
☔ The latest upstream changes (presumably #117716) made this pull request unmergeable. Please resolve the merge conflicts. |
Closing because #116584 was closed. |
This got superseded by #123351 which avoids the ABI issue by returning things in-memory. |
returning stuff in xmm0 is still better since it avoids extra load/store, so I think we still want this PR (or similar) but it's not as critical since the corrupting-float-returns bug is now fixed. |
i686 already uses SSE to do calculations with f32 and f64, but the C calling convention uses the x87 stack to return values. The Rust calling convention does not need to do this, and LLVM makes it easy to use XMM0 instead, which saves move instructions and fixes problems with NaN values.
See #115567, #91399 (comment) and this LLVM comment.