-
Couldn't load subscription status.
- Fork 13.9k
Add classify and related methods for f16 and f128 #127020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d095e37 to
6e8e9f2
Compare
|
@rustbot label +F-f16_and_f128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of nits with the comments to make them more relevant to the specific types.
library/core/src/num/f16.rs
Outdated
| // On some processors, in some cases, LLVM will "helpfully" lower floating point ops, | ||
| // in spite of a request for them using standard IEEE types, to things like x87 operations. | ||
| // These have an f64's mantissa, but can have a larger than normal exponent. | ||
| // FIXME(jubilee): Using x87 operations is never necessary in order to function | ||
| // on x86 processors for Rust-to-Rust calls, so this issue should not happen. | ||
| // Code generation should be adjusted to use non-C calling conventions, avoiding this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On x86, LLVM appears to correctly round f16s between each operation even when SSE is disabled and x87 operations are used (when SSE is disabled they are passed and returned like u16s), so x87 shouldn't have any issues of this kind with f16. You could change this (and the various other x87 mentions in this file) to instead mention WASM, which has issues with using f32 precision when f16 is used (llvm/llvm-project#96437 and llvm/llvm-project#96438), although since the examples are mainly illustrative, I wouldn't think this is a blocker or anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworded this as well as the f128 version. If you have specific wording that is more accurate, I can update it again.
I also removed partial_classify from f128 and switched to only use classify_bits, since I don't think there is any case where that can wind up wrong because of extension/truncation, unlike the other types. Let me know if that is inaccurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wording looks good. There's still a paragraph about x87 in f16::from_bits that you can probably just remove (or mention the WASM issues there instead).
You can replace the entire f128::classify() implementation with Self::classify_bits(mem::transmute(self)): as you say, there aren't the same excess precision problems that other floating-point types have, and just classifying the bits will save on calls to compiler builtins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even better, I did change to just using classify_bits.
The f16 paragraph should be gone now, just lost that change in my rebase.
6e8e9f2 to
a1939c5
Compare
a1939c5 to
128b8db
Compare
|
I know too little about the current state of |
128b8db to
d8c58a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, can you doublecheck if these doctests are correctly configured-off?
|
@workingjubilee they are all block-level |
| // SAFETY: `u16` is a plain old datatype so we can always transmute to it. | ||
| // ...sorta. | ||
| // | ||
| // It turns out that at runtime, it is possible for a floating point number | ||
| // to be subject to a floating point mode that alters nonzero subnormal numbers | ||
| // to zero on reads and writes, aka "denormals are zero" and "flush to zero". | ||
| // | ||
| // And, of course evaluating to a NaN value is fairly nondeterministic. | ||
| // More precisely: when NaN should be returned is knowable, but which NaN? | ||
| // So far that's defined by a combination of LLVM and the CPU, not Rust. | ||
| // This function, however, allows observing the bitstring of a NaN, | ||
| // thus introspection on CTFE. | ||
| // | ||
| // In order to preserve, at least for the moment, const-to-runtime equivalence, | ||
| // we reject any of these possible situations from happening. | ||
| #[rustc_const_unstable(feature = "const_float_bits_conv", issue = "72447")] | ||
| const fn ct_f16_to_u16(ct: f16) -> u16 { | ||
| // FIXME(f16_f128): we should use `.classify()` like `f32` and `f64`, but we don't | ||
| // yet want to rely on that on all platforms (ABI issues). So just classify the | ||
| // bits instead. | ||
|
|
||
| // SAFETY: this direction is a POD transmutation. We just can't return it unless | ||
| // it is normal, infinite, or zero. | ||
| let bits = unsafe { mem::transmute::<f16, u16>(ct) }; | ||
| match f16::classify_bits(bits) { | ||
| FpCategory::Nan => { | ||
| panic!("const-eval error: cannot use f16::to_bits on a NaN") | ||
| } | ||
| FpCategory::Subnormal => { | ||
| panic!("const-eval error: cannot use f16::to_bits on a subnormal number") | ||
| } | ||
| FpCategory::Infinite | FpCategory::Normal | FpCategory::Zero => bits, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason that I did it the other way around is that if we transmute a float, or do an operation on it otherwise, LLVM may change its mind about what the bits were, which makes me leery of classifying the bits post-facto like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that this would be better, but unfortunately calling .classify() is just nondeterministic on x86 because of ABI bugs. So while this depends on LLVM's consistency, I think it is better than producing a probably wrong answer (and the results are correct at this time on at least x86 and aarch64). At least to get things off the ground.
Anyway that's what the FIXME is supposed to be for, but happy adjust as you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
Can we be more specific in the comment than "ABI issues", then? Having done a lot of research on them, I feel like they're a giant umbrella of handwavium that people shove stuff under. "classify is nondet on x86, even before we involve the MXCSR" is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Const to_bits isn't strictly needed at this time of course, I just included it because it unblocks a surprising number of tests (especially clippy), either directly or through other const functions that rely on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed your comment, I updated this to be more exact.
d8c58a2 to
7258951
Compare
library/core/src/num/f128.rs
Outdated
| // SAFETY: this direction is a POD transmutation. We just can't return it unless | ||
| // it is normal, infinite, or zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already stated we don't want to return it and why, so minimize the SAFETY comment, assuming it is correct ( but see my other remark here: https://github.com/rust-lang/rust/pull/127020/files#r1677218387 ).
| // SAFETY: this direction is a POD transmutation. We just can't return it unless | |
| // it is normal, infinite, or zero. | |
| // SAFETY: this is a POD transmutation |
alternately,
| // SAFETY: this direction is a POD transmutation. We just can't return it unless | |
| // it is normal, infinite, or zero. | |
| // SAFETY: all 65536 bitpatterns can go in, all 65536 bitpatterns can go out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean all 340282366920938463463374607431768211456 bit patterns in this file :D updated to the simpler one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right :^)
|
...oh, I completely missed the trailing |
|
@tgross35 could we put the opening |
69afb49 to
00f3417
Compare
|
oh yes, it is not. |
|
@bors r- same thing |
|
@bors cancel |
These constifications were blocked on classification functions being added. Now that those methods are available, constify them. This brings things more in line with `f32` and `f64`.
421ca1a to
2393093
Compare
|
I got the float->int functions last time but not int->float 🤦. Hope this is about the last thing. @rustbot ready |
|
@bors r+ rollup=iffy |
|
☀️ Test successful - checks-actions |
|
Finished benchmarking commit (eb72697): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)This benchmark run did not return any relevant results for this metric. CyclesResults (secondary -3.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 700.525s -> 699.024s (-0.21%) |
Also constify some functions where that was blocked on classify being available.
r? libs