JDK-8318228: RISC-V: C2 ConvF2HF #17450

Hamlin-Li · 2024-01-16T18:43:03Z

Hi,
Can you review the patch to add ConvF2HF intrinsic to JDK for riscv？
Thanks!

Test

Functionality

hotspot tests

test/hotspot/jtreg/compiler/intrinsics/
test/hotspot/jtreg/compiler/c2/irTests

jdk tests

test/jdk/java/lang/Float/Binary16Conversion*.java

Performance

tested on licheepi.

with UseZfh enabled

Benchmark                                     (size)  Mode  Cnt     Score   Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    2  4170.549          ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    2    21.492          ns/op

with UseZfh disabled

(i.e. disable the intrinsic)

Benchmark                                     (size)  Mode  Cnt      Score   Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    2  25036.647          ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    2     27.326          ns/op

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8318228: RISC-V: C2 ConvF2HF (Sub-task - P4)

Reviewers

Fei Yang (@RealFYang - Reviewer)
Vladimir Kempik (@VladimirKempik - Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17450/head:pull/17450
$ git checkout pull/17450

Update a local copy of the PR:
$ git checkout pull/17450
$ git pull https://git.openjdk.org/jdk.git pull/17450/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17450

View PR using the GUI difftool:
$ git pr show -t 17450

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17450.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2024-01-16T18:43:54Z

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2024-01-16T18:45:37Z

@Hamlin-Li The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2024-01-16T18:48:32Z

Webrevs

01: Full - Incremental (01ec31c2)
00: Full (c732a3ff)

RealFYang

Hi, Thanks for this change. I have a small question.

RealFYang · 2024-01-18T07:07:30Z

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp

+  __ fmv_x_w(dst, src);
+
+  // preserve the payloads of non-canonical NaNs.
+  __ srai(dst, dst, 13);


I see the lowest 13 bits of the payload for src is simply discarded here. But these bits are also used for calculating the new significand bits for float16 [1]. So this doesn't seem OK to me. Did I miss anything?

[1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Float.java#L1112-L1113

It's discarded intentionally, just like in HF2F it's padded with zero in lower 13 bits

Right now, I'm not sure.
I have below patch:

diff --git a/src/java.base/share/classes/java/lang/Float.java b/src/java.base/share/classes/java/lang/Float.java index 7508c22d7f4..f96e23b568e 100644 --- a/src/java.base/share/classes/java/lang/Float.java +++ b/src/java.base/share/classes/java/lang/Float.java @@ -1108,9 +1108,7 @@ public static short floatToFloat16(float f) { // Preserve high order bit of float NaN in the // binary16 result NaN (tenth bit); OR in remaining // bits into lower 9 bits of binary 16 significand. - | (doppel & 0x007f_e000) >> 13 // 10 bits - | (doppel & 0x0000_1ff0) >> 4 // 9 bits - | (doppel & 0x0000_000f)); // 4 bits + | (doppel & 0x007f_e000) >> 13); // 10 bits } float abs_f = Math.abs(f);

And, test/jdk/java/lang/Float/Binary16ConversionNaN.java/Binary16Conversion.java both passed.

Either the tests(both library and hotspot) + intrinsics (not sure if intrinsics on other platforms need improvement) needs to be improved, or the code in library needs to be simplified. (To be frank, I don't think NaN needs such a complicated spec/design, but it depends on the spec).

I just filed a library bug to discuss it, JDK-8324212

Per discussion at: https://mail.openjdk.org/pipermail/riscv-port-dev/2022-December/000706.html, when NaN is used as input and output is also NaN, then there is no restriction on the exact NaN number, this is confirmed by the java library team. That means we can return any NaN if the input is an NaN. For floattoFloat16, the java spec is similar, floatToFloat16: If the argument is a NaN, the result is a NaN., it does not require how the result NaN layout should be.

So, I think we're fine to just discard the last 13 bits (as implemented in this patch), and in this way it's convenient for us as this will help to pass all the existing tests in hotspot.

But, for long term and performance consideration, I think we need to re-visit all the NaN related intrinsics in riscv to check if we need to treat NaN specially rather than just leveraging the riscv default behaviour. And I filed a bug to track this task: JDK-8324303

VladimirKempik · 2024-01-19T21:30:30Z

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp

+  // in riscv, NaN needs a special process as fcvt does not work in that case.
+
+  // check whether it's a NaN.
+  fclass_s(t0, src);


As showed roundD intrinsic PR, ( https://github.com/openjdk/jdk/pull/16382/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR4252 ) the feq_s(t0, src, src) + beqz(t0, label) seems to be a faster check for NaN, could you check the jmh numbers with feq_s ?

Thanks for the suggestion. Yes, it bring better performance.

After:

Benchmark (size) Mode Cnt Score Error Units Fp16ConversionBenchmark.floatToFloat16 2048 avgt 5 3753.179 ? 43.557 ns/op Fp16ConversionBenchmark.floatToFloat16Memory 2048 avgt 5 19.772 ? 0.860 ns/op

Before:

Benchmark (size) Mode Cnt Score Error Units Fp16ConversionBenchmark.floatToFloat16 2048 avgt 5 4099.820 ? 57.671 ns/op Fp16ConversionBenchmark.floatToFloat16Memory 2048 avgt 5 20.181 ? 0.108 ns/op

Hamlin-Li · 2024-01-23T08:24:47Z

@RealFYang
Based on the latest confirmation, we don't have to implemented in the same way as Float.java.

And per performance and test consideration, it's convenient for us to implement in current way.

But, for long term and performance consideration, I filed a bug to re-visit all the NaN related intrinsics in riscv: JDK-8324303

RealFYang

All right then! Thanks for finding this out.

openjdk · 2024-01-23T08:33:48Z

@Hamlin-Li This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8318228: RISC-V: C2 ConvF2HF

Reviewed-by: fyang, vkempik

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 89 new commits pushed to the master branch:

3696765: 8323964: runtime/Thread/ThreadCountLimit.java fails intermittently on AIX
5a74c2a: 8323438: Enhance assertions for Windows sync API failures
52523d3: 8324050: Issue store-store barrier after re-materializing objects during deoptimization
df370d7: 8314329: AgeTable: add is_clear() & allocation spec, and relax assert to allow use of 0-index slot
0d8543d: 8324065: Daylight saving information for Africa/Casablanca are incorrect
c9cacfb: 8323657: Compilation of snippet results in VerifyError at runtime with --release 9 (and above)
bde650f: 8322282: Incorrect LoaderConstraintTable::add_entry after JDK-8298468
be943a9: 8321984: IGV: Upgrade to Netbeans Platform 20
d3b2ac1: 8314186: runtime/8176717/TestInheritFD.java failed with "Log file was leaked"
72f1990: 8323057: Recoverable errors may be reported before unrecoverable errors when annotation processing is skipped
... and 79 more: https://git.openjdk.org/jdk/compare/b3634722655901b8d3e43dd1f8aa2b4487509a34...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Hamlin-Li · 2024-01-23T12:00:43Z

@VladimirKempik @RealFYang Thanks for your review.

/integrate

openjdk · 2024-01-23T12:02:29Z

Going to push as commit bcaad51.
Since your change was applied there have been 92 commits pushed to the master branch:

5acd37f: 8324207: Serial: Remove Space::set_saved_mark_word
f5e6d11: 8324210: Serial: Remove unused methods in Generation
bcb340d: 8324286: Fix backsliding on use of nullptr instead of NULL
3696765: 8323964: runtime/Thread/ThreadCountLimit.java fails intermittently on AIX
5a74c2a: 8323438: Enhance assertions for Windows sync API failures
52523d3: 8324050: Issue store-store barrier after re-materializing objects during deoptimization
df370d7: 8314329: AgeTable: add is_clear() & allocation spec, and relax assert to allow use of 0-index slot
0d8543d: 8324065: Daylight saving information for Africa/Casablanca are incorrect
c9cacfb: 8323657: Compilation of snippet results in VerifyError at runtime with --release 9 (and above)
bde650f: 8322282: Incorrect LoaderConstraintTable::add_entry after JDK-8298468
... and 82 more: https://git.openjdk.org/jdk/compare/b3634722655901b8d3e43dd1f8aa2b4487509a34...master

Your commit was automatically rebased without conflicts.

openjdk · 2024-01-23T12:02:36Z

@Hamlin-Li Pushed as commit bcaad51.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Hamlin-Li added 2 commits January 16, 2024 16:38

Initial commit: float to float16

a2b5025

clean code

c732a3f

openjdk bot added the rfr Pull request is ready for review label Jan 16, 2024

openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jan 16, 2024

RealFYang reviewed Jan 18, 2024

View reviewed changes

VladimirKempik reviewed Jan 19, 2024

View reviewed changes

replace fclass with feq as performance optimization.

01ec31c

RealFYang approved these changes Jan 23, 2024

View reviewed changes

openjdk bot added the ready Pull request is ready to be integrated label Jan 23, 2024

VladimirKempik approved these changes Jan 23, 2024

View reviewed changes

openjdk bot added the integrated Pull request has been integrated label Jan 23, 2024

openjdk bot closed this Jan 23, 2024

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 23, 2024

Hamlin-Li deleted the float-to-float16 branch January 29, 2024 15:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JDK-8318228: RISC-V: C2 ConvF2HF #17450

JDK-8318228: RISC-V: C2 ConvF2HF #17450

Hamlin-Li commented Jan 16, 2024 •

edited by openjdk bot

Loading

bridgekeeper bot commented Jan 16, 2024

openjdk bot commented Jan 16, 2024

mlbridge bot commented Jan 16, 2024 •

edited

Loading

RealFYang left a comment

RealFYang Jan 18, 2024

Hamlin-Li Jan 19, 2024 •

edited

Loading

Hamlin-Li Jan 19, 2024 •

edited

Loading

Hamlin-Li Jan 22, 2024 •

edited

Loading

VladimirKempik Jan 19, 2024 •

edited

Loading

Hamlin-Li Jan 22, 2024

Hamlin-Li commented Jan 23, 2024 •

edited

Loading

RealFYang left a comment

openjdk bot commented Jan 23, 2024 •

edited

Loading

Hamlin-Li commented Jan 23, 2024

openjdk bot commented Jan 23, 2024

openjdk bot commented Jan 23, 2024

JDK-8318228: RISC-V: C2 ConvF2HF #17450

JDK-8318228: RISC-V: C2 ConvF2HF #17450

Conversation

Hamlin-Li commented Jan 16, 2024 • edited by openjdk bot Loading

Test

Functionality

hotspot tests

jdk tests

Performance

with UseZfh enabled

with UseZfh disabled

Progress

Issue

Reviewers

Reviewing

Webrev

bridgekeeper bot commented Jan 16, 2024

openjdk bot commented Jan 16, 2024

mlbridge bot commented Jan 16, 2024 • edited Loading

Webrevs

RealFYang left a comment

Choose a reason for hiding this comment

RealFYang Jan 18, 2024

Choose a reason for hiding this comment

Hamlin-Li Jan 19, 2024 • edited Loading

Choose a reason for hiding this comment

Hamlin-Li Jan 19, 2024 • edited Loading

Choose a reason for hiding this comment

Hamlin-Li Jan 22, 2024 • edited Loading

Choose a reason for hiding this comment

VladimirKempik Jan 19, 2024 • edited Loading

Choose a reason for hiding this comment

Hamlin-Li Jan 22, 2024

Choose a reason for hiding this comment

Hamlin-Li commented Jan 23, 2024 • edited Loading

RealFYang left a comment

Choose a reason for hiding this comment

openjdk bot commented Jan 23, 2024 • edited Loading

Hamlin-Li commented Jan 23, 2024

openjdk bot commented Jan 23, 2024

openjdk bot commented Jan 23, 2024

Hamlin-Li commented Jan 16, 2024 •

edited by openjdk bot

Loading

mlbridge bot commented Jan 16, 2024 •

edited

Loading

Hamlin-Li Jan 19, 2024 •

edited

Loading

Hamlin-Li Jan 19, 2024 •

edited

Loading

Hamlin-Li Jan 22, 2024 •

edited

Loading

VladimirKempik Jan 19, 2024 •

edited

Loading

Hamlin-Li commented Jan 23, 2024 •

edited

Loading

openjdk bot commented Jan 23, 2024 •

edited

Loading