Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8318228: RISC-V: C2 ConvF2HF #17450

Closed
wants to merge 3 commits into from

Conversation

Hamlin-Li
Copy link

@Hamlin-Li Hamlin-Li commented Jan 16, 2024

Hi,
Can you review the patch to add ConvF2HF intrinsic to JDK for riscv?
Thanks!

Test

Functionality

hotspot tests

test/hotspot/jtreg/compiler/intrinsics/
test/hotspot/jtreg/compiler/c2/irTests

jdk tests

test/jdk/java/lang/Float/Binary16Conversion*.java

Performance

tested on licheepi.

with UseZfh enabled

Benchmark                                     (size)  Mode  Cnt     Score   Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    2  4170.549          ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    2    21.492          ns/op

with UseZfh disabled

(i.e. disable the intrinsic)

Benchmark                                     (size)  Mode  Cnt      Score   Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    2  25036.647          ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    2     27.326          ns/op

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17450/head:pull/17450
$ git checkout pull/17450

Update a local copy of the PR:
$ git checkout pull/17450
$ git pull https://git.openjdk.org/jdk.git pull/17450/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17450

View PR using the GUI difftool:
$ git pr show -t 17450

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17450.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 16, 2024

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 16, 2024
@openjdk
Copy link

openjdk bot commented Jan 16, 2024

@Hamlin-Li The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jan 16, 2024
@mlbridge
Copy link

mlbridge bot commented Jan 16, 2024

Webrevs

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Thanks for this change. I have a small question.

__ fmv_x_w(dst, src);

// preserve the payloads of non-canonical NaNs.
__ srai(dst, dst, 13);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the lowest 13 bits of the payload for src is simply discarded here. But these bits are also used for calculating the new significand bits for float16 [1]. So this doesn't seem OK to me. Did I miss anything?

[1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Float.java#L1112-L1113

Copy link
Author

@Hamlin-Li Hamlin-Li Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's discarded intentionally, just like in HF2F it's padded with zero in lower 13 bits

Copy link
Author

@Hamlin-Li Hamlin-Li Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, I'm not sure.
I have below patch:

diff --git a/src/java.base/share/classes/java/lang/Float.java b/src/java.base/share/classes/java/lang/Float.java
index 7508c22d7f4..f96e23b568e 100644
--- a/src/java.base/share/classes/java/lang/Float.java
+++ b/src/java.base/share/classes/java/lang/Float.java
@@ -1108,9 +1108,7 @@ public static short floatToFloat16(float f) {
                     // Preserve high order bit of float NaN in the
                     // binary16 result NaN (tenth bit); OR in remaining
                     // bits into lower 9 bits of binary 16 significand.
-                    | (doppel & 0x007f_e000) >> 13 // 10 bits
-                    | (doppel & 0x0000_1ff0) >> 4  //  9 bits
-                    | (doppel & 0x0000_000f));     //  4 bits
+                    | (doppel & 0x007f_e000) >> 13); // 10 bits
         }
 
         float abs_f = Math.abs(f);

And, test/jdk/java/lang/Float/Binary16ConversionNaN.java/Binary16Conversion.java both passed.

Either the tests(both library and hotspot) + intrinsics (not sure if intrinsics on other platforms need improvement) needs to be improved, or the code in library needs to be simplified. (To be frank, I don't think NaN needs such a complicated spec/design, but it depends on the spec).

I just filed a library bug to discuss it, JDK-8324212

Copy link
Author

@Hamlin-Li Hamlin-Li Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion at: https://mail.openjdk.org/pipermail/riscv-port-dev/2022-December/000706.html, when NaN is used as input and output is also NaN, then there is no restriction on the exact NaN number, this is confirmed by the java library team. That means we can return any NaN if the input is an NaN. For floattoFloat16, the java spec is similar, floatToFloat16: If the argument is a NaN, the result is a NaN., it does not require how the result NaN layout should be.

So, I think we're fine to just discard the last 13 bits (as implemented in this patch), and in this way it's convenient for us as this will help to pass all the existing tests in hotspot.

But, for long term and performance consideration, I think we need to re-visit all the NaN related intrinsics in riscv to check if we need to treat NaN specially rather than just leveraging the riscv default behaviour. And I filed a bug to track this task: JDK-8324303

// in riscv, NaN needs a special process as fcvt does not work in that case.

// check whether it's a NaN.
fclass_s(t0, src);
Copy link

@VladimirKempik VladimirKempik Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As showed roundD intrinsic PR, ( https://github.com/openjdk/jdk/pull/16382/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR4252 ) the feq_s(t0, src, src) + beqz(t0, label) seems to be a faster check for NaN, could you check the jmh numbers with feq_s ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Yes, it bring better performance.

After:

Benchmark                                     (size)  Mode  Cnt     Score    Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    5  3753.179 ? 43.557  ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    5    19.772 ?  0.860  ns/op

Before:

Benchmark                                     (size)  Mode  Cnt     Score    Error  Units
Fp16ConversionBenchmark.floatToFloat16          2048  avgt    5  4099.820 ? 57.671  ns/op
Fp16ConversionBenchmark.floatToFloat16Memory    2048  avgt    5    20.181 ?  0.108  ns/op

@Hamlin-Li
Copy link
Author

Hamlin-Li commented Jan 23, 2024

@RealFYang
Based on the latest confirmation, we don't have to implemented in the same way as Float.java.

And per performance and test consideration, it's convenient for us to implement in current way.

But, for long term and performance consideration, I filed a bug to re-visit all the NaN related intrinsics in riscv: JDK-8324303

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right then! Thanks for finding this out.

@openjdk
Copy link

openjdk bot commented Jan 23, 2024

@Hamlin-Li This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8318228: RISC-V: C2 ConvF2HF

Reviewed-by: fyang, vkempik

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 89 new commits pushed to the master branch:

  • 3696765: 8323964: runtime/Thread/ThreadCountLimit.java fails intermittently on AIX
  • 5a74c2a: 8323438: Enhance assertions for Windows sync API failures
  • 52523d3: 8324050: Issue store-store barrier after re-materializing objects during deoptimization
  • df370d7: 8314329: AgeTable: add is_clear() & allocation spec, and relax assert to allow use of 0-index slot
  • 0d8543d: 8324065: Daylight saving information for Africa/Casablanca are incorrect
  • c9cacfb: 8323657: Compilation of snippet results in VerifyError at runtime with --release 9 (and above)
  • bde650f: 8322282: Incorrect LoaderConstraintTable::add_entry after JDK-8298468
  • be943a9: 8321984: IGV: Upgrade to Netbeans Platform 20
  • d3b2ac1: 8314186: runtime/8176717/TestInheritFD.java failed with "Log file was leaked"
  • 72f1990: 8323057: Recoverable errors may be reported before unrecoverable errors when annotation processing is skipped
  • ... and 79 more: https://git.openjdk.org/jdk/compare/b3634722655901b8d3e43dd1f8aa2b4487509a34...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 23, 2024
@Hamlin-Li
Copy link
Author

@VladimirKempik @RealFYang Thanks for your review.

/integrate

@openjdk
Copy link

openjdk bot commented Jan 23, 2024

Going to push as commit bcaad51.
Since your change was applied there have been 92 commits pushed to the master branch:

  • 5acd37f: 8324207: Serial: Remove Space::set_saved_mark_word
  • f5e6d11: 8324210: Serial: Remove unused methods in Generation
  • bcb340d: 8324286: Fix backsliding on use of nullptr instead of NULL
  • 3696765: 8323964: runtime/Thread/ThreadCountLimit.java fails intermittently on AIX
  • 5a74c2a: 8323438: Enhance assertions for Windows sync API failures
  • 52523d3: 8324050: Issue store-store barrier after re-materializing objects during deoptimization
  • df370d7: 8314329: AgeTable: add is_clear() & allocation spec, and relax assert to allow use of 0-index slot
  • 0d8543d: 8324065: Daylight saving information for Africa/Casablanca are incorrect
  • c9cacfb: 8323657: Compilation of snippet results in VerifyError at runtime with --release 9 (and above)
  • bde650f: 8322282: Incorrect LoaderConstraintTable::add_entry after JDK-8298468
  • ... and 82 more: https://git.openjdk.org/jdk/compare/b3634722655901b8d3e43dd1f8aa2b4487509a34...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 23, 2024
@openjdk openjdk bot closed this Jan 23, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 23, 2024
@openjdk
Copy link

openjdk bot commented Jan 23, 2024

@Hamlin-Li Pushed as commit bcaad51.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@Hamlin-Li Hamlin-Li deleted the float-to-float16 branch January 29, 2024 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants