Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8333245: RISC-V: UseRVV option can't be enabled after JDK-8316859 #19472

Closed
wants to merge 1 commit into from

Conversation

zifeihan
Copy link
Member

@zifeihan zifeihan commented May 30, 2024

Because some dev boards only support RVV version 0.7, In JDK-8316859 we masked the use of HWCAP to probe for RVV extensions, and in the meantime, we can use hwprobe to probe for V extensions in Linux kernel 6.5 and above. But recently we got Banana Pi BPI-F3 board (has RVV1.0), but his kernel is 6.1.15, so the V extensions detected by HWCAP are masked. And we get the warning: RVV is not supported on this CPU when we enable UseRVV with the command, and we can't enable UseRVV correctly.

Without Patch:

zifeihan@bananapif3:~/jre/jdk/bin$ ./java -XX:+PrintFlagsFinal -XX:+UseRVV -version | grep UseRVV
OpenJDK 64-Bit Server VM warning: RVV is not supported on this CPU
     bool UseRVV                                   = false                                {ARCH product} {command line}
     bool UseRVVForBigIntegerShiftIntrinsics       = false                                {ARCH product} {default}
openjdk version "23-internal" 2024-09-17
OpenJDK Runtime Environment (build 23-internal-adhoc.zifeihan.jdk)
OpenJDK 64-Bit Server VM (build 23-internal-adhoc.zifeihan.jdk, mixed mode)

With Patch:

zifeihan@bananapif3:~/jre/jdk/bin$ ./java -XX:+PrintFlagsFinal -XX:+UseRVV -version | grep UseRVV
     bool UseRVV                                   = true                                 {ARCH product} {command line}
     bool UseRVVForBigIntegerShiftIntrinsics       = true                                 {ARCH product} {default}
openjdk version "23-internal" 2024-09-17
OpenJDK Runtime Environment (build 23-internal-adhoc.zifeihan.jdk)
OpenJDK 64-Bit Server VM (build 23-internal-adhoc.zifeihan.jdk, mixed mode)

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8333245: RISC-V: UseRVV option can't be enabled after JDK-8316859 (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19472/head:pull/19472
$ git checkout pull/19472

Update a local copy of the PR:
$ git checkout pull/19472
$ git pull https://git.openjdk.org/jdk.git pull/19472/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19472

View PR using the GUI difftool:
$ git pr show -t 19472

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19472.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 30, 2024

👋 Welcome back gcao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 30, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk
Copy link

openjdk bot commented May 30, 2024

@zifeihan The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label May 30, 2024
@zifeihan zifeihan marked this pull request as ready for review May 30, 2024 09:18
@openjdk openjdk bot added the rfr Pull request is ready for review label May 30, 2024
@mlbridge
Copy link

mlbridge bot commented May 30, 2024

Webrevs

@robehn
Copy link
Contributor

robehn commented May 30, 2024

If you don't have RVV you are not guaranteed to have Zicsr, meaning the csrr read of vlenb may crash ?
If you have csr, what will that return in this case? (no V but Zicsr)

You also need kernel support for RVV:

  • You must turn on V from privilege mode, the kernel needs to do this.
  • If you are context switched in the middle of your vector code the kernel must saves all those V registers.

Only kernels with hw_probe that is reporting RVV is guaranteed todo this.

If this is a vanilla 6.1.15 you can't use V AFIAK. If there are out of tree patches on top of this to make V work, they need to add the hw_probe patches also.

So I would suggest something like:

  • If UseRVV and hwcap V = true but no hwprobe.
  • Test if we can csrr in a safe fetch manor.
  • If we can, we try to read vector context status field, VS, to determine if it's on or off. (in a 'safe fetch' manor)
  • If that succeds, we store something in v0, change CPU using affinity mask, and verify that v0 contains that value after the change of CPU.
  • Now you just need to cross fingers that it is v1.0 :) (this will still fail on THEAD)

Or similar as we don't want the VM to crash just because an user added +UseRVV erroneously.

Note in 6.7 there is prctl(PR_RISCV_V_SET_CONTROL, unsigned long arg) to turn on/off V for a thread.

@RealFYang
Copy link
Member

RealFYang commented May 30, 2024

If you don't have RVV you are not guaranteed to have Zicsr, meaning the csrr read of vlenb may crash ? If you have csr, what will that return in this case? (no V but Zicsr)

You also need kernel support for RVV:

  • You must turn on V from privilege mode, the kernel needs to do this.
  • If you are context switched in the middle of your vector code the kernel must saves all those V registers.

Only kernels with hw_probe that is reporting RVV is guaranteed todo this.

If this is a vanilla 6.1.15 you can't use V AFIAK. If there are out of tree patches on top of this to make V work, they need to add the hw_probe patches also.

That makes sense to me. I think you mean kernel versions >= 6.5 where process context switch and hwprobe support for RVV are both added at the same time [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/commit/?h=for-next&id=d5e45e810e0e08114035d31d88049544c038e6fc

So I would suggest something like:

  • If UseRVV and hwcap V = true but no hwprobe.
  • Test if we can csrr in a safe fetch manor.
  • If we can, we try to read vector context status field, VS, to determine if it's on or off. (in a 'safe fetch' manor)
  • If that succeds, we store something in v0, change CPU using affinity mask, and verify that v0 contains that value after the change of CPU.
  • Now you just need to cross fingers that it is v1.0 :) (this will still fail on THEAD)

Or similar as we don't want the VM to crash just because an user added +UseRVV erroneously.

Note in 6.7 there is prctl(PR_RISCV_V_SET_CONTROL, unsigned long arg) to turn on/off V for a thread.

That sounds very tricky given that we have both rvv-0.7.1 and rvv-1.0 hardwares for now. I think it will be safer and simpler for us to rely on the availability of hwprobe syscall. But I guess will won't be a big issue when we do some simple performance evaluations like a simple JMH run, but yes, you have to change the code to force enable UseRVV when running on the older kernels.

@robehn
Copy link
Contributor

robehn commented May 30, 2024

Another suggestion, it seem like you can ge the triplet mvendorid/marchid/mimpid from /proc/cpuinfo.
So if we can grab those from VM_Version::os_uarch_additional_features() when available and no hwprobe.
We can set those 3, and in VM_Version::vendor_features() check if this is BananPie.
With big warning that kernel do not support vector let user run with vector ?

So that way THEAD is unaffected.

From random internet user:

bananapif3:~$ cat /proc/cpuinfo
processor : 0
hart : 0
model name : Spacemit(R) X60
isa : rv64imafdcv_sscofpmf_sstc_svpbmt_zicbom_zicboz_zicbop_zihintpause
mmu : sv39
mvendorid : 0x710
marchid : 0x8000000058000001
mimpid : 0x1000000049772200

@RealFYang
Copy link
Member

With big warning that kernel do not support vector let user run with vector ?

That does not make sense to me. In case of JMH performance evaluation, we can simply revert JDK-8316859.
BTW: I see people are offering feedbacks to the vendor about constraints posed by the older kernels.

@zifeihan
Copy link
Member Author

@robehn @RealFYang : Thanks for the review. I'm so sorry, I think I missed the fact that kernel support for Vector extension is only available since 6.5. I agree that that UseRVV option should only be turned on if kernel support for is guaranteed first, so I think this issue can be closed.

@luhenry
Copy link
Member

luhenry commented Jun 27, 2024

We need to update our check from hwcap with something along the line of:

  vsetvli t0, zero, e8, m1, ta, ma
  csrr a0, vtype
  sgtz a0, a0
  ret

That will return whether v1.0 is supported since vsetvli t0, zero, e8, m1, ta, ma will set the vill flag on RVV 0.7.

@robehn
Copy link
Contributor

robehn commented Jun 27, 2024

We need to update our check from hwcap with something along the line of:

  vsetvli t0, zero, e8, m1, ta, ma
  csrr a0, vtype
  sgtz a0, a0
  ret

That will return whether v1.0 is supported since vsetvli t0, zero, e8, m1, ta, ma will set the vill flag on RVV 0.7.

As we can be context switched after vsetvli and thus execute csrr on another hart, without kernel support, the flag will not be set even if it's RVV 0.7. So this check is not reliable.

Or do 'unkown' flags propagate even if kernel is not aware of them?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

4 participants