-
Notifications
You must be signed in to change notification settings - Fork 314
draft: arch: Add Hexagon HVX instructions #1999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
r? @folkertdev rustbot has assigned @folkertdev. Use |
|
cc @folkertdev I'm still working out how to get the CI to execute this, I should be able to make it work with |
|
Yeah I figured that would be a bit of a challenge, so more efficient to do that in parallel. |
folkertdev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General note: this looks largely machine-generated? We already have stdarch-gen-loongarch and stdarch-gen-arm, so it's possible to check that in.
Ok, will add the generation script here too. |
crates/core_arch/src/hexagon/hvx.rs
Outdated
| #[inline] | ||
| #[cfg_attr(target_arch = "hexagon", target_feature(enable = "hvxv60"))] | ||
| #[unstable(feature = "stdarch_hexagon", issue = "none")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For most targets we like adding a test that the intrinsic does in fact produce the expected instruction, e.g.
// avx512bw
#[cfg_attr(test, assert_instr(vpabsw))]
// wasm
#[cfg_attr(test, assert_instr(v128.load8x8_s))]
especially if you are machine-generating these that should hopefully be straightforward.
crates/core_arch/src/hexagon/hvx.rs
Outdated
| #[cfg_attr(target_arch = "hexagon", target_feature(enable = "hvxv60"))] | ||
| #[unstable(feature = "stdarch_hexagon", issue = "none")] | ||
| pub unsafe fn q6_v_vxor_vv(vu: HvxVector, vv: HvxVector) -> HvxVector { | ||
| vxor(vu, vv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a bit hard to deduce from the names, but I'm assuming this corresponds to simd_xor
https://doc.rust-lang.org/nightly/std/intrinsics/simd/fn.simd_xor.html
We generally prefer using functions from crate::intrinsics::simd (starting with crate is deliberate, using core:: can cause trouble when bootstrapping). It's much easier for us to then know what a function does, and as a bonus Miri will be able to run the intrinsics too.
When using intrinsics::simd it is even more important to check that these functions emit the right instruction. Sometimes additional optimizations are needed in LLVM to make that work.
I think it's fine to do this gradually, but for some of the simpler operations it's probably possible to do it from the start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for some of the simpler operations it's probably possible to do it from the start.
Sure - can do!
Ah - okay, I see that these are different from what I've implemented. I have a python script that takes a C/C++ toolchain header file as a source-of-truth. So @folkertdev should I implement that instead as a crate that parses a yaml file? |
|
The input is up to you, We would prefer having rust over python though, hopefully that is straightforward to port? |
It is, yeah.
Sorry, not sure if I follow. Add comments in the C header file used as an input? Or ... ? |
|
Taking Maybe this was unclear, but currently the workflow is that we don't manually update the generated rust code, but instead modify the inputs to the generator and re-run it. That is convenient when new instructions are added, and it is usually simpler to edit the input than the output. Anyhow, none of that is official, that's just what we have. So if there are good reasons to deviate from it that is something we can discuss. |
Ah okay I gotcha now. So my plan was to just include a hardcoded list in the script/rust program of the intrinsics that do map to the architecture-independent ones and the exceptions get lowered to the architecture-specific ones. |
|
Cool, that should work |
|
doing a quick close/open here to see what CI says now that the features are available |
|
hmm is there some issue with hexagon in |
This I have been noodling over the design of this PR and have several local changes. None of them were intended to address the |
|
Weird. We're using latest nightly here, maybe something was not synchronized? |
Yeah, it's not obvious to me yet. This was fixed a while ago (rust-lang/compiler-builtins#682). I'll need to spend some time trying to reproduce it and maybe something just needs to update an explicit compiler-builtins dependency to a newer release. |
|
Oh, yes, it's building |
|
Note that we don't use crates.io c-b anymore, it should only ever be coming via the submodule which is actively updated. Think this will still technically show as 0.1.160 though. |
folkertdev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any luck with figuring out why the build fails?
Replace print statements with assertions that verify the Gaussian 3x3 blur implementation against the Hexagon SDK reference algorithm. - Port exact SDK Gaussian3x3u8 implementation from: /opt/Hexagon_SDK/.../Examples/HVX/gaussian/src/gaussian.c - Verify specific output values [15, 16, 17, 18, 19, 20, 21, 22] for row 2, cols 1..9 with test pattern ((x + y*7) % 256) - Assert byte-averaging approximation exactly matches SDK reference - On Hexagon: verify HVX output matches both scalar approximation and SDK reference exactly
- Move regex compilations outside loops - Use Option::map and or_else instead of manual if-let chains - Use strip_prefix instead of manual starts_with + slice - Use !is_empty() instead of len() >= 1 - Combine consecutive str::replace calls
Updates cc crate to 1.2.55 which fixes macabi target triple handling for x86_64-apple-ios-macabi builds.
examples/gaussian.rs
Outdated
| //! qemu-hexagon -L <sysroot>/target/hexagon-unknown-linux-musl \ | ||
| //! target/hexagon-unknown-linux-musl/debug/gaussian | ||
|
|
||
| #![cfg_attr(target_arch = "hexagon", feature(stdarch_hexagon))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could use
#![target_arch = "hexagon"]
so that it applies to the module as a whole. Is there any particular reason not to do that, it seems like it would clean up a bunch of the conditionals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While making this update I remembered that there's an outstanding compiler issue that I should fix so that the example can be at parity with the intrinsics in the C/C++ reference code.
I'll make a rustc PR for that one and revisit this after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened rust-lang/rust#152552
I will put the HvxVectorPairs back into the gaussian.rs when that lands.
Add `#![cfg(target_arch = "hexagon")]` - Remove redundant #[cfg(target_arch = "hexagon")] from functions - Simplify import and constant cfg conditions - Remove non-Hexagon test code branch from main()
Note: this depends on #151500
Tracking issue #151523