Fixes for LoongArch LSX + fast math #1369
__lsx_vftintrz_w_d accepts two __m128d arguments, so it
should be called with zero_f64, which is declared, rather than
zero_i64, which is not.
This fixes the following compilation error that I get when
compiling current simde master for loongarch64-linux-gnu
with gcc 14.3.1 and `-Ofast -mlsx -mlasx` in CFLAGS:
../test/x86/avx512/../../../simde/x86/sse2.h: In function ‘simde__m128i simde_mm_cvttpd_epi32(simde__m128d)’:
../test/x86/avx512/../../../simde/x86/sse2.h:3736:39: error: ‘zero_i64’ was not declared in this scope; did you mean ‘zero_f64’?
3736 | r_.lsx_i64 = __lsx_vftintrz_w_d(zero_i64, simde__m128d_to_private(a).lsx_f64);
| ^~~~~~~~
| zero_f64
Signed-off-by: Ivan A. Melnikov <iv@altlinux.org>
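To illustrate why a zero vector of doubles is the natural first argument here, the following is a minimal scalar sketch, not SIMDe or LSX code. All `demo_*` names are hypothetical, and the lane layout (second operand fills the low two result lanes, first operand the high two) is an assumption about how the two-operand narrowing conversion behaves. Inputs are assumed to be in the `int32_t` range.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for 128-bit vectors; not the SIMDe types. */
typedef struct { double  f64[2]; } demo_f64x2;
typedef struct { int32_t i32[4]; } demo_i32x4;

/* Models a truncating, narrowing convert that takes TWO double vectors.
 * Assumed layout: `lo` fills result lanes 0..1, `hi` fills lanes 2..3.
 * Inputs are assumed to fit in int32_t (out-of-range casts are undefined in C). */
static demo_i32x4 demo_narrow_trunc(demo_f64x2 hi, demo_f64x2 lo) {
  demo_i32x4 r;
  for (int i = 0; i < 2; i++) {
    r.i32[i]     = (int32_t) lo.f64[i];  /* truncates toward zero */
    r.i32[i + 2] = (int32_t) hi.f64[i];
  }
  return r;
}

/* Models the shape of the fixed call: both operands are double vectors,
 * and the zero vector supplies the two upper result lanes. */
static demo_i32x4 demo_cvttpd_epi32(demo_f64x2 a) {
  const demo_f64x2 zero_f64 = { { 0.0, 0.0 } };
  return demo_narrow_trunc(zero_f64, a);
}
```

Because both operands have the same double-vector type, an integer zero vector would not fit the signature even if one were declared.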
Similarly to what other architectures do, the result of __lsx_vftintrz_w_s should be used directly when both SIMDE_FAST_CONVERSION_RANGE and SIMDE_FAST_NANS are defined, not just stored to a temporary and lost.
Signed-off-by: Ivan A. Melnikov <iv@altlinux.org>
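The control flow described in this commit message can be sketched in portable scalar C. This is only an illustration under stated assumptions: the `SKETCH_*` macros and `sketch_*` names are hypothetical stand-ins for the SIMDe ones, and the careful path returns `INT32_MIN` for NaN and out-of-range lanes to mimic the x86 truncating conversion.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar stand-ins for 128-bit vectors; not the SIMDe types. */
typedef struct { float   f32[4]; } sketch_f32x4;
typedef struct { int32_t i32[4]; } sketch_i32x4;

static sketch_i32x4 sketch_cvttps_epi32(sketch_f32x4 a) {
  sketch_i32x4 r;
#if defined(SKETCH_FAST_CONVERSION_RANGE) && defined(SKETCH_FAST_NANS)
  /* Fast path: the caller promises in-range, non-NaN inputs, so the
   * converted values are written straight into the result `r`.
   * Writing them to a scratch variable that is never copied into `r`
   * would be the kind of bug the commit message describes. */
  for (int i = 0; i < 4; i++) {
    r.i32[i] = (int32_t) a.f32[i];
  }
#else
  /* Careful path: handle NaN and out-of-range lanes explicitly. */
  for (int i = 0; i < 4; i++) {
    float v = a.f32[i];
    if (isnan(v) || v >= 2147483648.0f || v < -2147483648.0f) {
      r.i32[i] = INT32_MIN;
    } else {
      r.i32[i] = (int32_t) v;  /* truncates toward zero */
    }
  }
#endif
  return r;
}
```

With neither hypothetical macro defined, the careful path is compiled; defining both selects the fast path, which is only valid for inputs that already satisfy the promised constraints.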
@HecaiYuan, @mr-c, please take a look.
mr-c left a comment:
Thank you for the PR, @iv-m; please address the issues at https://github.com/simd-everywhere/simde/actions/runs/21027953168/job/60456937165?pr=1369
Well, I just submitted fixes for two of the most obvious issues I found. Other issues require more investigation. For example, the error from the CI: It looks like a GCC bug to me: I'm not sure I'm qualified enough or have the time to address all the issues there right now. I can probably disable the preprocessor branches for the optimizations that break CI -- that's more or less what I did for my local simde copy -- and hope that the LoongArch community (me included) will eventually implement them back. If this is the way to go, what is the preferred convention for that? Something along the lines of
Please file issues with GCC; then we can add an entry near Line 1058 in 613c365
SIMDE_BUG_GCC_NNNN only for circumstances where it is active.
Then we can add Line 3314 in 613c365
You can also experiment with adding a cast using
Fix a couple of errors I found when testing current simde master on my loongarch64 machine with GCC 14.3.1 and
`-Ofast -mlsx -mlasx` in CFLAGS and CXXFLAGS.