Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix test for float32_cvt_i32 #3178

Merged
merged 5 commits into from
Oct 29, 2024

Conversation

gretay-js
Copy link
Contributor

One of the C stubs in flambda-backend/tests/simd/ uses incompatible types. There is a macro that calls _mm_extract_ps which expects __m128 argument, but one of the uses of the macro is with the result of _mm_cvtps_epi32 which is of type __m128i. This issue was originally reported by @bclement-ocp.

This PR attempts to fix it:

  • Add another macro that uses _mm_extract_epi32 instead of _mm_extract_ps for _mm_cvtps_epi32.
  • Add standard to c_flags in this test directory, to get the warning about "incompatible type" in the CI.

@gretay-js gretay-js added the simd SIMD support label Oct 23, 2024
@gretay-js gretay-js requested a review from TheNumbat October 23, 2024 16:21
Copy link
Contributor

@TheNumbat TheNumbat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix looks fine, but need to figure out what's changed in the macOS CI

@xclerc
Copy link
Contributor

xclerc commented Oct 24, 2024

(Given the timing, I wonder whether it could
simply be an upgrade to macOS Sequoia.)

@gretay-js gretay-js force-pushed the fix_simd_test_float32_cvt_i32 branch from 61e4123 to a34e7e4 Compare October 24, 2024 10:01
@gretay-js gretay-js force-pushed the fix_simd_test_float32_cvt_i32 branch from a34e7e4 to 6916882 Compare October 28, 2024 14:47
@gretay-js
Copy link
Contributor Author

Fix looks fine, but need to figure out what's changed in the macOS CI

The problem was with operations on NaNs. The problem was exposed by turning optimizations on: including :standard in flags implies -O2. For example, add of two NaNs is not commutative. An add operation on NaNs returns a NaN whose payload is one of the payloads of its inputs, according to IEEE 745, as far as I understand it. The tests we have are sensitve to NaN representation, and building the stubs.c with optimizations produces a slightly different result on MacOS. My proposed fix is to avoid NaNs in random tests (see 6916882). An alternative fix would be to consider equivalence up to NaNs as is already implemented in eqf'. @TheNumbat let me know which one you prefer.

(Given the timing, I wonder whether it could simply be an upgrade to macOS Sequoia.)

Looks like macos-15 on x86_64 is only available in a large image, which requires a subscription. I'll make a PR to upgrade to a small macos-15 on arm64 (some tests are failing) and we can discuss the trade-offs.

@TheNumbat
Copy link
Contributor

I see, thanks. I would prefer to continue testing that the nan behavior is consistent with the C stubs, since our OCaml intrinsics don't depend on compiler flags - do you know what C operation(s) have a different result with -O2? We don't enable fast-math so the change is a little surprising.

@gretay-js
Copy link
Contributor Author

There were a few differences. At least add and cmp unordered/ordered had different results. Some differences are due to producing a signalling NaN from in the random tests, and then operations turn it into quiet NaN or not, depending on optimization levels.

@TheNumbat
Copy link
Contributor

TheNumbat commented Oct 28, 2024

I think we determined that the difference comes from clang flipping the src/dst register order or using a different instruction with -O2. I had assumed the C intrinsics guaranteed translation to a specific instruction/reg order, but apparently not, so we indeed can't rely on consistent NaN patterns from the C stubs. Let's change the tests to use eqf' so we still test NaNs but treat all NaNs as equivalent.

@gretay-js gretay-js merged commit abbb251 into ocaml-flambda:main Oct 29, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
simd SIMD support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants