[BUG]: Numpy test failure on ppc64le architecture #3710

Closed
2 of 3 tasks
susilehtola opened this issue Feb 9, 2022 · 17 comments


@susilehtola

Required prerequisites

Problem description

After applying #3682, pybind 2.9.1 successfully builds with numpy 1.22.0 on several Fedora architectures.
https://koji.fedoraproject.org/koji/taskinfo?taskID=82606547

The tests fail, however, on ppc64le:

=================================== FAILURES ===================================
________________________________ test_recarray _________________________________
simple_dtype = dtype({'names': ['bool_', 'uint_', 'float_', 'ldbl_'], 'formats': ['?', '<u4', '<f4', '<f16'], 'offsets': [0, 4, 8, 16], 'itemsize': 32})
packed_dtype = dtype([('bool_', '?'), ('uint_', '<u4'), ('float_', '<f4'), ('ldbl_', '<f16')])
    def test_recarray(simple_dtype, packed_dtype):
        elements = [(False, 0, 0.0, -0.0), (True, 1, 1.5, -2.5), (False, 2, 3.0, -5.0)]
    
        for func, dtype in [
            (m.create_rec_simple, simple_dtype),
            (m.create_rec_packed, packed_dtype),
        ]:
            arr = func(0)
            assert arr.dtype == dtype
            assert_equal(arr, [], simple_dtype)
            assert_equal(arr, [], packed_dtype)
    
            arr = func(3)
            assert arr.dtype == dtype
            assert_equal(arr, elements, simple_dtype)
            assert_equal(arr, elements, packed_dtype)
    
            # Show what recarray's look like in NumPy.
            assert type(arr[0]) == np.void
            assert type(arr[0].item()) == tuple
    
            if dtype == simple_dtype:
>               assert m.print_rec_simple(arr) == [
                    "s:0,0,0,-0",
                    "s:1,1,1.5,-2.5",
                    "s:0,2,3,-5",
                ]
E               AssertionError: assert ['s:0,0,0,6.9...6.95329e-310'] == ['s:0,0,0,-0'... 's:0,2,3,-5']
E                 At index 0 diff: 's:0,0,0,6.95329e-310' != 's:0,0,0,-0'
E                 Use -v to get the full diff
../../tests/test_numpy_dtypes.py:205: AssertionError
=============================== warnings summary ===============================

Reproducible example code

No response

@susilehtola susilehtola added the triage New bug, unverified label Feb 9, 2022
@Skylion007
Collaborator

@henryiii @rwgk Hurray for floating point comparisons... ;-;

@Skylion007 Skylion007 added bug compiler issue help wanted and removed triage New bug, unverified labels Feb 10, 2022
@rwgk
Collaborator

rwgk commented Feb 10, 2022

Why is there a -0.0 in the input?

elements = [(False, 0, 0.0, -0.0), (True, 1, 1.5, -2.5), (False, 2, 3.0, -5.0)]

That is practically asking for trouble.
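For context on why a `-0.0` input is fragile in a string-based check, here is a minimal sketch in plain Python. The `render` helper is hypothetical and only imitates the `s:...` output format shown in the failing test; it is not pybind11's `print_rec_simple`.

```python
import math

neg_zero = -0.0
pos_zero = 0.0

# Numerically, the two zeros compare equal.
zeros_equal = (neg_zero == pos_zero)

# The sign bit is still stored, and copysign exposes it.
sign_of_neg_zero = math.copysign(1.0, neg_zero)

def render(row):
    # Hypothetical stand-in for a "print record" helper: formats each
    # field with %g and joins them, similar to the strings in the test.
    return "s:" + ",".join("%g" % float(x) for x in row)

# The rendered text distinguishes -0 from 0 even though the values are equal.
rendered_neg = render((False, 0, 0.0, -0.0))
rendered_pos = render((False, 0, 0.0, 0.0))
print(zeros_equal, sign_of_neg_zero, rendered_neg, rendered_pos)
```

A test that compares rendered strings therefore depends on the sign bit surviving every conversion step, which a numeric comparison would not notice.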

@henryiii
Collaborator

Can't we make these float arrays and then compare with pytest.approx(arr)?

@rwgk
Collaborator

rwgk commented Feb 10, 2022

> Can't we make these float arrays and then compare with pytest.approx(arr)?

Is it exercising m.print_rec_simple or just using it because it seemed convenient?
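As a rough illustration of what a tolerance-based comparison would and would not catch here, the sketch below uses the standard library's `math.isclose` as a stand-in for `pytest.approx`. The tolerances are chosen to resemble `pytest.approx` defaults (relative 1e-6, absolute 1e-12), and the value `6.95329e-310` is taken from the failure output above; everything else is an assumption for illustration.

```python
import math

def close(a, b):
    # Stand-in for an approximate comparison; not pytest.approx itself.
    return math.isclose(a, b, rel_tol=1e-6, abs_tol=1e-12)

expected = [-0.0, -2.5, -5.0]

# Small rounding differences pass, which is the intended benefit.
slightly_off = [-0.0, -2.5000001, -5.0]
tolerant_ok = all(close(a, b) for a, b in zip(slightly_off, expected))

# The garbage value from the failing run is tiny, so an absolute
# tolerance treats it as equal to zero and would hide the problem...
garbage = 6.95329e-310
masks_zero_field = close(garbage, -0.0)

# ...while the same garbage in a nonzero field is still detected.
catches_nonzero_field = not close(garbage, -2.5)

print(tolerant_ok, masks_zero_field, catches_nonzero_field)
```

Under these assumptions, an approximate comparison would have let the first record pass and failed only on the later ones, so the exact string check is the stricter of the two.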

@henryiii
Collaborator

> That is practically asking for trouble.

Isn’t this what tests are for? :)

@charlesbeattie

Do we know why the behaviour changed?

@henryiii
Collaborator

Is this a change in NumPy on that arch? (in other words, does 1.21 produce the old behavior? Do we know?)

@susilehtola
Author

susilehtola commented Feb 10, 2022 via email

@rwgk
Collaborator

rwgk commented Feb 11, 2022

(sorry for posting this in the wrong window before)

The -0 was introduced in this commit:

f7f5bc8

@jagerman Do you remember if there was a special reason for the -? Would there be anything lost if we changed it to e.g. -1?

@henryiii
Collaborator

FYI, NumPy is very broken on ppc64le:

$ docker run --platform linux/ppc64le --rm -it continuumio/miniconda3:latest bash
# conda install numpy
# python
>>> import numpy as np
>>> np.array([1., 2., 3., 1.00001]).astype(np.float32)
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 4.5436204e+36],
      dtype=float32)

No pybind11 required. ;)

numpy/numpy#21062

@rwgk
Collaborator

rwgk commented Feb 16, 2022

Whoa! OMG. Close this?

@Skylion007
Collaborator

Ping @susilehtola

@henryiii
Collaborator

See numpy/numpy#20964 for more info.

@QuLogic
Contributor

QuLogic commented Feb 19, 2022

Those 'NumPy' bugs are QEMU bugs. Fedora builds on real ppc64le hardware, and your sample outputs array([1. , 2. , 3. , 1.00001], dtype=float32) as expected there.

Rawhide/F36 is on GCC 12 though, so it's still possibly a compiler bug.

@QuLogic
Contributor

QuLogic commented Feb 19, 2022

Similar bugs in Cython and cffi on ppc64le. It possibly has to do with the switch to IEEE long double; is pybind11 doing anything special to handle that or is it likely a NumPy issue?

Note, none of NumPy's tests fail on ppc64le.
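One way to look at the `6.95329e-310` value from the failure output is to decode its bit pattern. The sketch below is an interpretation that nobody in this thread states: it only shows that the value is a subnormal double with an all-zero exponent field, which is more typical of unrelated bytes being read as a double than of a rounding difference.

```python
import struct
import sys

observed = 6.95329e-310  # value printed in the failing assertion

# Subnormal doubles lie strictly between 0 and the smallest normal double.
is_subnormal = 0.0 < observed < sys.float_info.min

# Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
bits = struct.unpack("<Q", struct.pack("<d", observed))[0]

# Sign and exponent occupy the top 12 bits; both are zero for this value.
top_bits = bits >> 52
high_word = bits >> 32

print(is_subnormal, hex(bits), hex(high_word))
```

The high 32-bit word comes out as `0x7fff`, which resembles the upper half of a 64-bit user-space address on many Linux systems. That is only a heuristic, but it is consistent with the suggestion that the long double field was being read with the wrong layout.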

@QuLogic
Contributor

QuLogic commented Mar 21, 2022

@susilehtola there were some fixes in gcc that fixed Cython, and partially cffi, so could you try rebuilding pybind11?

@susilehtola
Author

> @susilehtola there were some fixes in gcc that fixed Cython, and partially cffi, so could you try rebuilding pybind11?

The package indeed built now without problems on all architectures. Thanks!


6 participants