Skip to content

test_serialize_numba: Workaround issue with np.empty_like in NumPy 1.23 #7089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 30, 2022

Conversation

gmarkall
Copy link
Contributor

In NumPy 1.23, the strides of empty arrays are 0 instead of the item size, due to numpy/numpy#21477 - however, np.empty_like seems to create non-zero-strided arrays from a zero-strided empty array, and copying to the host from a device array with zero strides fails a compatibility check in Numba.

This PR works around the issue by calling copy_to_host() with no arguments, allowing Numba to create an array on the host that is compatible with the device array - the resulting implementation is functionally equivalent and slightly simpler, so I believe this change could remain permanant rather than requiring a revert later.

  • Tests added / passed
  • Passes pre-commit run --all-files

In NumPy 1.23, the strides of empty arrays are 0 instead of the item
size, due to numpy/numpy#21477 - however,
`np.empty_like` seems to create non-zero-strided arrays from a
zero-strided empty array, and copying to the host from a device array
with zero strides fails a compatibility check in Numba.

This commit works around the issue by calling `copy_to_host()` with no
arguments, allowing Numba to create an array on the host that is
compatible with the device array - the resulting implementation is
functionally equivalent and slightly simpler, so I believe this change
could remain permanant rather than requiring a revert later.
@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@pentschev
Copy link
Member

add to allowlist

@pentschev
Copy link
Member

ok to test

@quasiben
Copy link
Member

Just to confirm, we expect this to also work with NumPy 1.22.X as well, correct ?

@gmarkall
Copy link
Contributor Author

Just to confirm, we expect this to also work with NumPy 1.22.X as well, correct ?

Yes.

@quasiben quasiben mentioned this pull request Sep 29, 2022
5 tasks
@gmarkall
Copy link
Contributor Author

Presumably the fails here are unrelated to this PR?

FAILED distributed/deploy/tests/test_local.py::test_adapt_then_manual - asser...
FAILED distributed/tests/test_steal.py::test_balance_multiple_to_replica[True]
FAILED distributed/tests/test_steal.py::test_balance_multiple_to_replica[False]

@quasiben quasiben mentioned this pull request Sep 29, 2022
@quasiben
Copy link
Member

test_adapt_then_manual was just recently filed as being flaky #7079 . I don't see anything recent regarding test_balance_multiple_to_replica. It's unrelated to this PR but I want to be extra cautious. Should things pass I'll plan to merge. Thanks @gmarkall for the immediate attention

@github-actions
Copy link
Contributor

github-actions bot commented Sep 29, 2022

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       15 files  ±0         15 suites  ±0   6h 29m 8s ⏱️ - 5m 20s
  3 144 tests ±0    3 058 ✔️ +5    86 💤 +1  0  - 6 
23 271 runs  ±0  22 365 ✔️ +7  906 💤 ±0  0  - 7 

Results for commit 84d776a. ± Comparison against base commit ee43309.

♻️ This comment has been updated with latest results.

@quasiben
Copy link
Member

I think this is ok to merge in but I will wait for until morning to do so. In the mean time I filed to #7092 to track the unrelated test failure we see here: test_drop_with_waiter .

@quasiben
Copy link
Member

All tests are passing now. Merging in.

@gmarkall
Copy link
Contributor Author

Related Numba issue: numba/numba#8477

gmarkall added a commit to gmarkall/numba that referenced this pull request Oct 3, 2022
Since NumPy 1.23, it is possible for zero-length ndarrays to have
different strides (e.g. `(0,)`, and `(8,)`). If we attempt to copy from
a zero-length device array to a zero-length host array where the strides
differ, our compatibility check fails because it compares strides.

This commit fixes the issue by only considering strides when checking
compatibility of nonzero-length arrays. I believe this to be valid
because the following works normally with NumPy 1.23:

```python
import numpy as np

ary1 = np.arange(0)
ary2 = np.ndarray((0,), buffer=ary1.data)
ary3 = np.empty_like(ary2)

ary3[:] = ary2
ary3[...] = ary2
np.copyto(ary2, ary3)
```

i.e. copying zero-length arrays with different strides generally works
as expected.

The included test is written in such a way that it should test this
change in behaviour regardless of the installed NumPy version - we
explicitly construct zero-length device and host arrays with differing
strides. The additional sanity check ensures that the host array has the
strides we expect, just in case there is some version of NumPy in which
setting the strides explicitly didn't result in the expected strides - I
have observed that requesting nonzero strides for a zero length array
can result still in zero strides (a separate but related behaviour), so
this sanity check is provided to account for any other unexpected
behaviour of this nature. I have tested locally with NumPy 1.22 and 1.23
(pre- and post-changes to strides).

See also dask/distributed#7089 where a
workaround for an observation of this issue was needed. This would not
be needed with the fix in this commit.
gjoseph92 pushed a commit to gjoseph92/distributed that referenced this pull request Oct 31, 2022
…ask#7089)

In NumPy 1.23, the strides of empty arrays are 0 instead of the item
size, due to numpy/numpy#21477 - however,
`np.empty_like` seems to create non-zero-strided arrays from a
zero-strided empty array, and copying to the host from a device array
with zero strides fails a compatibility check in Numba.

This commit works around the issue by calling `copy_to_host()` with no
arguments, allowing Numba to create an array on the host that is
compatible with the device array - the resulting implementation is
functionally equivalent and slightly simpler, so I believe this change
could remain permanant rather than requiring a revert later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants