-
Notifications
You must be signed in to change notification settings - Fork 30
Fully enable usm_ndarray
in-place arithmetic operators
#1352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- A temporary will be allocated as necessary (i.e., when arrays overlap, are not going to be cast, and are not the same logical arrays) - Uses dedicated in-place kernels where they are implemented - Now called directly by Python operators - Removes _inplace method of BinaryElementwiseFunc class - Removes _find_inplace_dtype function
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1352/index.html |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_13 ran successfully. |
…p is possible - Broadcasting can change the values of strides without changing array shape
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_14 ran successfully. |
Use ExecutionPlacementError for CFD violations. Use ValueError is types of input are as expected, but values are not as expected.
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_24 ran successfully. |
Removed tests expecting error raised in case of overlapping inputs. Added tests guided by coverage report.
d9758ee
to
1d10f86
Compare
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_24 ran successfully. |
Since o1_dtype_kind_num > o2_dtype_kind_num, o1 can be not be weak boolean type, since it has the lowest kind number in the hierarchy.
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_25 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_26 ran successfully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready to go in, with follow-up PRs to add kernels for more in-place operations (divide, bitwise operations, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ndgrigorian . Useful change!
- Removed from test_floor_ceil_trunc, test_hyperbolic, test_trigonometric, and test_logaddexp - These tests would fail on GPU but never run on CPU, and therefore were not impacting the coverage - These tests focused on aspects of the BinaryElementwiseFunc class rather than the behavior of the operator
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_43 ran successfully. |
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞 |
Array API standard conformance tests for dpctl=0.14.6dev3=py310ha25a700_44 ran successfully. |
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_0 ran successfully. |
This PR completes the implementation of in-place arithmetic operators.
With current behavior, in-place operations simply call the standard implementation and then set the original memory equal to the result. This sometimes leads to unnecessary memory allocation.
By reworking the
out
keyword, either input operand array can now be used as the output array, and copies are only required for casting and where race conditions must be considered (i.e., when theout
array overlaps only some of an input).