In-place addition, multiplication, subtraction of usm_ndarrays #1237

ndgrigorian · 2023-06-12T09:10:14Z

Closes #1229
This pull request implements in-place addition, subtraction, and multiplication of usm_ndarray instances.

By avoiding the allocation of an additional output array, the in-place operators are considerably faster than the out-of-place equivalent with a scalar, e.g. dpt.add(X, 2) (roughly 3x in wall time in the example below).

For example:

In [8]: X = dpt.ones((16768, 16768), dtype="f4")

In [9]: %time X += 2
CPU times: user 58.5 ms, sys: 78.1 ms, total: 137 ms
Wall time: 133 ms

In [10]: %time dpt.add(X, 2)
CPU times: user 238 ms, sys: 147 ms, total: 385 ms
Wall time: 392 ms
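
The difference follows from Python's operator protocol: `X += 2` dispatches to `__iadd__`, which may overwrite the existing buffer, while an out-of-place add has to allocate a result. Below is a minimal sketch using a toy list-backed class. `ToyArray` is a hypothetical stand-in and does not reflect how usm_ndarray is actually implemented.

```python
class ToyArray:
    """Hypothetical stand-in for an array type; not dpctl's usm_ndarray."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, scalar):
        # Out-of-place: allocates a new buffer for the result.
        return ToyArray(v + scalar for v in self.data)

    def __iadd__(self, scalar):
        # In-place: overwrites the existing buffer, no new allocation.
        for i in range(len(self.data)):
            self.data[i] += scalar
        return self


x = ToyArray([1.0, 1.0, 1.0])
buf = x.data  # keep a handle on the original buffer
y = x + 2     # new object backed by a new buffer
x += 2        # same object, same buffer
```

After running this, `x.data` is still the same list object as `buf`, whereas `y` owns a separate buffer.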

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
If this PR is a work in progress, are you opening the PR as a draft?

github-actions · 2023-06-12T09:30:20Z

View rendered docs @ https://intelpython.github.io/dpctl/pulls/1237/index.html

coveralls · 2023-06-12T19:08:19Z

coverage: 84.117% (+0.007%) from 84.11% when pulling da5f2f7 on inplace-operator-initial-impl into 43f3b7b on master.

github-actions · 2023-06-12T19:34:36Z

Array API standard conformance tests for dpctl=0.14.3dev3=py310h7bf5fec_36 ran successfully.
Passed: 363
Failed: 637
Skipped: 119

github-actions · 2023-06-12T21:20:06Z

Array API standard conformance tests for dpctl=0.14.3dev3=py310h7bf5fec_38 ran successfully.
Passed: 372
Failed: 628
Skipped: 119

github-actions · 2023-06-12T22:03:09Z

Array API standard conformance tests for dpctl=0.14.3dev3=py310h7bf5fec_39 ran successfully.
Passed: 373
Failed: 627
Skipped: 119

github-actions · 2023-06-13T08:55:26Z

Array API standard conformance tests for dpctl=0.14.3dev3=py310h7bf5fec_42 ran successfully.
Passed: 375
Failed: 625
Skipped: 119

vtavana

Thank you, @ndgrigorian

github-actions · 2023-06-13T17:15:08Z

Array API standard conformance tests for dpctl=0.14.3dev3=py310h7bf5fec_50 ran successfully.
Passed: 386
Failed: 614
Skipped: 119

github-actions · 2023-06-13T20:11:34Z

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

github-actions · 2023-06-13T21:07:05Z

Array API standard conformance tests for dpctl=0.14.3dev4=py310h7bf5fec_10 ran successfully.
Passed: 388
Failed: 612
Skipped: 119

AlexanderKalistratov

Just a few small comments. Feel free to ignore them if you want.

AlexanderKalistratov · 2023-06-23T10:43:07Z

dpctl/tensor/libtensor/include/kernels/elementwise_functions/common_inplace.hpp

+                sycl::vec<argT, vec_sz> arg_vec;
+                sycl::vec<resT, vec_sz> res_vec;
+
+#pragma unroll


Have you tried to compare performance against something simple like this:

    int grid = ndit.get_group_linear_id();
    int llid = ndit.get_local_linear_id();
    int local_size = ndit.get_local_range(0); // is this 1-d task?
    int base = grid * local_size * vec_sz * n_vecs;
    #pragma unroll
    for (int i = 0; i < vec_sz * n_vecs; ++i) {
        size_t k = base + local_size * i + llid;
        if (k < nitems)
            op(lhs[k], rhs[k]);
    }

I'm pretty sure the compiler can place the load and store operations for you; you don't need to do it manually.

You would probably need to split this into two loops (over n_vecs and over vec_sz).
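
To sanity-check the index arithmetic in the suggestion above, here is a small host-side simulation in plain Python. It assumes a 1-D launch with `ceil(nitems / (local_size * vec_sz * n_vecs))` work-groups. That launch configuration and the helper name `visited_indices` are assumptions made for illustration; they are not taken from the PR.

```python
def visited_indices(nitems, local_size, vec_sz, n_vecs):
    """Simulate the suggested indexing and count visits per element."""
    per_group = local_size * vec_sz * n_vecs
    n_groups = (nitems + per_group - 1) // per_group  # ceil division
    visits = [0] * nitems
    for grid in range(n_groups):        # work-group linear id
        base = grid * per_group
        for llid in range(local_size):  # work-item id within the group
            for i in range(vec_sz * n_vecs):
                k = base + local_size * i + llid
                if k < nitems:          # guard for the tail group
                    visits[k] += 1
    return visits


visits = visited_indices(nitems=1000, local_size=16, vec_sz=4, n_vecs=2)
all_once = all(v == 1 for v in visits)
```

Under these assumptions each element is touched exactly once, and for a fixed `i` neighbouring work-items access neighbouring addresses.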

AlexanderKalistratov · 2023-06-23T10:45:18Z

dpctl/tensor/libtensor/include/kernels/elementwise_functions/common_inplace.hpp

+                }
+            }
+            else {
+                for (size_t k = base + sg.get_local_id()[0]; k < nelems_;


This one doesn't feel right.
Are you iterating over all items inside one subgroup?
You are comparing against nelems_, not against vec_sz*n_vecs.

Have you validated this version?

AlexanderKalistratov · 2023-06-23T10:52:38Z

dpctl/tests/elementwise/test_add.py

+    X = dpt.zeros((10, 10), dtype=dtype, sycl_queue=q)
+    dt_kind = X.dtype.kind
+    if dt_kind in "ui":
+        X += int(0)


You really shouldn't test with += 0 and *= 1. Use something that actually modifies the original data, e.g. += 1 and *= 2.

Also, I don't see any checking of the results.
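
A short illustration of the point, in plain Python rather than dpctl: a deliberately broken in-place add that never writes anything still passes a test with the identity operand. All function names here are hypothetical.

```python
def broken_iadd(data, value):
    """Deliberately buggy in-place add: it never writes anything."""
    return data


def correct_iadd(data, value):
    """Reference in-place add on a Python list."""
    for i in range(len(data)):
        data[i] += value
    return data


def passes(iadd, value):
    """Apply iadd to zeros and compare against the expected result."""
    data = [0] * 10
    iadd(data, value)
    return data == [value] * 10


identity_hides_bug = passes(broken_iadd, 0)           # the no-op slips through
nonidentity_catches_bug = not passes(broken_iadd, 1)  # += 1 exposes it
```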

AlexanderKalistratov · 2023-06-23T10:56:39Z

dpctl/tests/elementwise/test_add.py

+
+    sz = 127
+    ar1 = dpt.ones(sz, dtype=op1_dtype)
+    ar2 = dpt.ones_like(ar1, dtype=op2_dtype)


It is much better to use something like arange or random data where possible. Testing on arrays whose elements are all the same can play a bad joke on you.
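
As a sketch of how uniform data can mask a bug, consider a deliberately wrong in-place add that reads its right-hand side in reverse order. This uses plain Python lists and hypothetical helper names, not the dpctl test suite.

```python
def buggy_iadd(lhs, rhs):
    """Deliberately buggy in-place add: reads rhs in reverse order."""
    n = len(lhs)
    for i in range(n):
        lhs[i] += rhs[n - 1 - i]
    return lhs


def check(make_data, sz=127):
    """Return True if buggy_iadd matches the elementwise reference."""
    lhs, rhs = make_data(sz), make_data(sz)
    expected = [a + b for a, b in zip(lhs, rhs)]  # computed before mutation
    return buggy_iadd(lhs, rhs) == expected


ones_pass = check(lambda n: [1] * n)            # bug goes unnoticed
arange_pass = check(lambda n: list(range(n)))   # bug is exposed
```

With all-ones input the indexing error is invisible; with arange-like input the comparison fails.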

AlexanderKalistratov · 2023-06-23T10:59:59Z

dpctl/tests/elementwise/test_add.py

+    v = dpt.arange(5, dtype="i4")
+
+    m += v
+    assert (dpt.asnumpy(m) == np.arange(1, 6, dtype="i4")[np.newaxis, :]).all()


This is a good one 👍

but it tests only one dtype.

ndgrigorian force-pushed the inplace-operator-initial-impl branch from e9a4f30 to cd3f169 Compare June 12, 2023 18:43

ndgrigorian marked this pull request as ready for review June 12, 2023 20:05

ndgrigorian requested review from npolina4 and vtavana June 12, 2023 20:05

ndgrigorian changed the title ~~In-place addition of usm_ndarrays~~ In-place addition, multiplication, subtraction of usm_ndarrays Jun 12, 2023

vtavana reviewed Jun 12, 2023

ndgrigorian requested a review from vtavana June 13, 2023 08:37

oleksandr-pavlyk reviewed Jun 13, 2023

ndgrigorian added 9 commits June 13, 2023 09:18

Implements in-place addition

a6040cc

Adds tests for inplace addition

9f942e0

In-place operators now enabled for an array with itself

ccf66de

- This change fixes some failing tests - Added additional tests for in-place addition

Elementwise functions now check writable flag of destination

034ab01

Implements in-place multiplication and subtraction

cb87b68

Added tests for in-place multiplication, subtraction

0f9c857

- Adjusted tests for in-place addition to improve coverage

Changed logic for in-place arithmetic operations on usm_ndarrays with…

e07d5f0

… themselves - Now makes a copy and adds the copy to the original array

In-place operations enabled in standard binary operators

cf4049a

- functionality such as binop(x, y, out=x) now possible, with some edge cases still WIP

Corrected typdef typos to typedef

da5f2f7

ndgrigorian force-pushed the inplace-operator-initial-impl branch from 2a98e1d to da5f2f7 Compare June 13, 2023 16:18

vtavana approved these changes Jun 13, 2023

ndgrigorian merged commit 81553f8 into master Jun 13, 2023

AlexanderKalistratov reviewed Jun 23, 2023

ndgrigorian deleted the inplace-operator-initial-impl branch August 8, 2023 07:55
