Implements dpctl.tensor.any and dpctl.tensor.all #1204

Merged: 9 commits, May 16, 2023

@ndgrigorian (Collaborator) commented May 9, 2023

This pull request implements dpctl.tensor.all and dpctl.tensor.any.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you opening the PR as a draft?

@ndgrigorian force-pushed the boolean-reductions branch from 3edc280 to 9d05a14 May 9, 2023 03:16
@coveralls (Collaborator) commented May 9, 2023

Coverage: 83.41% (+0.09%) from 83.322% when pulling d1d67b9 on boolean-reductions into 1320d39 on master.

@github-actions bot commented May 9, 2023

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_50 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

@oleksandr-pavlyk self-requested a review May 9, 2023 21:43
- Tests refactored into more generic tests parametrized by function and identity
- Randrange used to make tests more robust
- Tests now cover branch in kernel for wide vs. skinny arrays
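
A parametrization by function and identity, as described in the commit notes above, might look roughly like the following sketch. It uses Python's built-in `all` and `any` as stand-ins for `dpctl.tensor.all` and `dpctl.tensor.any` so that it runs without a SYCL device. The helper name `check_reduction` and the list-based data are assumptions made for illustration and are not the PR's actual test code.

```python
import random

# (function, identity) pairs: the identity is the result of reducing an
# empty input, True for all and False for any.
CASES = [(all, True), (any, False)]


def check_reduction(func, identity, n=64):
    """Hypothetical check, with built-ins standing in for dpt.all / dpt.any."""
    # Reducing an empty sequence yields the identity.
    assert func([]) is identity

    # A sequence filled with the identity reduces to the identity.
    data = [identity] * n
    assert func(data) is identity

    # Flipping one element at a random position flips the result.
    data[random.randrange(n)] = not identity
    assert func(data) is (not identity)
    return True


results = [check_reduction(func, identity) for func, identity in CASES]
```

Choosing the flipped position with `random.randrange` means repeated runs exercise different positions in the input instead of a single fixed one.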
@github-actions

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_52 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

@github-actions

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_55 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

@oleksandr-pavlyk
Contributor

This PR requires docstrings

- Now initialized by a single function call
- Moved boolean reduction template into header
@github-actions

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_56 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

@github-actions

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_57 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

This case now circumvents the call to permute_dims completely
Tests were updated to reflect this change and cover both branches
Also added a test for the axis=() case
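
One way to picture the branching described in the commit notes above is the sketch below. It assumes that the case bypassing `permute_dims` is `axis=None`, and that `axis=()` performs no reduction at all. The helper `plan_reduction` is hypothetical: it only computes the axis permutation that would move the reduced axes to the end, and it is not dpctl's implementation.

```python
def plan_reduction(ndim, axis):
    """Return (perm, n_reduced) for a reduction over `axis` of an ndim-array.

    Hypothetical helper: perm is None when no call to permute_dims is needed.
    """
    if axis is None:
        # Reduce over every axis; the array can be used as-is.
        return None, ndim
    if isinstance(axis, int):
        axis = (axis,)
    # Normalize negative axes and reject out-of-range or repeated ones.
    normalized = []
    for ax in axis:
        if not -ndim <= ax < ndim:
            raise ValueError(f"axis {ax} is out of bounds for {ndim} dimensions")
        ax %= ndim
        if ax in normalized:
            raise ValueError("repeated axis")
        normalized.append(ax)
    if not normalized:
        # axis=(): nothing to reduce (assumed to be an element-wise bool cast).
        return None, 0
    # Move the reduced axes to the end, keeping the others in order.
    kept = [i for i in range(ndim) if i not in normalized]
    return kept + normalized, len(normalized)


print(plan_reduction(3, None))       # (None, 3)
print(plan_reduction(3, ()))         # (None, 0)
print(plan_reduction(3, (0, 1, 2)))  # ([0, 1, 2], 3)
print(plan_reduction(3, 1))          # ([0, 2, 1], 1)
```

In this model, `axis=None` and `axis=(0, 1, 2)` reduce the same elements of a 3-D array, but only the tuple form builds a permutation.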
@github-actions

Array API standard conformance tests for dpctl=0.14.3dev1=py310h76be34b_62 ran successfully.
Passed: 157
Failed: 843
Skipped: 116

@oleksandr-pavlyk
Contributor

I think this PR is ready. The last change helped, as shown by the performance gap between axis=None and axis=(0,1,2,) in the timings below:

In [1]: import dpctl.tensor as dpt

In [2]: dpt.asarray([float('inf'),]*30 + [float('nan'),] * 12 + [-float('inf'), ] * 18, device="cpu").shape
Out[2]: (60,)

In [3]: x = dpt.reshape(dpt.asarray([float('inf'),]*30 + [float('nan'),] * 12 + [-float('inf'), ] * 18, device="cpu"), (5,4,3))

In [4]: x.flags
Out[4]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  WRITABLE : True

In [5]: dpt.all(x)
Out[5]: usm_ndarray(True)

In [6]: %timeit dpt.all(x)
235 µs ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [7]: %timeit dpt.all(x, axis=(0,1,2))
246 µs ± 27.5 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [8]: %timeit dpt.all(x, axis=None)
254 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [9]: %timeit dpt.all(x, axis=(0,1,2))
248 µs ± 31.5 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [10]: %timeit dpt.all(x, axis=None)
205 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [11]: %timeit dpt.all(x, axis=(0,1,2))
251 µs ± 17 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
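
The `True` result in `Out[5]` follows from treating every nonzero value as true, and infinities and NaN are nonzero. The array API standard specifies this for `all` and `any`. The sketch below reproduces that truthiness rule with Python's built-in `all` and `any` on a plain list; it illustrates the semantics only and does not call dpctl.

```python
# Same 60 values as in the session above, as a plain Python list.
values = [float("inf")] * 30 + [float("nan")] * 12 + [-float("inf")] * 18

# NaN and +/-inf are nonzero, so each converts to True.
truthy = [bool(v) for v in values]

result_all = all(values)                 # True: no element is zero
result_any = any([0.0, float("nan")])    # True: NaN counts as nonzero
result_zero = all(values + [0.0])        # False: a single zero breaks all()
```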

@oleksandr-pavlyk marked this pull request as ready for review May 16, 2023 13:11
@ndgrigorian merged commit 634348b into master May 16, 2023
@github-actions

Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

@ndgrigorian deleted the boolean-reductions branch July 27, 2023 01:35