Add support of missing arguments in dpnp.count_nonzero
#1615
This PR fully removes the fallback on NumPy in the `dpnp.count_nonzero` function, which requires implementing support for the remaining arguments. It also changes how the result is computed: a kernel is submitted to the device, rather than copying the input data to shared USM memory and performing the computation on the host.

The implementation calls `dpnp.sum` on the input array cast to the bool type. That alone is enough to demonstrate a large performance improvement in the `dpnp.count_nonzero` call time. A possible future improvement is to implement a SYCL extension with a separate kernel specific to the `count_nonzero` operation.
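
For illustration, the idea of counting nonzero elements by summing a bool-cast array can be sketched as follows. This is a minimal sketch, not the code from this PR: it uses NumPy as a stand-in so it runs without a SYCL device, the helper name `count_nonzero_via_sum` is hypothetical, and it assumes the `axis` and `keepdims` arguments behave as they do in `numpy.count_nonzero`.

```python
import numpy as np


def count_nonzero_via_sum(a, axis=None, keepdims=False):
    """Count nonzero elements by summing the array cast to bool.

    Hypothetical helper for illustration only; NumPy stands in for dpnp.
    """
    a = np.asarray(a)
    # Casting to bool maps every nonzero element to True and every zero to False.
    mask = a.astype(bool)
    # Summing the mask with an integer accumulator yields the nonzero count.
    return np.sum(mask, axis=axis, keepdims=keepdims, dtype=np.intp)


a = np.array([[0, 1, 7, 0], [3, 0, 2, 19]])
total = count_nonzero_via_sum(a)
per_column = count_nonzero_via_sum(a, axis=0)
per_row = count_nonzero_via_sum(a, axis=1, keepdims=True)
```

Because the reduction is delegated to a sum, arguments such as `axis` and `keepdims` can be passed straight through, which is presumably what makes this approach a convenient way to cover the remaining arguments.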