dpnp.sum(x, axis=0) is very slow on integral 64-bit types #1539

Closed
@oleksandr-pavlyk

Description

In [1]: import dpnp

In [2]: x = dpnp.reshape(dpnp.arange(1, 1 + 245612 * 24, 1, dtype="i8"), (245612, 24))

In [3]: %time y = dpnp.sum(x, axis=0, dtype="i8")
CPU times: user 1.13 s, sys: 1.11 s, total: 2.24 s
Wall time: 2.43 s

In [4]: %time y = dpnp.sum(x, axis=0, dtype="i8")
CPU times: user 1.22 s, sys: 1.03 s, total: 2.25 s
Wall time: 2.44 s

In [5]: import dpctl.tensor as dpt

In [7]: x_dpt = dpt.asarray(x)

In [8]: %time y_dpt = dpt.sum(x_dpt, axis=0, dtype="i8")
CPU times: user 156 ms, sys: 3.52 ms, total: 160 ms
Wall time: 175 ms

In [9]: %time y_dpt = dpt.sum(x_dpt, axis=0, dtype="i8")
CPU times: user 1.82 ms, sys: 1.16 ms, total: 2.97 ms
Wall time: 3.06 ms
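To make the comparison above easier to repeat, here is a minimal timing sketch. It separates the first (cold) call from later (warm) calls, since the dpctl numbers drop from 175 ms to about 3 ms on the second run while the dpnp numbers stay near 2.4 s. The helper names `time_calls`, `summarize`, and `run_issue_benchmark` are hypothetical and not part of dpnp or dpctl. The sketch also assumes each call has finished its work by the time it returns; if execution is asynchronous, the timings would undercount.

```python
import time


def time_calls(fn, repeats=3):
    """Return wall-clock seconds for each of `repeats` consecutive calls to fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Split timings into the first (cold) call and the fastest later (warm) call."""
    cold = timings[0]
    warm = min(timings[1:]) if len(timings) > 1 else cold
    return {"cold_s": cold, "warm_s": warm}


def run_issue_benchmark(repeats=3):
    """Hypothetical driver that repeats the measurements from this issue.

    Requires dpnp and dpctl; they are imported lazily so the helpers above
    stay usable without them.
    """
    import dpnp
    import dpctl.tensor as dpt

    # Same input as in the report: a (245612, 24) int64 array.
    x = dpnp.reshape(
        dpnp.arange(1, 1 + 245612 * 24, 1, dtype="i8"), (245612, 24)
    )
    x_dpt = dpt.asarray(x)

    return {
        "dpnp.sum": summarize(
            time_calls(lambda: dpnp.sum(x, axis=0, dtype="i8"), repeats)
        ),
        "dpt.sum": summarize(
            time_calls(lambda: dpt.sum(x_dpt, axis=0, dtype="i8"), repeats)
        ),
    }
```

Calling `run_issue_benchmark()` on a machine with dpnp and dpctl installed should return cold and warm timings for both code paths. A warm dpnp time that stays orders of magnitude above the warm dpctl time would match what is reported here.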

@AlexanderKalistratov
