Copying from numpy into usm_ndarray is unnecessarily slow #723

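The enclosed script `time_copy.py` shows that `dpctl.tensor.usm_ndarray.__setitem__` is inefficient when copying a C-contiguous host buffer into a C-contiguous USM array. In the output below, the first pair of timings is for a direct `usm_data.copy_from_host` call and the second pair is for `dpt.asarray`. The `dpt.asarray` path is roughly 11,000 times slower in wall time and roughly 7,000 times slower in device time: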

```text
(idp_2021.4) [13:25:40 ansatnuc04 python]$ python time_copy.py
Wall time:  0.00044969748705625534  sec.
Device time:  0.00010292000000000001  sec.
Wall time:  4.959066528826952  sec.
Device time:  0.717467438  sec.
```

This is likely because the copy is performed one element per kernel, so the contiguity of the source and destination is not taken advantage of.
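As a rough illustration of the contiguity check such a fast path would rely on, here is a minimal host-side sketch. The helper name `as_contiguous_bytes` is hypothetical and is not part of dpctl; it only prepares the flat byte view that `time_copy.py` passes to `usm_data.copy_from_host`.

```python
import numpy as np


def as_contiguous_bytes(host_array):
    """Return a flat uint8 view of host_array, or None if it is not C-contiguous.

    Hypothetical helper for illustration only; not a dpctl API.
    """
    if not host_array.flags.c_contiguous:
        # A strided source cannot be described as one contiguous block of bytes.
        return None
    # Both reshape(-1) and view("u1") return views here, so no data is copied.
    return host_array.reshape(-1).view("u1")


host_array = np.random.random(size=8 * 1024)
byte_view = as_contiguous_bytes(host_array)
# With dpctl available, the view could then be copied in one call, as in
# time_copy.py: usm_array.usm_data.copy_from_host(byte_view)
```

When the helper returns `None`, the source is strided and would need a different copy path.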

```text
(idp_2021.4) [13:27:17 ansatnuc04 python]$ python -c "import dpctl; print(dpctl.__version__)"
0.12.0dev1+91.gb7a15ed9
```

<details>
<summary>time_copy.py script</summary>

```python
# time_copy.py
import numpy as np

import dpctl
import dpctl.tensor as dpt
import dpctl.memory as dpm

n = 8 * 1024
host_array = np.random.random(size=n)

q = dpctl.SyclQueue("gpu", property="enable_profiling")

timer0 = dpctl.SyclTimer(time_scale=1) # report duration in seconds
with timer0(q):
    # direct copy of the host bytes into the USM allocation
    usm_array = dpt.empty(host_array.shape,
                          dtype=host_array.dtype,
                          sycl_queue=q)
    usm_array.usm_data.copy_from_host(host_array.reshape(-1).view("u1"))

host_time, device_time = timer0.dt

print("Wall time: ", host_time, " sec.")
print("Device time: ", device_time, " sec.")

timer1 = dpctl.SyclTimer(time_scale=1) # report duration in seconds
with timer1(q):
    # copying using dpt.asarray
    usm_array = dpt.asarray(host_array, sycl_queue=q)

host_time, device_time = timer1.dt

print("Wall time: ", host_time, " sec.")
print("Device time: ", device_time, " sec.")
```
</details>
