Expanded doc-string of dpctl.tensor.usm_ndarray.to_device #990

Merged
merged 1 commit into from
Nov 16, 2022

36 changes: 33 additions & 3 deletions dpctl/tensor/_usmarray.pyx
@@ -638,13 +638,43 @@ cdef class usm_ndarray:
         res.array_namespace_ = self.array_namespace_
         return res

-    def to_device(self, target_device):
+    def to_device(self, target):
         """
-        Transfer array to target device
+        Transfers this array to specified target device.
+
+        :Example:
+            .. code-block:: python
+
+                import dpctl
+                import dpctl.tensor as dpt
+
+                x = dpt.full(10**6, 2, dtype="int64")
+                q_prof = dpctl.SyclQueue(
+                    x.sycl_device, property="enable_profiling")
+                # return a view with profile-enabled queue
+                y = x.to_device(q_prof)
+                timer = dpctl.SyclTimer()
+                with timer(q_prof):
+                    z = y * y
Collaborator commented:

I tried the example but got an exception:

In [1]: import dpctl
   ...: import dpctl.tensor as dpt
   ...:
   ...: x = dpt.full(10**6, 2, dtype="int64")
   ...: q_prof = dpctl.SyclQueue(
   ...:     x.sycl_device, property="enable_profiling")
   ...: # return a view with profile-enabled queue
   ...: y = x.to_device(q_prof)
   ...: timer = dpctl.SyclTimer()
   ...: with timer(q_prof):
   ...:     z = y * y
   ...: print(timer.dt)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [1], in <cell line: 10>()
      9 timer = dpctl.SyclTimer()
     10 with timer(q_prof):
---> 11     z = y * y
     12 print(timer.dt)

TypeError: unsupported operand type(s) for *: 'dpctl.tensor._usmarray.usm_ndarray' and 'dpctl.tensor._usmarray.usm_ndarray'

But if I add import dpnp and run exactly the same code again, it passes:

In [2]: import dpnp

In [3]: import dpctl
   ...: import dpctl.tensor as dpt
   ...:
   ...: x = dpt.full(10**6, 2, dtype="int64")
   ...: q_prof = dpctl.SyclQueue(
   ...:     x.sycl_device, property="enable_profiling")
   ...: # return a view with profile-enabled queue
   ...: y = x.to_device(q_prof)
   ...: timer = dpctl.SyclTimer()
   ...: with timer(q_prof):
   ...:     z = y * y
   ...: print(timer.dt)
(0.10838821064680815, 0.10794714400000001)

As I understand it, this is because dpctl then dispatches to dpnp.multiply.
I believe import dpnp is missing from the example.

Contributor Author replied:

Yes, this example presently requires dpnp, because dpctl.tensor.usm_ndarray.__mul__ falls back to dpnp if it is present and returns NotImplemented otherwise. This limitation is temporary; support for Python arithmetic operations is coming soon.

To illustrate usm_ndarray.to_device, I probably should have used z = foo(y) to emphasize that any offloading routine we want to time could be used in its place.
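
Editor's sketch: the TypeError reported above is what Python raises when a binary operator method returns NotImplemented and nothing else handles the operation. The plain-Python model below illustrates that mechanism only. `ToyArray`, its `backend` argument, and `multiply` are hypothetical names invented for this illustration; they are not dpctl or dpnp code, and the real dispatch logic in dpctl may differ.

```python
class ToyArray:
    """Hypothetical stand-in for an array type; not dpctl code."""

    def __init__(self, values, backend=None):
        self.values = list(values)
        # `backend` plays the role of an optional library (dpnp in the thread).
        self.backend = backend

    def __mul__(self, other):
        if not isinstance(other, ToyArray) or self.backend is None:
            # Signal "no implementation here"; Python raises TypeError
            # if no reflected method handles the operation either.
            return NotImplemented
        return ToyArray(self.backend(self.values, other.values), self.backend)


def multiply(a, b):
    """Hypothetical element-wise multiply standing in for a backend routine."""
    return [p * q for p, q in zip(a, b)]


# Without a backend, multiplication fails the same way as in the report.
x = ToyArray([2, 2, 2])
try:
    x * x
    message = ""
except TypeError as exc:
    message = str(exc)

# With a backend present, the same expression succeeds.
y = ToyArray([2, 2, 2], backend=multiply)
z = y * y
```

Here `message` holds the "unsupported operand type(s) for *" text and `z.values` holds the element-wise products, mirroring the two sessions in the review comment.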

+                print(timer.dt)
+
+        Args:
+            target: array API concept of target device.
+                It can be a oneAPI filter selector string,
+                an instance of :class:`dpctl.SyclDevice` corresponding to a
+                non-partitioned SYCL device, an instance of
+                :class:`dpctl.SyclQueue`, or a :class:`dpctl.tensor.Device`
+                object returned by :attr:`dpctl.tensor.usm_array.device`.
+
+        Returns:
+            A view if data copy is not required, and a copy otherwise.
+            If copying is required, it is done by copying from the original
+            allocation device to the host, followed by copying from host
+            to the target device.
         """
         cdef c_dpctl.DPCTLSyclQueueRef QRef = NULL
         cdef c_dpmem._Memory arr_buf
-        d = Device.create_device(target_device)
+        d = Device.create_device(target)
         if (d.sycl_context == self.sycl_context):
             arr_buf = <c_dpmem._Memory> self.usm_data
             QRef = (<c_dpctl.SyclQueue> d.sycl_queue).get_queue_ref()
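
Editor's sketch: the Returns section of the new docstring describes a view when no data copy is needed and a copy staged through the host otherwise, and the visible code branches on whether the SYCL contexts match. The toy model below mimics that decision in plain Python. `ModelArray` and its string-valued `context` and `queue` fields are hypothetical and stand in for USM memory, SYCL contexts, and SYCL queues; this is not the dpctl implementation, whose remaining lines are not shown in this hunk.

```python
class ModelArray:
    """Hypothetical model of an array bound to a context and a queue."""

    def __init__(self, buffer, context, queue):
        self.buffer = buffer      # stands in for a USM allocation
        self.context = context    # stands in for a SYCL context
        self.queue = queue        # stands in for a SYCL queue

    def to_device(self, context, queue):
        if context == self.context:
            # Same context: return a view that shares the buffer
            # and only associates it with the new queue.
            return ModelArray(self.buffer, context, queue)
        # Different context: stage through the host, then copy to the target.
        host_copy = list(self.buffer)          # device -> host
        target_buffer = list(host_copy)        # host -> target device
        return ModelArray(target_buffer, context, queue)


a = ModelArray([1, 2, 3], context="ctx0", queue="q0")
view = a.to_device("ctx0", "q_prof")   # shares memory with `a`
copy = a.to_device("ctx1", "q1")       # owns separate memory
```

In this model, `view.buffer is a.buffer` holds while `copy.buffer` is a distinct list with equal contents, which is the distinction the docstring draws between the two return cases.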