Description
It would be nice to have an .offset attribute that gives (in bytes or in number of items of the given dtype) the position of the first element of a view within the underlying usm_data buffer. Alternatively, usm_ndarray could pick the offset up automatically from a view passed through the buffer= parameter. Or is there already a practical way to do this?
Example: I have a use case (radix sorting) where I want to reinterpret a float32 buffer as a uint32 buffer. This is possible with:
import numpy as np
import dpctl.tensor as dpt
array = dpt.arange(10, dtype=np.float32)
array_uint32 = dpt.usm_ndarray(shape=(10,), dtype=np.uint32, buffer=array)
print(array)
print(array_uint32)
output:
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
[ 0 1065353216 1073741824 1077936128 1082130432 1084227584 1086324736 1088421888 1090519040 1091567616]
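As a cross-check, the same bit patterns can be reproduced on the host with NumPy alone. This is only a host-side analogue of the snippet above, not dpctl code; the names host_array and host_uint32 are made up for the illustration.

```python
import numpy as np

# Host-side analogue of the dpctl example: ten float32 values 0.0 .. 9.0
host_array = np.arange(10, dtype=np.float32)

# ndarray.view() reinterprets the same bytes as uint32 without copying
host_uint32 = host_array.view(np.uint32)

print(host_array)
print(host_uint32)
```

Each uint32 value is the IEEE 754 single-precision bit pattern of the corresponding float, for instance 1.0 maps to 1065353216 (0x3F800000).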
Now, there are cases where I'd like to reinterpret only part of the buffer:
sub_array = array[-2:]
sub_array_uint32_wrong = dpt.usm_ndarray(shape=(2,), dtype=np.uint32, buffer=sub_array)
print(sub_array_uint32_wrong)
but this doesn't work as intended:
[ 0 1065353216]
I get the first two values of the buffer rather than the last two, meaning that when passing buffer=sub_array, it is not the memory region actually used by the view that is registered, but the whole base buffer given by sub_array.usm_data.
However, the offset argument can make it work:
sub_array_uint32_good = dpt.usm_ndarray(shape=(2,), dtype=np.uint32, buffer=sub_array, offset=8)
print(sub_array_uint32_good)
[1090519040 1091567616]
which is OK, but since I don't see a way to get the offset value from a usm_ndarray attribute, user code must maintain an additional offset quantity for views and pass it through layers of code whenever such a conversion is needed later on.
It would be nice if either usm_ndarray could work from a view without the need to pass offset explicitly, or if usm_ndarray objects exposed an offset attribute to make this possible:
sub_array_uint32_good = dpt.usm_ndarray(shape=(2,), dtype=np.uint32, buffer=sub_array, offset=sub_array.offset)
which of course currently fails with:
AttributeError: 'dpctl.tensor._usmarray.usm_ndarray' object has no attribute 'offset'
For NumPy arrays, NumPy offers this kind of tool with byte_bounds, which is another nice way of exposing it.
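To illustrate what is being asked for, here is a minimal NumPy-only sketch of how byte_bounds lets user code recover the offset of a view relative to its base array. It does not use dpctl, and the variable names are made up for the illustration. The import fallback assumes that byte_bounds lives in numpy.lib.array_utils on NumPy 2.0 and later, and in the top-level numpy namespace on older releases.

```python
import numpy as np

try:
    from numpy.lib.array_utils import byte_bounds  # NumPy >= 2.0
except ImportError:
    byte_bounds = np.byte_bounds  # older NumPy releases

base = np.arange(10, dtype=np.float32)
view = base[-2:]

# byte_bounds returns (low, high) addresses of the memory the array spans;
# the difference of the low addresses is the view's offset in bytes.
offset_bytes = byte_bounds(view)[0] - byte_bounds(base)[0]
offset_items = offset_bytes // base.itemsize

# Use the recovered offset to reinterpret only the viewed part as uint32.
reinterpreted = base.view(np.uint32)[offset_items:offset_items + view.size]

print(offset_bytes, offset_items)
print(reinterpreted)
```

An offset attribute on usm_ndarray (or a byte_bounds equivalent) would allow the same computation for device arrays, so the offset would no longer have to be tracked by hand.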