Feature/advanced indexing #1095

Merged: 61 commits, Mar 4, 2023
Commits (61)
831a5cb
Fixed typo in docstring
oleksandr-pavlyk Feb 11, 2023
402c1d6
Fixed TODO in utility _zero_like
oleksandr-pavlyk Feb 14, 2023
d7fc400
Extended _basic_slice_meta to process advanced indexing specs
oleksandr-pavlyk Feb 14, 2023
bcc305b
Added prototype with mock implementation of extract/place/take/put
oleksandr-pavlyk Feb 14, 2023
a1791e2
Change mod of _slicing.pxi to be non-executable
oleksandr-pavlyk Feb 14, 2023
ea339eb
Added ExecutionPlacementError
oleksandr-pavlyk Feb 14, 2023
483a423
Factored out dpt.dtype and dpt.bool, etc. definitions into dedicated …
oleksandr-pavlyk Feb 14, 2023
efcd9cb
Deployed lazy implementation of advanced indexing to develop tests
oleksandr-pavlyk Feb 14, 2023
04e4b51
Ensure that rhs can be a scalar or numpy array
oleksandr-pavlyk Feb 14, 2023
6589dfa
Added new test file to test indexing
oleksandr-pavlyk Feb 14, 2023
d66c494
Added _constants, and extended advanced indexing tests
oleksandr-pavlyk Feb 15, 2023
61917a5
More tests for advanced indexing
oleksandr-pavlyk Feb 15, 2023
c2d7928
Removed proto/ folder
oleksandr-pavlyk Feb 15, 2023
832a981
Implemented advanced indexing kernels
ndgrigorian Feb 22, 2023
e0ae1ff
Merge remote-tracking branch 'origin/master' into feature/advanced-in…
oleksandr-pavlyk Feb 24, 2023
6239eb7
Changes to advanced indexing
ndgrigorian Feb 24, 2023
a0895be
Changes to advanced_indexing.cpp
ndgrigorian Feb 24, 2023
728b8e6
Fixed missing cast for indices clip/wrap
ndgrigorian Feb 25, 2023
333f9e6
Fixed error from dpt.flip(dpt.arange(5))[dpt.arange(2)]
oleksandr-pavlyk Feb 26, 2023
ab79d84
More tests for advanced indexing
oleksandr-pavlyk Feb 26, 2023
a966830
Fixed error from dpt.flip(dpt.arange(5))[dpt.arange(2)]
oleksandr-pavlyk Feb 26, 2023
8523d8e
More tests for advanced indexing
oleksandr-pavlyk Feb 26, 2023
81ba473
Adding basic take, and basic put tests
oleksandr-pavlyk Feb 26, 2023
1f61206
Merge branch 'feature/advanced-indexing' into take-put-impl
oleksandr-pavlyk Feb 26, 2023
7c0c6f0
Turn debugging on for test_windows test run
oleksandr-pavlyk Feb 26, 2023
156f7f0
Added several array indexing tests
ndgrigorian Feb 27, 2023
d42b019
Put calls in tests corrected, organized put logic
ndgrigorian Feb 27, 2023
877c3c7
Test fixes
ndgrigorian Feb 27, 2023
e296d87
Moved advanced_indexing pointer range validation
ndgrigorian Feb 27, 2023
0cf7ba4
Fixed typo in advanced_indexing kernels
ndgrigorian Feb 28, 2023
fc46303
Renamed advance_indexing.*pp into integer_advanced_indexing.*pp
oleksandr-pavlyk Feb 27, 2023
51e0fbb
Initialize packed shape+strides data with zeros
oleksandr-pavlyk Feb 27, 2023
84ba81a
Ensure that indices are also kept alive
oleksandr-pavlyk Feb 27, 2023
56bb65f
Moved ctx creation into host-task-dispatching handler function.
oleksandr-pavlyk Feb 28, 2023
24d7839
Prevent dangling host tasks in indexing functions
ndgrigorian Mar 1, 2023
f84239f
Use py::gil_scoped_acquire instead of PyGILState_Ensure.
oleksandr-pavlyk Mar 1, 2023
b69a415
Make both _take and _put effectively synchronous
oleksandr-pavlyk Mar 1, 2023
f35734b
Simplified host_tasks in _put
ndgrigorian Mar 1, 2023
7fee9e4
Reordered copies in _take and _put
ndgrigorian Mar 2, 2023
d5a49c2
Reordered waits
oleksandr-pavlyk Mar 2, 2023
f06dde5
Add wait for every host task submitted.
oleksandr-pavlyk Mar 2, 2023
1387634
Advanced indices don't broadcast if 1 array passed
ndgrigorian Feb 28, 2023
d0eb7cf
Take and put tweaks
ndgrigorian Mar 2, 2023
1e67943
Fixed WrapIndex class returning negative indices
ndgrigorian Mar 2, 2023
ac9072f
Import formatting corrected in usm_ndarray getitem
ndgrigorian Mar 2, 2023
d47fbf0
Whitespace in usm_ndarray getitem imports
ndgrigorian Mar 2, 2023
138a023
Merge branch 'master' into take-put-impl
ndgrigorian Mar 2, 2023
db84c42
Refactored advanced_indexing to 1 host_task
ndgrigorian Mar 2, 2023
2446b00
Implements place, extract, nonzero kernels, and Python API for them
oleksandr-pavlyk Mar 3, 2023
c1f0081
Added missing include
oleksandr-pavlyk Mar 3, 2023
849a3ea
Hooked up boolean indexing, first attempt
oleksandr-pavlyk Mar 3, 2023
ed279d6
Changes per clang-format 11
oleksandr-pavlyk Mar 3, 2023
3ced89a
Used Strided1DCyclingIndexer in place implementations
oleksandr-pavlyk Mar 3, 2023
19691ca
Implemented dpctl.tensor.place as per documented behavior.
oleksandr-pavlyk Mar 3, 2023
03c4822
_take and _put returned event changes
ndgrigorian Mar 3, 2023
f75723b
Added tests to test_usm_ndarray_indexing
oleksandr-pavlyk Mar 3, 2023
cab0035
Fixed tests for boolean indexing
oleksandr-pavlyk Mar 3, 2023
9b94ea0
Merge pull request #1097 from IntelPython/boolean-indexing-extract-pl…
oleksandr-pavlyk Mar 3, 2023
cb32c6f
Tweaks to docstrings of extract, place, nonzero
oleksandr-pavlyk Mar 4, 2023
0a7ea0c
dpt.take and dpt.put changes
ndgrigorian Mar 4, 2023
13c5db7
Fixed rst in docstrings of extract/place
oleksandr-pavlyk Mar 4, 2023
6 changes: 4 additions & 2 deletions .github/workflows/conda-package.yml
@@ -170,7 +170,7 @@ jobs:
# echo "libintelocl.so" | tee /etc/OpenCL/vendors/intel-cpu.icd
export OCL_ICD_FILENAMES=libintelocl.so
# clinfo -l
python -m pytest -p no:faulthandler --pyargs $MODULE_NAME
python -m pytest --pyargs $MODULE_NAME

test_windows:
needs: build_windows
@@ -296,8 +296,10 @@ jobs:
conda activate dpctl_test && python -m dpctl -f
- name: Run tests
shell: cmd /C CALL {0}
env:
DPCTL_VERBOSITY: error
run: >-
conda activate dpctl_test && python -m pytest -p no:faulthandler --pyargs ${{ env.MODULE_NAME }}
conda activate dpctl_test && python -m pytest -v -s --pyargs ${{ env.MODULE_NAME }}

upload_linux:
needs: test_linux
12 changes: 4 additions & 8 deletions dpctl/apis/include/dpctl4pybind11.hpp
@@ -1035,14 +1035,10 @@ sycl::event keep_args_alive(sycl::queue q,
shp_arr[i]->inc_ref();
}
cgh.host_task([=]() {
bool guard = (Py_IsInitialized() && !_Py_IsFinalizing());
if (guard) {
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
for (std::size_t i = 0; i < num; ++i) {
shp_arr[i]->dec_ref();
}
PyGILState_Release(gstate);
py::gil_scoped_acquire acquire;

for (std::size_t i = 0; i < num; ++i) {
shp_arr[i]->dec_ref();
}
});
});
2 changes: 2 additions & 0 deletions dpctl/tensor/CMakeLists.txt
@@ -31,6 +31,8 @@ pybind11_add_module(${python_module_name} MODULE
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/copy_numpy_ndarray_into_usm_ndarray.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/copy_for_reshape.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/linear_sequences.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/integer_advanced_indexing.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/boolean_advanced_indexing.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/eye_ctor.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/full_ctor.cpp
${CMAKE_CURRENT_SOURCE_DIR}/libtensor/source/triul_ctor.cpp
45 changes: 29 additions & 16 deletions dpctl/tensor/__init__.py
@@ -21,8 +21,6 @@

"""

from numpy import dtype

from dpctl.tensor._copy_utils import asnumpy, astype, copy, from_numpy, to_numpy
from dpctl.tensor._ctors import (
arange,
@@ -41,8 +39,26 @@
zeros,
zeros_like,
)
from dpctl.tensor._data_types import (
bool,
complex64,
complex128,
dtype,
float16,
float32,
float64,
int8,
int16,
int32,
int64,
uint8,
uint16,
uint32,
uint64,
)
from dpctl.tensor._device import Device
from dpctl.tensor._dlpack import from_dlpack
from dpctl.tensor._indexing_functions import extract, nonzero, place, put, take
from dpctl.tensor._manipulation_functions import (
broadcast_arrays,
broadcast_to,
@@ -68,20 +84,7 @@
from dpctl.tensor._reshape import reshape
from dpctl.tensor._usmarray import usm_ndarray

bool = dtype("bool")
int8 = dtype("int8")
int16 = dtype("int16")
int32 = dtype("int32")
int64 = dtype("int64")
uint8 = dtype("uint8")
uint16 = dtype("uint16")
uint32 = dtype("uint32")
uint64 = dtype("uint64")
float16 = dtype("float16")
float32 = dtype("float32")
float64 = dtype("float64")
complex64 = dtype("complex64")
complex128 = dtype("complex128")
from ._constants import e, inf, nan, newaxis, pi

__all__ = [
"Device",
@@ -110,6 +113,11 @@
"expand_dims",
"permute_dims",
"squeeze",
"take",
"put",
"extract",
"place",
"nonzero",
"from_numpy",
"to_numpy",
"asnumpy",
@@ -141,4 +149,9 @@
"print_options",
"usm_ndarray_repr",
"usm_ndarray_str",
"newaxis",
"e",
"pi",
"nan",
"inf",
]
24 changes: 24 additions & 0 deletions dpctl/tensor/_constants.py
@@ -0,0 +1,24 @@
# Data Parallel Control (dpctl)
#
# Copyright 2020-2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import numpy as np

newaxis = None

pi = np.pi
e = np.e
nan = np.nan
inf = np.inf
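The new `_constants` module is a thin re-export of NumPy's scalar constants plus `newaxis`; a quick sketch of how they behave (shown with plain NumPy, since the values are identical):

```python
import numpy as np

# dpt.newaxis is an alias for None, usable wherever None appears in an index
newaxis = None

a = np.arange(3)
# indexing with newaxis inserts a length-1 dimension
assert a[newaxis, :].shape == (1, 3)

# the scalar constants are ordinary Python floats borrowed from NumPy
assert abs(np.pi - 3.141592653589793) < 1e-15
assert np.isnan(np.nan) and np.isinf(np.inf)
```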
229 changes: 229 additions & 0 deletions dpctl/tensor/_copy_utils.py
@@ -13,11 +13,16 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import operator

import numpy as np
from numpy.core.numeric import normalize_axis_index

import dpctl
import dpctl.memory as dpm
import dpctl.tensor as dpt
import dpctl.tensor._tensor_impl as ti
import dpctl.utils
from dpctl.tensor._device import normalize_queue_device

__doc__ = (
@@ -382,3 +387,227 @@ def astype(usm_ary, newdtype, order="K", casting="unsafe", copy=True):
)
_copy_from_usm_ndarray_to_usm_ndarray(R, usm_ary)
return R


def _extract_impl(ary, ary_mask, axis=0):
"""Extract elements of ary by applying mask starting from slot
dimension axis"""
if not isinstance(ary, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary)}"
)
if not isinstance(ary_mask, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary_mask)}"
)
exec_q = dpctl.utils.get_execution_queue(
(ary.sycl_queue, ary_mask.sycl_queue)
)
if exec_q is None:
raise dpctl.utils.ExecutionPlacementError(
"arrays have different associated queues. "
"Use `Y.to_device(X.device)` to migrate."
)
ary_nd = ary.ndim
pp = normalize_axis_index(operator.index(axis), ary_nd)
mask_nd = ary_mask.ndim
if pp < 0 or pp + mask_nd > ary_nd:
raise ValueError(
"Parameter p is inconsistent with input array dimensions"
)
mask_nelems = ary_mask.size
cumsum = dpt.empty(mask_nelems, dtype=dpt.int64, device=ary_mask.device)
exec_q = cumsum.sycl_queue
mask_count = ti.mask_positions(ary_mask, cumsum, sycl_queue=exec_q)
dst_shape = ary.shape[:pp] + (mask_count,) + ary.shape[pp + mask_nd :]
dst = dpt.empty(
dst_shape, dtype=ary.dtype, usm_type=ary.usm_type, device=ary.device
)
hev, _ = ti._extract(
src=ary,
cumsum=cumsum,
axis_start=pp,
axis_end=pp + mask_nd,
dst=dst,
sycl_queue=exec_q,
)
hev.wait()
return dst
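For intuition, the contract of `_extract_impl` can be mimicked with plain NumPy (`extract_numpy` below is a hypothetical illustration, not dpctl API): the boolean mask is applied to `mask.ndim` consecutive axes starting at `axis`, and that block of axes collapses into a single axis of length `mask_count`.

```python
import numpy as np

def extract_numpy(ary, mask, axis=0):
    # the mask spans mask.ndim consecutive axes of ary starting at `axis`
    nd, mnd = ary.ndim, mask.ndim
    assert 0 <= axis and axis + mnd <= nd
    # collapse the masked block of axes into one flat axis ...
    shape = ary.shape[:axis] + (-1,) + ary.shape[axis + mnd:]
    flat = ary.reshape(shape)
    # ... and keep only the slots where the flattened mask is True
    return np.compress(mask.ravel(), flat, axis=axis)

a = np.arange(12).reshape(3, 4)
m = np.array([True, False, True])
r = extract_numpy(a, m, axis=0)
assert r.shape == (2, 4) and (r == a[m]).all()
```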


def _nonzero_impl(ary):
if not isinstance(ary, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary)}"
)
exec_q = ary.sycl_queue
usm_type = ary.usm_type
mask_nelems = ary.size
cumsum = dpt.empty(
mask_nelems, dtype=dpt.int64, sycl_queue=exec_q, order="C"
)
mask_count = ti.mask_positions(ary, cumsum, sycl_queue=exec_q)
indexes = dpt.empty(
(ary.ndim, mask_count),
dtype=cumsum.dtype,
usm_type=usm_type,
sycl_queue=exec_q,
order="C",
)
hev, _ = ti._nonzero(cumsum, indexes, ary.shape, exec_q)
res = tuple(indexes[i, :] for i in range(ary.ndim))
hev.wait()
return res
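`_nonzero_impl` reuses the cumulative-sum machinery: `ti.mask_positions` writes an inclusive cumsum of the flattened boolean mask (its last entry is the nonzero count), and `ti._nonzero` turns each hit's flat position into a column of an `(ndim, count)` index matrix. A hedged NumPy sketch of the same idea (`nonzero_numpy` is illustrative only):

```python
import numpy as np

def nonzero_numpy(ary):
    mask = ary.astype(bool).ravel()
    # inclusive cumsum plays the role of ti.mask_positions:
    # its last entry is the total number of nonzero elements
    cumsum = np.cumsum(mask)
    count = int(cumsum[-1]) if mask.size else 0
    # flat positions where the mask fires, unravelled per dimension
    flat_pos = np.flatnonzero(mask)
    return np.unravel_index(flat_pos, ary.shape), count

(rows, cols), n = nonzero_numpy(np.array([[0, 3], [5, 0]]))
assert n == 2
assert list(rows) == [0, 1] and list(cols) == [1, 0]
```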


def _take_multi_index(ary, inds, p):
if not isinstance(ary, dpt.usm_ndarray):
raise TypeError
queues_ = [
ary.sycl_queue,
]
usm_types_ = [
ary.usm_type,
]
if not isinstance(inds, list) and not isinstance(inds, tuple):
inds = (inds,)
all_integers = True
for ind in inds:
queues_.append(ind.sycl_queue)
usm_types_.append(ind.usm_type)
if all_integers:
all_integers = ind.dtype.kind in "ui"
exec_q = dpctl.utils.get_execution_queue(queues_)
if exec_q is None:
raise dpctl.utils.ExecutionPlacementError("")
if not all_integers:
raise IndexError(
"arrays used as indices must be of integer (or boolean) type"
)
if len(inds) > 1:
inds = dpt.broadcast_arrays(*inds)
ary_ndim = ary.ndim
p = normalize_axis_index(operator.index(p), ary_ndim)

res_shape = ary.shape[:p] + inds[0].shape + ary.shape[p + len(inds) :]
res_usm_type = dpctl.utils.get_coerced_usm_type(usm_types_)
res = dpt.empty(
res_shape, dtype=ary.dtype, usm_type=res_usm_type, sycl_queue=exec_q
)

hev, _ = ti._take(
src=ary, ind=inds, dst=res, axis_start=p, mode=0, sycl_queue=exec_q
)
hev.wait()

return res
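`_take_multi_index` implements the array-index part of `__getitem__`: all integer index arrays are broadcast against each other and consume `len(inds)` consecutive axes starting at axis `p`. A NumPy sketch of the resulting shape rule (illustrative helper, not dpctl API):

```python
import numpy as np

def take_multi_index_numpy(ary, inds, p=0):
    # index arrays broadcast together, then replace len(inds) axes at p
    inds = np.broadcast_arrays(*inds)
    index = (slice(None),) * p + tuple(inds)
    return ary[index]

a = np.arange(24).reshape(2, 3, 4)
i = np.array([0, 1])        # shape (2,)
j = np.array([[2], [1]])    # shape (2, 1)
r = take_multi_index_numpy(a, (i, j), p=1)
# res_shape = ary.shape[:p] + broadcast shape + ary.shape[p + 2:]
#           = (2,) + (2, 2) + () = (2, 2, 2)
assert r.shape == (2, 2, 2)
```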


def _place_impl(ary, ary_mask, vals, axis=0):
"""Extract elements of ary by applying mask starting from slot
dimension axis"""
if not isinstance(ary, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary)}"
)
if not isinstance(ary_mask, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary_mask)}"
)
if not isinstance(vals, dpt.usm_ndarray):
raise TypeError(
f"Expecting type dpctl.tensor.usm_ndarray, got {type(ary_mask)}"
)
exec_q = dpctl.utils.get_execution_queue(
(ary.sycl_queue, ary_mask.sycl_queue, vals.sycl_queue)
)
if exec_q is None:
raise dpctl.utils.ExecutionPlacementError(
"arrays have different associated queues. "
"Use `Y.to_device(X.device)` to migrate."
)
ary_nd = ary.ndim
pp = normalize_axis_index(operator.index(axis), ary_nd)
mask_nd = ary_mask.ndim
if pp < 0 or pp + mask_nd > ary_nd:
raise ValueError(
"Parameter p is inconsistent with input array dimensions"
)
mask_nelems = ary_mask.size
cumsum = dpt.empty(mask_nelems, dtype=dpt.int64, device=ary_mask.device)
exec_q = cumsum.sycl_queue
mask_count = ti.mask_positions(ary_mask, cumsum, sycl_queue=exec_q)
expected_vals_shape = (
ary.shape[:pp] + (mask_count,) + ary.shape[pp + mask_nd :]
)
if vals.dtype == ary.dtype:
rhs = vals
else:
rhs = dpt.astype(vals, ary.dtype)
rhs = dpt.broadcast_to(rhs, expected_vals_shape)
hev, _ = ti._place(
dst=ary,
cumsum=cumsum,
axis_start=pp,
axis_end=pp + mask_nd,
rhs=rhs,
sycl_queue=exec_q,
)
hev.wait()
return
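`_place_impl` mirrors `_extract_impl` on the write side: the mask selects `mask_count` slots across `mask.ndim` axes, and `vals` is cast to the destination dtype and broadcast to the expected shape before being written. A NumPy sketch for a mask spanning axis 0 (illustrative helper, not dpctl API):

```python
import numpy as np

def place_impl_numpy(ary, mask, vals):
    # expected rhs shape: (mask_count,) followed by the unmasked axes
    count = int(mask.sum())
    expected = (count,) + ary.shape[mask.ndim:]
    # cast to the destination dtype, then broadcast, as _place_impl does
    rhs = np.broadcast_to(np.asarray(vals, dtype=ary.dtype), expected)
    ary[mask] = rhs
    return ary

a = np.zeros((3, 2), dtype=int)
m = np.array([True, False, True])
place_impl_numpy(a, m, 5)   # a scalar broadcasts over both masked rows
assert (a == np.array([[5, 5], [0, 0], [5, 5]])).all()
```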


def _put_multi_index(ary, inds, p, vals):
if isinstance(vals, dpt.usm_ndarray):
queues_ = [ary.sycl_queue, vals.sycl_queue]
usm_types_ = [ary.usm_type, vals.usm_type]
else:
queues_ = [
ary.sycl_queue,
]
usm_types_ = [
ary.usm_type,
]
if not isinstance(inds, list) and not isinstance(inds, tuple):
inds = (inds,)
all_integers = True
for ind in inds:
if not isinstance(ind, dpt.usm_ndarray):
raise TypeError
queues_.append(ind.sycl_queue)
usm_types_.append(ind.usm_type)
if all_integers:
all_integers = ind.dtype.kind in "ui"
exec_q = dpctl.utils.get_execution_queue(queues_)
if exec_q is None:
raise dpctl.utils.ExecutionPlacementError(
"Can not automatically determine where to allocate the "
"result or performance execution. "
"Use `usm_ndarray.to_device` method to migrate data to "
"be associated with the same queue."
)
if not all_integers:
raise IndexError(
"arrays used as indices must be of integer (or boolean) type"
)
if len(inds) > 1:
inds = dpt.broadcast_arrays(*inds)
ary_ndim = ary.ndim

p = normalize_axis_index(operator.index(p), ary_ndim)
vals_shape = ary.shape[:p] + inds[0].shape + ary.shape[p + len(inds) :]

vals_usm_type = dpctl.utils.get_coerced_usm_type(usm_types_)
if not isinstance(vals, dpt.usm_ndarray):
vals = dpt.asarray(
vals, ary.dtype, usm_type=vals_usm_type, sycl_queue=exec_q
)

vals = dpt.broadcast_to(vals, vals_shape)

hev, _ = ti._put(
dst=ary, ind=inds, val=vals, axis_start=p, mode=0, sycl_queue=exec_q
)
hev.wait()

return
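`_put_multi_index` is the write-side twin of `_take_multi_index`: indices broadcast the same way, and `vals` is converted and broadcast to the shape the corresponding take would produce. A NumPy sketch (illustrative helper, not dpctl API):

```python
import numpy as np

def put_multi_index_numpy(ary, inds, p, vals):
    inds = np.broadcast_arrays(*inds)
    # vals must broadcast to the shape a take with the same indices yields
    vals_shape = ary.shape[:p] + inds[0].shape + ary.shape[p + len(inds):]
    vals = np.broadcast_to(np.asarray(vals, dtype=ary.dtype), vals_shape)
    index = (slice(None),) * p + tuple(inds)
    ary[index] = vals
    return ary

a = np.zeros((2, 3), dtype=int)
put_multi_index_numpy(a, (np.array([0, 2]),), p=1, vals=9)
assert (a == np.array([[9, 0, 9], [9, 0, 9]])).all()
```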