[FEA] Add an internal utility API to return an offsets column of a sliced column starting with zero #9256
Open
opened on Sep 20, 2021
For nested sliced columns that have an offsets column child, their offsets columns may contain values that do not start from zero. For example:
offsets = [5, 7, 20, ...]
Many operations on these sliced columns need to generate an output offsets column that starts with zero. For example, with the input column having offsets given above, the output offsets column should be:
offsets = [0, 2, 15, ...]
Such an output offsets column is generated simply by subtracting the first value from every value. Yes, very simple.
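To make the request concrete, here is a minimal host-side sketch of the computation in plain C++. The function name `normalize_offsets` and the use of `std::vector` are assumptions made for illustration only. This is not an existing cuDF API, and a real implementation would presumably operate on device memory and take a stream and memory resource.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of cuDF: returns a copy of `offsets`
// shifted so that the first element is zero.
std::vector<std::int32_t> normalize_offsets(std::vector<std::int32_t> const& offsets)
{
  std::vector<std::int32_t> result(offsets.size());
  if (offsets.empty()) { return result; }

  // Subtract the first offset from every offset.
  auto const first = offsets.front();
  std::transform(offsets.begin(), offsets.end(), result.begin(),
                 [first](std::int32_t v) { return v - first; });
  return result;
}
```

With sliced offsets `{5, 7, 20}` this returns `{0, 2, 15}`. The differences between consecutive offsets (the list sizes) are unchanged.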
I would like to have an internal API implementing this feature. Currently, several other APIs do this with private code in their own .cu files. For example:
lists/segmented_sort.cu
lists/drop_list_duplicates
(FYI: Add struct type support for drop_list_duplicates #9202 (comment))