perf improvement for interp: set assume_sorted automatically #9758

Open
@dcherian

What is your issue?

assume_sorted defaults to False, so for vectorized interpolation across multiple dimensions we end up lexsorting the coordinates on every call. For some reason, this can be quite slow with dask.

obj = self if assume_sorted else self.sortby(list(coords))

Instead we should be able to do

obj = self
# sort by slicing if we can: a monotonically decreasing index only needs reversing
for coord in set(indexers) & set(self._indexes):
    # TODO: better check for PandasIndex
    if self.indexes[coord].is_monotonic_decreasing:
        obj = obj.isel({coord: slice(None, None, -1)})

# TODO: make None the new default
if assume_sorted is None:
    # TODO: dims without coordinates are fine too
    # check obj (not self) so that indexes reversed above count as sorted
    assume_sorted = all(obj.indexes[coord].is_monotonic_increasing for coord in indexers)
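To make the idea concrete without depending on xarray internals, here is a minimal pure-Python sketch of the same logic on plain lists. The helper names (is_increasing, is_decreasing, order_by_slicing) are hypothetical and not part of xarray or pandas. The monotonicity checks are non-strict, which is how I understand pandas' is_monotonic_increasing and is_monotonic_decreasing to behave.

```python
from typing import List, Sequence, Tuple


def is_increasing(values: Sequence) -> bool:
    """Non-strict monotonic increase: each element is <= the next one."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_decreasing(values: Sequence) -> bool:
    """Non-strict monotonic decrease: each element is >= the next one."""
    return all(a >= b for a, b in zip(values, values[1:]))


def order_by_slicing(coord: Sequence, data: Sequence) -> Tuple[List, List, bool]:
    """Return (coord, data, assume_sorted) without ever calling a sort.

    Hypothetical helper: a decreasing coordinate is reversed with a
    slice, and the returned flag says whether the result can be
    treated as sorted. If the flag is False, the caller still needs a
    real sort.
    """
    if is_increasing(coord):
        # Already sorted, nothing to do.
        return list(coord), list(data), True
    if is_decreasing(coord):
        # Reversing is a linear-time slice, not an O(n log n) sort.
        return list(coord[::-1]), list(data[::-1]), True
    # Neither monotonic direction: leave untouched and report unsorted.
    return list(coord), list(data), False


coord, data, assume_sorted = order_by_slicing([30, 20, 10], ["c", "b", "a"])
print(coord, data, assume_sorted)
```

Only inputs that are monotonic in neither direction would still need the existing sortby fallback.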

I'll add a reproducible example later, but graph construction for the problem I've been playing with gets much faster with this change.

xref #6799

cc @mpiannucci @Illviljan
