Implement polyfit? #3349

Closed
@rabernat

Fitting a line (or curve) to data along a specified axis is a long-standing need of xarray users. There are many blog posts and Stack Overflow questions about how to do it.

The main use case in my domain is finding the temporal trend on a 3D variable (e.g. temperature in time, lon, lat).

Yes, you can do it with apply_ufunc, but apply_ufunc is inaccessibly complex for many users. Much of our existing API could be removed and replaced with apply_ufunc calls, but that doesn't mean we should do it.

I am proposing we add a DataArray method called polyfit. It would work like this:

import numpy as np
import xarray as xr

x_ = np.linspace(0, 1, 10)
y_ = np.arange(5)
a_ = np.cos(y_)

x = xr.DataArray(x_, dims=['x'], coords={'x': x_})
a = xr.DataArray(a_, dims=['y'])
f = a*x  # broadcasts to dims ('y', 'x')
p = f.polyfit(dim='x', deg=1)  # the proposed method

# equivalent numpy code
p_ = np.polyfit(x_, f.values.transpose(), 1)
np.testing.assert_allclose(p_[0], a_)

Numpy's polyfit function is already vectorized in the sense that it accepts 1D x and 2D y, performing the fit independently over each column of y. To extend this to ND, we would just need to reshape the data going in and out of the function. We do this already in other packages. For dask, we could simply require that the dimension over which the fit is calculated be contiguous, and then call map_blocks.
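To make the reshaping idea concrete, here is a minimal NumPy-only sketch. The helper name polyfit_along_axis and its signature are hypothetical, chosen just for illustration; this is not an existing xarray or NumPy function, and it ignores coordinates, NaN handling, and dask entirely.

```python
import numpy as np


def polyfit_along_axis(x, y, deg, axis=-1):
    """Hypothetical helper: fit degree-`deg` polynomials along `axis` of N-D `y`."""
    y = np.asarray(y)
    # Move the fit axis to the front, since np.polyfit fits along axis 0 of y.
    y_moved = np.moveaxis(y, axis, 0)
    other_shape = y_moved.shape[1:]
    # Collapse all remaining axes into one "column" axis: (n_x, n_columns).
    y_2d = y_moved.reshape(y_moved.shape[0], -1)
    # np.polyfit fits each column independently: result is (deg + 1, n_columns).
    coeffs_2d = np.polyfit(x, y_2d, deg)
    # Restore the original non-fit axes behind a leading coefficient axis.
    return coeffs_2d.reshape((deg + 1,) + other_shape)


# Example: a (3, 4, 10) array whose last axis is a line with a known slope.
time = np.linspace(0.0, 1.0, 10)
slopes = np.arange(12, dtype=float).reshape(3, 4)
data = slopes[..., np.newaxis] * time + 2.0
coeffs = polyfit_along_axis(time, data, deg=1, axis=-1)
```

Here coeffs has a leading axis of length deg + 1 (highest power first, following np.polyfit), followed by the non-fit axes of the input. A DataArray method would additionally need to map dimension names to axes and attach coordinates to the result.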

Thoughts?
