Description
What happened?
The documentation of both the DataArray and Dataset method polyfit
state for the first parameter:
dim : hashable
Coordinate along which to fit the polynomials.
But looking at the code:
x = get_clean_interp_index(self, dim, strict=False)
For which the documentation states
Return index to use for x values in interpolation or curve fitting.
Parameters
----------
arr : DataArray
Array to interpolate or fit to a curve.
dim : str
Name of dimension along which to fit.
use_coordinate : str or bool
If use_coordinate is True, the coordinate that shares the name of the
dimension along which interpolation is being performed will be used as the
x values. If False, the x values are set as an equally spaced sequence.
Therefore, the x used for fitting here is not the coordinate, but the coordinate that shares the same name as a dimension.
To give an example why this is a problem:
I want to make a fit on a time series. The data has the dimension "time" for which there is a coordinate "time" which consists of datetime64 objects. However, I want to control the units of the fit parameters and have another coordinate called "time_in_days" associated with the "time" dimension, which contains the number of days since a specific date as integers.
A nice and easy fix would be to actually let dim
be the label of a coordinate (contrary to what the name suggests) and get the associated dimension to know along which axis to perform the fit.
This would not change the behavior of any existing code, but it would just allow for more versatility.
The curvefit function correctly uses coords
as a first parameter, which makes a lot more sense. But changing the name of the parameter would likely change or break some of the existing code which uses polyfit
.
What did you expect to happen?
No response
Minimal Complete Verifiable Example
No response
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
No response
Anything else we need to know?
No response
Environment
irrelevant