Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quantile: rename interpolation arg to method #6108

Merged
merged 18 commits into from
Feb 7, 2022

Conversation

mathause
Copy link
Collaborator

@mathause mathause commented Dec 25, 2021

numpy/numpy#20327 introduces some changes to np.quantile (and related) for the upcoming numpy release (v1.22.0). It renames the interpolation keyword to method and offers some new interpolation methods. This PR does two things

  1. it restores compatibility with numpy 1.22
  2. it renames the interpolation keyword to method in xarray - this change is not strictly necessary but I thought better to be consistent with numpy
  • Tests added
  • Passes pre-commit run --all-files
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

(Side note in dask.array.percentile the method keyword is used differently from the interpolation keyword (docs). However, xarray does not use the dask function.)


TODO: need to import ArrayLike from npcompat.

xarray/core/dataset.py Outdated Show resolved Hide resolved
xarray/core/dataset.py Outdated Show resolved Hide resolved
xarray/core/variable.py Outdated Show resolved Hide resolved
@Illviljan
Copy link
Contributor

Illviljan commented Jan 6, 2022

dask.array.percentile should use the correct numpy keywords as well now, see dask/dask#8525.

Copy link
Collaborator Author

@mathause mathause left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready for review.

xarray/core/dataarray.py Show resolved Hide resolved
- nearest: ``i`` or ``j``, whichever is nearest.
- midpoint: ``(i + j) / 2``.
use when the desired quantile lies between two data points.
See numpy.quantile for available methods.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ok or should we list them explicitly? (The available methods depend on the numpy version which makes this a bit difficult).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would lean toward listing the current available methods but if not, we should make sure numpy.quantile get's picked up as a link by intersphinx.

@jhamman
Copy link
Member

jhamman commented Jan 19, 2022

+1 on making this change. However, just to note, Pandas seems to still have the the interpolation argument (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.quantile.html) so we will be introducing an inconsistency there.

interpolation: str = "linear",
q: ArrayLike,
dim: str | Sequence[Hashable] | None = None,
method: str = "linear",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use Literal["linear", etc] here instead of str.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that's a bit over the top - can I get away without it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it would be cool, definitely fine without out too! :)

@max-sixty max-sixty added plan to merge Final call for comments and removed needs review labels Jan 31, 2022
@mathause
Copy link
Collaborator Author

mathause commented Feb 7, 2022

The rtd failure looks unrelated.

@mathause mathause merged commit d47cf0c into pydata:main Feb 7, 2022
@mathause mathause deleted the quantile_interpolation_method branch February 7, 2022 09:40
@huard huard mentioned this pull request Feb 8, 2022
4 tasks
8. "median_unbiased" (*)
9. "normal_unbiased" (*)

The first three methods are discontiuous. The following discontinuous
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in "discontiuous"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants