Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more frequency string updates? #8612

Closed
mathause opened this issue Jan 16, 2024 · 5 comments · Fixed by #8627
Closed

more frequency string updates? #8612

mathause opened this issue Jan 16, 2024 · 5 comments · Fixed by #8627

Comments

@mathause
Copy link
Collaborator

What is your issue?

I looked a bit into the frequency string update & found 3 issues we could improve upon.

  1. Apart from "M", pandas also deprecated "Y", and "Q", in favor of "YE" and "QE". (And they are discussing renaming "MS" to "MB"). Should we do the same?

  2. Should we translate the new freq strings to the old ones if pandas < 2.2 is installed? Otherwise we get the following situation:

    import xarray as xr
    xr.date_range("1600-02-01", periods=3, freq="M") # deprecation warning
    xr.date_range("1600-02-01", periods=3, freq="ME") # ValueError: Invalid frequency: ME
  3. date_range_like can emit deprecation warnings without a way to mitigate them if pandas < 2.2 is installed. (When a DatetimeIndex) is passed. Could be nice to translate the old freq string to the new one without a warning.

I have played around with 2. and 3. and can open a PR if you are on board.

@spencerkclark @aulemahal

@mathause mathause added needs triage Issue that has not been reviewed by xarray team member API design topic-pandas-like topic-cftime needs discussion and removed needs triage Issue that has not been reviewed by xarray team member labels Jan 16, 2024
@aulemahal
Copy link
Contributor

  1. I think we should follow pandas closely, so it is a yes! It seems Q is already changed to QE in xarray, but YE is missing indeed.
  2. Indeed an automatic solution would be useful. For example, I currently pinned my envs not to use xarray >= 2023.11.0 because of that issue. I have these strings hardcoded everywhere...
  3. Yes, good idea!

@max-sixty
Copy link
Collaborator

  • Should we translate the new freq strings to the old ones if pandas < 2.2 is installed? Otherwise we get the following situation:
    import xarray as xr
    xr.date_range("1600-02-01", periods=3, freq="M") # deprecation warning
    xr.date_range("1600-02-01", periods=3, freq="ME") # ValueError: Invalid frequency: ME

No strong view, but if we're just more permissive — forwarding almost anything to pandas — this gets around the issue. And doesn't involve writing translation code that will be removed fairly soon.

@spencerkclark
Copy link
Member

If I understand correctly, the issue @mathause is noting is a bit more subtle. We already do forward to pandas by default for the standard calendar, but in this case pandas will raise an OutOfBoundsDatetime exception due to the start date being out range for nanosecond-precision datetimes. We catch this and then fall back to using cftime_range, which is implemented in xarray. For pandas < 2.2, passing "ME" as a frequency string will raise a ValueError, which short circuits this fallback logic, so in this narrow case it is not possible to address the deprecation warning emitted by xarray.

I think I am open to this, especially since we already introduced some translation logic as part of #8394; this refines that a bit more. It is nice though that pandas 2.2 was just released yesterday, which makes this less of an issue.

@max-sixty
Copy link
Collaborator

but in this case pandas will raise an OutOfBoundsDatetime exception due to the start date being out range for nanosecond-precision datetimes. We catch this and then fall back to using cftime_range, which is implemented in xarray.

👍 , thanks for the clear explanation


Could I ask how close we are to fixing the other string issues — e.g. M / ME? Those are the ones that break the doctests at the moment.

Should we ignore them in the doctests in the meantime? Or does changing the doctests to use the new string solve it?

@spencerkclark
Copy link
Member

spencerkclark commented Jan 22, 2024

Indeed changing the doctests to use the new string should solve it (fortunately we don't have to worry about supporting multiple pandas versions there). I think this is already taken care of in #8638 (modulo including the updates in the generation file too).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants