REF/TST: Fix remaining DatetimeArray with DateOffset arithmetic ops #23789

Merged
merged 25 commits into from
Nov 28, 2018
986fdbc
unrelated, change _window->libwindow
jbrockmendel Nov 19, 2018
fd75931
revert non-central
jbrockmendel Nov 19, 2018
66c866b
implement remaining methods to fix dateoffset arithmetic with DTA/TDA
jbrockmendel Nov 19, 2018
4dc17e2
xfail dataframe case
jbrockmendel Nov 19, 2018
da3459c
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 20, 2018
348a8b2
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 20, 2018
dd7e873
dont use verify_integrity, push one more level of test into parametrize
jbrockmendel Nov 20, 2018
c8351bc
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 20, 2018
23a25d1
fix broken tests
jbrockmendel Nov 20, 2018
b4ae288
comment clarification
jbrockmendel Nov 22, 2018
d1ebdbf
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 22, 2018
711ee61
dummy commit to force CI
jbrockmendel Nov 22, 2018
9338b5b
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 25, 2018
5fbe9c8
comments about cached and non-cached implementations
jbrockmendel Nov 25, 2018
c7db0e4
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 26, 2018
a4f9733
coverage
jbrockmendel Nov 26, 2018
b50fedf
special casing in freq_infer
jbrockmendel Nov 26, 2018
7e951e4
special casing followup
jbrockmendel Nov 26, 2018
317e1e7
fix simple_new args
jbrockmendel Nov 26, 2018
5de2d42
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 27, 2018
5433a71
privatize
jbrockmendel Nov 27, 2018
dc137f3
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 27, 2018
2c65f3b
more kludges
jbrockmendel Nov 27, 2018
31c5c0b
Merge branch 'master' of https://github.com/pandas-dev/pandas into et…
jbrockmendel Nov 27, 2018
c3d775e
typo fixup
jbrockmendel Nov 27, 2018
comments about cached and non-cached implementations
jbrockmendel committed Nov 25, 2018
commit 5fbe9c8483b7730d4882da10091d47427f11fd8c
13 changes: 7 additions & 6 deletions pandas/core/arrays/timedeltas.py
@@ -176,9 +176,6 @@ def __new__(cls, values, freq=None, dtype=_TD_DTYPE, copy=False):
                                  .format(inferred=inferred_freq,
                                          passed=freq.freqstr))
             elif freq is None:
-                # TODO: should this be the stronger condition `if freq_infer`?
-                # i.e what if the user passed `freq=None` and specifically
-                # wanted freq=None in the result?
                 freq = inferred_freq
                 freq_infer = False

@@ -244,15 +241,19 @@ def _validate_fill_value(self, fill_value):
                              "Got '{got}'.".format(got=fill_value))
         return fill_value

-    @property
+    # is_monotonic_increasing, is_monotonic_decreasing, and is_unique
+    # are needed by `frequencies.infer_freq`, which is called when accessing
+    # the `inferred_freq` property inside the TimedeltaArray constructor
+
+    @property  # NB: override with cache_readonly in immutable subclasses
     def is_monotonic_increasing(self):

Member

Where is this used in this PR?

Member Author

When calling the constructor with freq="infer", we end up calling tseries.frequencies.infer_freq, which accesses this property.

Member

I think I already expressed this previously, but I would prefer that we have this discussion for all the Arrays, not just for TimedeltaArray (or DatetimeArray), as this is not a datetime-specific attribute. If you don't want to have that discussion first, you can always make it a private attribute here, and check that in infer_freq (or call the algos methods there if the passed values don't have such an attribute)

Member Author

If this is a discussion you'd like to have for EAs more generally, go ahead and open an issue for it.

I would be +1 on putting these attributes in the mixin class so that they are available on all three of DTA/TDA/PA, but for now they are only needed on TDA.

Special-casing inside infer_freq (or, more specifically, _FrequencyInferer) is a pretty ugly hack that I'd rather avoid. That said, if you're going to insist on it, I might as well get it over with.

Member

@jbrockmendel I would really prefer that you leave this out of this PR (I mean adding the attributes, which in turn means special-casing this in infer_freq).

Looking at the code again, I think the easiest thing to do is to ensure that what is passed to _FrequencyInferer is an actual Index. So if we have an array, simply convert it to an Index without copying.

Member Author

Converting to an Index would break the "Array should be ignorant of Index" rule discussed elsewhere.

I'll special-case this to get this over with, but I maintain this is introducing a code smell.
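
The fallback proposed in this thread (use the attribute when the passed values expose it, otherwise compute the answer directly) can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions, not the pandas implementation: `monotonic_increasing_of`, `PlainValues`, and `ValuesWithProperty` are hypothetical names, and the real special-casing would live inside pandas' frequency-inference code.

```python
_MISSING = object()  # sentinel so that a stored False is not mistaken for "absent"


def monotonic_increasing_of(values):
    """Return True if `values` is non-decreasing.

    Uses the object's own `is_monotonic_increasing` attribute when present,
    and falls back to a direct O(n) scan when it is not.
    """
    flag = getattr(values, "is_monotonic_increasing", _MISSING)
    if flag is not _MISSING:
        return flag
    data = list(values)
    return all(a <= b for a, b in zip(data, data[1:]))


class PlainValues:
    """Hypothetical array-like that does not expose the attribute."""

    def __init__(self, data):
        self._data = list(data)

    def __iter__(self):
        return iter(self._data)


class ValuesWithProperty(PlainValues):
    """Hypothetical array-like that does expose the attribute."""

    @property
    def is_monotonic_increasing(self):
        return all(a <= b for a, b in zip(self._data, self._data[1:]))
```

Both kinds of object can then be passed to the same caller, e.g. `monotonic_increasing_of(PlainValues([1, 2, 3]))`, at the cost of a branch that the caller has to maintain, which is the trade-off the two reviewers disagree about.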

         return algos.is_monotonic(self.asi8, timelike=True)[0]

-    @property
+    @property  # NB: override with cache_readonly in immutable subclasses
     def is_monotonic_decreasing(self):
         return algos.is_monotonic(self.asi8, timelike=True)[1]

-    @property
+    @property  # NB: override with cache_readonly in immutable subclasses
     def is_unique(self):
         return len(unique1d(self.asi8)) == len(self)

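
As a rough illustration of what the three non-cached properties compute, the sketch below derives both monotonicity flags in a single pass and checks uniqueness by comparing the number of distinct values with the length. It is a pure-Python stand-in under assumed semantics: `monotonic_flags` and `all_unique` are hypothetical helpers, and any special handling that `timelike=True` adds in pandas is not modelled here.

```python
def monotonic_flags(values):
    """Return (is_increasing, is_decreasing) for a sequence in one pass.

    Ties count as both non-decreasing and non-increasing.
    """
    increasing = decreasing = True
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            increasing = False
        if cur > prev:
            decreasing = False
    return increasing, decreasing


def all_unique(values):
    """Return True if no value occurs more than once."""
    return len(set(values)) == len(values)
```

Indexing the returned pair with `[0]` and `[1]` mirrors how the diff picks the increasing and decreasing flags out of a single call.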
2 changes: 2 additions & 0 deletions pandas/core/indexes/timedeltas.py
@@ -227,6 +227,8 @@ def _format_native_types(self, na_rep=u'NaT', date_format=None, **kwargs):
     # -------------------------------------------------------------------
     # Wrapping TimedeltaArray

+    # override non-caching implementations from TimedeltaArray with
+    # _engine-based implementations that take advantage of Index immutability
     is_monotonic_increasing = Index.is_monotonic_increasing
     is_monotonic_decreasing = Index.is_monotonic_decreasing
     is_unique = Index.is_unique

Contributor

You need to time this. We have a dedicated routine in the Cython engine for this; the main reason for using it is that it is an O(n) op once the hashtable is computed, rather than a non-cached computation, which is what you did above.

Member Author

Right. TimedeltaIndex uses the _engine-based implementation, which is available because we know TDI is immutable. TimedeltaArray uses the naive implementation, at least for now. There's an issue to investigate caching on PeriodArray, which can be extended to TDA/DTA when the time comes.

Contributor

This is not obvious at all. I would expect a comment on this. Maybe even do this in a separate PR, with testing for this. I am not sure it's relevant to the changes in this PR.

Member Author

I'll add a comment. It is definitely relevant to this PR, since without implementing these in TDA, calls to infer_freq (via __new__) raise AttributeError.
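
The split discussed in this thread (recompute on a mutable array, cache on an immutable subclass) can be sketched with the standard library. This is a minimal analogy, not pandas code: it uses `functools.cached_property` in place of pandas' `cache_readonly`, and `MutableValues`, `ImmutableValues`, and the `scans` counter are hypothetical names added to make the difference observable.

```python
from functools import cached_property


class MutableValues:
    """Hypothetical array-like whose data may change after construction."""

    def __init__(self, data):
        self.data = list(data)
        self.scans = 0  # counts how often the O(n) check actually runs

    def _scan_increasing(self):
        self.scans += 1
        return all(a <= b for a, b in zip(self.data, self.data[1:]))

    @property  # recomputed on every access, so it stays correct after mutation
    def is_monotonic_increasing(self):
        return self._scan_increasing()


class ImmutableValues(MutableValues):
    """Hypothetical subclass that assumes `data` is never modified."""

    @cached_property  # computed once, then stored on the instance
    def is_monotonic_increasing(self):
        return self._scan_increasing()
```

Accessing the property twice on a `MutableValues` instance runs the scan twice, while an `ImmutableValues` instance runs it once. Caching is only safe in the second case because a cached answer would go stale if the underlying data changed.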
