-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: move range-generation functions to EA mixin classes #22016
Conversation
Hello @jbrockmendel! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 25, 2018 at 19:12 Hours UTC |
start = start.tz_localize(tz) | ||
end = end.tz_localize(tz) | ||
arr = np.linspace(start.value, end.value, periods) | ||
index = cls._simple_new(arr.astype('M8[ns]'), freq=None, tz=tz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of two things changed from the existing version. ATM this is:
index = tools.to_datetime(np.linspace(start.value,
end.value, periods),
utc=True)
index = index.tz_convert(tz)
Changed to avoid the dependency on tools
.
pandas/core/arrays/datetimes.py
Outdated
dates = list(xdr) | ||
# utc = len(dates) > 0 and dates[0].tzinfo is not None | ||
values = np.array([x.value for x in dates]) | ||
data = cls._simple_new(values, freq=freq, tz=tz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second change from the existing version. ATM this reads:
# utc = len(dates) > 0 and dates[0].tzinfo is not None
data = tools.to_datetime(dates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can go ahead and blow away the commented line of code above and just directly use xdr
:
values = np.array([x.value for x in xdr])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will do. I'd like to track down the source of the comment and make sure it isn't something important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm this list conversion is problematic
Codecov Report
@@ Coverage Diff @@
## master #22016 +/- ##
==========================================
+ Coverage 92.02% 92.03% +<.01%
==========================================
Files 170 170
Lines 50707 50743 +36
==========================================
+ Hits 46661 46699 +38
+ Misses 4046 4044 -2
Continue to review full report at Codecov.
|
pandas/core/arrays/datetimelike.py
Outdated
""" | ||
if is_period_dtype(cls): | ||
# Frequency validation is not meaningful for Period Array/Index | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be explicit about the return
pandas/core/arrays/datetimelike.py
Outdated
raise TypeError("cannot add {dtype}-dtype to {cls}" | ||
.format(dtype=other.dtype, | ||
cls=type(self).__name__)) | ||
elif is_categorical_dtype(other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better is is_extension_array_dtype
pandas/core/arrays/datetimelike.py
Outdated
raise TypeError("cannot subtract {dtype}-dtype from {cls}" | ||
.format(dtype=other.dtype, | ||
cls=type(self).__name__)) | ||
elif is_categorical_dtype(other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
pandas/core/arrays/datetimes.py
Outdated
dates = list(xdr) | ||
# utc = len(dates) > 0 and dates[0].tzinfo is not None | ||
values = np.array([x.value for x in dates]) | ||
data = cls._simple_new(values, freq=freq, tz=tz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm this list conversion is problematic
pandas/core/arrays/datetimes.py
Outdated
|
||
|
||
def _infer_tz_from_endpoints(start, end, tz): | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for sure add doc-strings in future
Why? |
you are turning an array into a list, only to turn it back to an array |
It's the same in the existing implementation; we can open an issue if there's a real perf impact. |
ok if that's true, then this is prob an overblow concern. pls give a quick test to see if this is actually an issue. rebase in an y event. |
@@ -754,44 +593,6 @@ def isin(self, values): | |||
|
|||
return algorithms.isin(self.asi8, values.asi8) | |||
|
|||
def shift(self, n, freq=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep a stub calling the super method here as well? (as you did with the arithmetic methods or with _generate_range
)
I know it is not strictly needed here, but I have the feeling that will make it more clear that then in the future we need to update this from calling super class method to calling the composited values method (although I suppose we have tests for it that will also be a reminder to do it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For _generate_range
and a few others it is needed because the Index method needs to set result.name
.
There are a handful of other things things in the same situation as shift (e.g. several properties that are overriden by cache_readonlys) that will need to be edited if/when the change from subclassing to composition is made. I'll make a note not to forget shift
when the time comes.
thanks @jbrockmendel, pls follow up with @jorisvandenbossche comments in next PR |
@TomAugspurger I think with this PeriodArrayMixin has everything needed for comparisons+arithmetic, so PeriodBlock should be viable. |
@jbrockmendel you are planning to do a PR for that? |
IIRC Tom wanted to be informed when the arith/compare parts were ready so he could forge ahead with PeriodBlock. My current focus is on fixing DataFrame/Series ops inconsistencies by having DataFrame dispatch to Series (#22068, #22019) and getting thorough tests for these new EA subclasses (#22033) |
Thanks.
I'm hoping to knock out SparseArray this week or, so if anyone else wants
to do Period then by all means.
Otherwise I'll start on it after sparse.
…On Thu, Jul 26, 2018 at 3:55 PM jbrockmendel ***@***.***> wrote:
you are planning to do a PR for that?
IIRC Tom wanted to be informed when the arith/compare parts were ready so
he could forge ahead with PeriodBlock.
My current focus is on fixing DataFrame/Series ops inconsistencies by
having DataFrame dispatch to Series (#22068
<#22068>, #22019
<#22019>) and getting thorough
tests for these new EA subclasses (#22033
<#22033>)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#22016 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHIsw1L1ZtLtEcms0BN-7aHudIiGqrks5uKiy0gaJpZM4VZwvc>
.
|
@jbrockmendel since you are now very familiar with code, I would say, certainly go ahead (if you have time of course) |
With the DatetimeArray range-generating functions moved, we are finally able to move
shift
, and in turn__add__
,__sub__
, etc.Two non-trivial changes made during the move process, see inline comments.
Upcoming commits will add docstrings and port tests. ATM many tests are a PITA bc the Array contructors don't know how to handle lists of strings. Trying to find a way to implement that in
__new__
without having it become a total mess.