-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lock down kwargs in offsets signatures #17458
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17458 +/- ##
==========================================
- Coverage 91.15% 91.14% -0.02%
==========================================
Files 163 163
Lines 49591 49588 -3
==========================================
- Hits 45207 45195 -12
- Misses 4384 4393 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17458 +/- ##
==========================================
- Coverage 91.26% 91.24% -0.03%
==========================================
Files 163 163
Lines 49872 49978 +106
==========================================
+ Hits 45514 45600 +86
- Misses 4358 4378 +20
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a whatsnew note in apbreaking (maybe a sub-section), where you note which thinks are changing, and how it could potentitally affect the user. In future PR's can expand this as things are added.
doc/source/whatsnew/v0.21.0.txt
Outdated
Restricting DateOffset Keywords | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, subclasses :class:`Week`, :class:`WeekOfMonth`, :class:`LastWeekOfMonth`, :class:`QuarterOffset`, :class:`QuarterEnd`, :class:`QuarterBegin`, :class:`BQuarterEnd`, and :class:`BQuarterBegin`, will allow only specific arguments. (:issue:`17176`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aren't Month/MonthEnd included here? why not all classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aren't Month/MonthEnd included here?
Not as of now, no.
why not all classes?
Wanted to change them a couple at a time, err on the side of doing too little. Now that #17486 is resolved I can push forward on a few more. You want to just get them all in one go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes for this type of change am ok with doing it in one fell swoop. On the whatsnew just highlite the change generally (e.g. give a single example), then list the classes that are affected (or if its all, then just say all). imagine a user who just 'uses' pandas. They want to quickly get to: does this change affect me, if so, what do I need to do to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in-line comment
ok, rebase this and i'll look again. |
…fset_sigs Lock down kwargs in Week.__init__
Hello @jbrockmendel! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on October 06, 2017 at 03:52 Hours UTC |
OK, as requested, the new commits cover all of the existing classes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Feel free to merge @jreback
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -387,6 +387,13 @@ New Behavior | |||
|
|||
Additionally, DataFrames with datetime columns that were parsed by :func:`read_sql_table` and :func:`read_sql_query` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns. | |||
|
|||
.. _whatsnew_0210.api.restricting_dateoffset_keywords | |||
|
|||
Restricting DateOffset Keywords |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, since we made this simpler, move to just a bullet point in Other API changes section.
@@ -1121,7 +1125,7 @@ class SemiMonthOffset(DateOffset): | |||
_default_day_of_month = 15 | |||
_min_day_of_month = 2 | |||
|
|||
def __init__(self, n=1, day_of_month=None, normalize=False, **kwds): | |||
def __init__(self, n=1, normalize=False, day_of_month=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this consistent with other freq signatures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the other signatures are (self, n=1, normalize=False, ...
. In the status quo this is an outlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that's fine. see my comment below though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine, in an ideal world we would change it but can't now
""" | ||
|
||
def __init__(self, n=1, normalize=False, **kwds): | ||
def __init__(self, n=1, normalize=False, weekday=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a test for this as weekday=None? isn't that an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an error, though a different one in Py2 vs Py3. In the status quo not passing a weekday explicitly will raise a KeyError. Default to Monday I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no if weekday is None, just raise a value error, in reality this should actually be the first (and a required parameter), but I guess that would be hard to change. Don't set defaults, raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting it should be __init__(self, weekday, n=1, normalize=False)
? I see some appeal to that, but there is a lot of symmetry with other signatures all starting with (self, n=1, normalize=False, ...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my comment above. this is ok for now. actually weekday is optional, so this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things. First, here and in other places you say "this is fine" but I don't know whether you're referring to the PR or the thing the PR is fixing. Based on the go-ahead you've given below to look over timeseries.rst I'm assuming you mean the PR version is fine, just heads up for ways I'm liable to get confused next time around.
actually weekday is optional,
Under the status quo, failing to pass weekday
raises a KeyError
. In the PR, not passing anything will raise a ValueError in py2 and TypeError in py3. (The same errors would result in the status quo if a user passed None
explicitly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never mean this entire PR is fine. If that was the case I wouldn't have any other comments. Individual comments are revering to the specific line/section where I make it.
@@ -1829,13 +1835,14 @@ class QuarterOffset(DateOffset): | |||
# TODO: Consider combining QuarterOffset and YearOffset __init__ at some | |||
# point | |||
|
|||
def __init__(self, n=1, normalize=False, **kwds): | |||
def __init__(self, n=1, normalize=False, startingMonth=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make an issue that this should be deprecated to starting_month
(and prob others like this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine, can you pass thru the timeseries.rst docs and if anything needs changing there (should be ok), but who knows
S'YearBegin' | ||
p1 | ||
ccopy_reg | ||
_reconstructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not what i asked for at all
pls use the generate_legacy_data script
we have removed all of these ad hoc pickle things
we r not adding them back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You asked me to run a script that doesn't work, then to troubleshoot it until it does. I found existing tests and followed their example. Please cut me some darn slack.
we have removed all of these ad hoc pickle things
test_pickle_v0_15_2
and test_pickle_compat_0_14_1
might lead a naive reader (such as myself) to suspect otherwise.
you can use pandas versions to do that ONLY in that script to do that.
This sentence confuses me.
If you make changes [to the generation script] then they should be checked in as part of this PR.
I have no interest in making changes to the non-functional generation script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no interest in making changes to the non-functional generation script.
well it works fine according to the instructions I gave and has been used for quite a while.
test_pickle_v0_15_2 and test_pickle_compat_0_14_1 might lead a naive reader
well if you look at what exists, the vast majority are in an organized format. The remaining ones should actually be converted to this format.
(pandas) bash-3.2$ find pandas -name '*pickle*'
pandas/compat/__pycache__/pickle_compat.cpython-36.pyc
pandas/compat/pickle_compat.py
pandas/io/__pycache__/pickle.cpython-36.pyc
pandas/io/pickle.py
pandas/tests/indexes/data/mindex_073.pickle
pandas/tests/indexes/data/multiindex_v1.pickle
pandas/tests/io/__pycache__/test_pickle.cpython-36-PYTEST.pyc
pandas/tests/io/data/categorical_0_14_1.pickle
pandas/tests/io/data/categorical_0_15_2.pickle
pandas/tests/io/data/legacy_pickle
pandas/tests/io/data/legacy_pickle/0.10.1/AMD64_windows_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.10.1/x86_64_linux_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.11.0/0.11.0_x86_64_linux_3.3.0.pickle
pandas/tests/io/data/legacy_pickle/0.11.0/x86_64_linux_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.11.0/x86_64_linux_3.3.0.pickle
pandas/tests/io/data/legacy_pickle/0.12.0/0.12.0_AMD64_windows_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.12.0/0.12.0_x86_64_linux_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_AMD64_windows_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_i686_linux_2.6.5.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_i686_linux_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_i686_linux_3.2.3.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_x86_64_darwin_2.7.5.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_x86_64_darwin_2.7.6.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_x86_64_linux_2.7.3.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_x86_64_linux_2.7.8.pickle
pandas/tests/io/data/legacy_pickle/0.13.0/0.13.0_x86_64_linux_3.3.0.pickle
pandas/tests/io/data/legacy_pickle/0.14.0/0.14.0_x86_64_darwin_2.7.6.pickle
pandas/tests/io/data/legacy_pickle/0.14.0/0.14.0_x86_64_linux_2.7.8.pickle
pandas/tests/io/data/legacy_pickle/0.14.1/0.14.1_x86_64_darwin_2.7.12.pickle
pandas/tests/io/data/legacy_pickle/0.14.1/0.14.1_x86_64_linux_2.7.8.pickle
pandas/tests/io/data/legacy_pickle/0.15.0/0.15.0_x86_64_darwin_2.7.12.pickle
pandas/tests/io/data/legacy_pickle/0.15.0/0.15.0_x86_64_linux_2.7.8.pickle
pandas/tests/io/data/legacy_pickle/0.15.2/0.15.2_x86_64_darwin_2.7.9.pickle
pandas/tests/io/data/legacy_pickle/0.16.0/0.16.0_x86_64_darwin_2.7.9.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_AMD64_windows_2.7.10.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_AMD64_windows_3.4.3.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_x86_64_darwin_2.7.10.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_x86_64_darwin_2.7.9.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_x86_64_darwin_3.4.3.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_x86_64_linux_2.7.10.pickle
pandas/tests/io/data/legacy_pickle/0.16.2/0.16.2_x86_64_linux_3.4.3.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_AMD64_windows_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_AMD64_windows_3.4.4.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_x86_64_darwin_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_x86_64_darwin_3.4.4.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_x86_64_linux_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.0_x86_64_linux_3.4.4.pickle
pandas/tests/io/data/legacy_pickle/0.17.0/0.17.1_AMD64_windows_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.17.1/0.17.1_AMD64_windows_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.17.1/0.17.1_x86_64_darwin_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.18.0/0.18.0_AMD64_windows_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.18.0/0.18.0_AMD64_windows_3.5.1.pickle
pandas/tests/io/data/legacy_pickle/0.18.0/0.18.0_x86_64_darwin_2.7.11.pickle
pandas/tests/io/data/legacy_pickle/0.18.0/0.18.0_x86_64_darwin_3.5.1.pickle
pandas/tests/io/data/legacy_pickle/0.18.1/0.18.1_x86_64_darwin_2.7.12.pickle
pandas/tests/io/data/legacy_pickle/0.18.1/0.18.1_x86_64_darwin_3.5.2.pickle
pandas/tests/io/data/legacy_pickle/0.19.2/0.19.2_x86_64_darwin_2.7.12.pickle
pandas/tests/io/data/legacy_pickle/0.19.2/0.19.2_x86_64_darwin_3.6.1.pickle
pandas/tests/io/test_pickle.py
pandas/tests/tseries/data/cday-0.14.1.pickle
pandas/tests/tseries/data/dateoffset_0_15_2.pickle
As you know pandas is a very large library. If things do not have process & procedure around them, then future generations will have pain. We have worked pretty hard (and you are contributing to this), to reduce technical debt.
see #17733 |
@jbrockmendel |
I think I covered this case |
Thanks for fixing the script. Added a file for 0.20.3 for kicks. So since
There's some subject-verb ambiguity here. What did you cover/run? What can be removed?
So go into |
no you can just generate a new one. We only compare to things that exist in the pickle AND in the generate script. So as long as you simply 'add' things its ok (e.g. don't rename or delete). I added an offset section, you can simply add moar things to it (or even another version of it that say initializes with keyword args or somesuch). |
OK, so I generated a new file by running the script under 0.20.3. Then... what? The existing pickle tests in tests.tseries.test_offsets are precedent you've explicitly said not to follow. |
pandas/tests/tseries/test_offsets.py
Outdated
@@ -502,6 +502,31 @@ def test_pickle_v0_15_2(self): | |||
# | |||
tm.assert_dict_equal(offsets, read_pickle(pickle_path)) | |||
|
|||
def test_pickle_v0_20_3(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
none of this is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these go in the generate_legacy_storage_files script instead. (which you then use to generate). I think the instructions are now pretty clear. If they are not, feel free to update them as well (at the top of that file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, just removed that test. As to the instructions, I'm still unclear on where the actual test goes and what it looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test is generic it runs ALL of the pickle files
that’s the point you don’t actually need to construct a test just put a new pickle file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. test_pickle_v0_20_3
has been removed, so it sounds like we're good to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you make changes in generate_legacy_storage_files? or just generate 0.20.3? I wanted you to update that with passing keywords so we can cover that as well, e.g.
ideally we would cover kwargs for most of the offsets.
In [3]: offsets.SemiMonthBegin(1, normalize=True, day_of_month=4)
Out[3]: <SemiMonthBegin: day_of_month=4>
This then assures that we have backcompat.
Just added a handful of new examples to
|
also seems we should update the doc-strings for the offsets now that the signature are ok, xref #8435 |
This is because we have a top-level file called |
ok this looks fine. Can you update the doc-strings? (also can do in another PR). now that the keywords are there they should be documented. lmk |
This PR has been exhausting. I'll put this on the TODO list. |
@jbrockmendel hahah, ok thanks1 |
Explicitly specify allowed kwargs in
__init__
forWeek
,WeekOfMonth
,LastWeekOfMonth
andQuarterOffset
.Ref #17176
Intended to be orthogonal/complementary to #17450.
Will add tests after confirming we're on the same page.
git diff upstream/master -u -- "*.py" | flake8 --diff