Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Make repeat method consistent #24395

Merged
merged 3 commits into from
Dec 23, 2018

Conversation

jschendel
Copy link
Member

Follow-up to changes made in #24349. Basically trying to make the repeat method as consistent as possible across the various implementations within the codebase.

Summary:

  • Made the function signature consistent: repeat(self, repeats, *args, **kwargs)
    • I think this could be restricted to repeat(self, repeats, axis=None) but I opted to match on the least restrictive function signature instead.
  • Made the docstrings consistent and shared where possible.
  • Removed unnecessary implementations where the base implementation works
    • Mostly marked with TODO's to remove, so seemed safe, and tests pass
  • Cleaned up related tests:
    • Condensed tests where possible
    • Expanded some test cases

cc @TomAugspurger

@jschendel jschendel added Testing pandas testing functions or related to the test suite API Design Clean labels Dec 22, 2018
@jschendel jschendel added this to the 0.24.0 milestone Dec 22, 2018
@pep8speaks
Copy link

pep8speaks commented Dec 22, 2018

Hello @jschendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on December 23, 2018 at 21:02 Hours UTC


Parameters
----------
repeats : int
Copy link
Member Author

@jschendel jschendel Dec 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that repeat can be an array of ints in the numpy implementation, and this appears to work for the pandas implementation as well. Added tests for this behavior in places it didn't previously exist.

In [2]: np.array([10, 20, 30]).repeat([1, 2, 3])
Out[2]: array([10, 20, 20, 30, 30, 30])

In [3]: pd.Index(list('abc')).repeat([1, 2, 3])
Out[3]: Index(['a', 'b', 'b', 'c', 'c', 'c'], dtype='object')

left_repeat = self.left.repeat(repeats, **kwargs)
right_repeat = self.right.repeat(repeats, **kwargs)
@Appender(_extension_array_shared_docs['repeat'] % _shared_docs_kwargs)
def repeat(self, repeats, *args, **kwargs):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base class implementation works for IntervalArray as well, but a couple off-the-cuff timings had this implementation 1.5x - 2x faster.

@@ -292,22 +292,6 @@ def test_validate_inplace(self):
with pytest.raises(ValueError):
cat.sort_values(inplace=value)

def test_repeat(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted as these appear to be testing the same thing as the base class tests in tests/extension/base/methods.py

@@ -26,22 +26,6 @@ def left_right_dtypes(request):

class TestMethods(object):

@pytest.mark.parametrize('repeats', [0, 1, 5])
def test_repeat(self, left_right_dtypes, repeats):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted as these appear to be testing the same thing as the base class tests in tests/extension/base/methods.py

@@ -317,26 +317,6 @@ def test_shift(self):
# This is tested in test_arithmetic
pass

def test_repeat(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

condensed these into test_repeat_freqstr in test_period.py

pd.period_range('2001-01-01', periods=3, freq='2D'),
marks=pytest.mark.xfail(reason='GH 24391')),
pd.PeriodIndex(['2001-01', 'NaT', '2003-01'], freq='M')])
def test_repeat_freqstr(self, index, use_numpy):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed so this doesn't override the base class tests, which test some additional things (array of ints, errors)

@@ -445,17 +451,6 @@ def test_pindex_qaccess(self):
# Todo: fix these accessors!
assert s['05Q4'] == s[2]

def test_numpy_repeat(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted since this overrides the base class implementation and appears to be doing the same thing.

@codecov
Copy link

codecov bot commented Dec 22, 2018

Codecov Report

Merging #24395 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24395      +/-   ##
==========================================
+ Coverage    92.3%    92.3%   +<.01%     
==========================================
  Files         162      162              
  Lines       51875    51872       -3     
==========================================
- Hits        47883    47882       -1     
+ Misses       3992     3990       -2
Flag Coverage Δ
#multiple 90.71% <100%> (ø) ⬆️
#single 43% <71.42%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/series.py 93.71% <ø> (ø) ⬆️
pandas/core/arrays/period.py 98.48% <ø> (-0.02%) ⬇️
pandas/core/indexes/period.py 93.08% <ø> (-0.04%) ⬇️
pandas/core/arrays/base.py 97.54% <100%> (+0.01%) ⬆️
pandas/core/indexes/multi.py 95.58% <100%> (ø) ⬆️
pandas/core/arrays/interval.py 93.02% <100%> (+0.04%) ⬆️
pandas/core/arrays/categorical.py 95.32% <100%> (ø) ⬆️
pandas/core/indexes/base.py 96.27% <100%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 97.57% <100%> (+0.28%) ⬆️
pandas/util/testing.py 87.84% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e0358d...696c02a. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 22, 2018

Codecov Report

Merging #24395 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24395      +/-   ##
==========================================
+ Coverage    92.3%    92.3%   +<.01%     
==========================================
  Files         162      163       +1     
  Lines       51875    51943      +68     
==========================================
+ Hits        47883    47946      +63     
- Misses       3992     3997       +5
Flag Coverage Δ
#multiple 90.71% <100%> (ø) ⬆️
#single 43% <73.68%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/series.py 93.71% <ø> (ø) ⬆️
pandas/core/arrays/period.py 98.47% <ø> (-0.03%) ⬇️
pandas/core/indexes/period.py 93.08% <ø> (-0.04%) ⬇️
pandas/core/arrays/base.py 97.54% <100%> (+0.01%) ⬆️
pandas/core/indexes/multi.py 95.58% <100%> (ø) ⬆️
pandas/core/arrays/interval.py 93.02% <100%> (+0.04%) ⬆️
pandas/core/arrays/categorical.py 95.32% <100%> (ø) ⬆️
pandas/core/indexes/base.py 96.27% <100%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 97.58% <100%> (+0.29%) ⬆️
pandas/core/arrays/datetimes.py 97.77% <0%> (-0.47%) ⬇️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e0358d...88ab285. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. just some questions on the docs.

@@ -1038,12 +1038,61 @@ def _set_values(self, key, value):

def repeat(self, repeats, *args, **kwargs):
"""
Repeat elements of an Series. Refer to `numpy.ndarray.repeat`
for more information about the `repeats` argument.
Repeat elements of a Series.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any way to re-use the doc-string you defined for EA (meaning here and for Index), maybe make it even more generic and parametrize on the types? (ok that we are duplicating code, but the more can share the better).
cc @datapythonista

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had the same thought but was unsure how to do it in a clean way. I think you could split the docstring to reuse to everything but the See Also and Examples section, then separately define only those sections for EA/Index/Series. Seems a bit convoluted though; certainly interested to hear if there's a better way.

2 c
2 c
dtype: object
>>> s.repeat(3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to have 2 very similar examples?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was mimicking the existing documentation for Index.repeat which follows the same pattern of examples. Agreed that the two are quite similar; changing the second example to be an array of ints example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ok, just thinking if this can be made simpler.

@jreback jreback merged commit fc7bc3f into pandas-dev:master Dec 23, 2018
@jreback
Copy link
Contributor

jreback commented Dec 23, 2018

thanks @jschendel

can probably use the _extension_array_shared_docs idiom elsewhere in DTA and friends, cc @jbrockmendel @TomAugspurger

@jschendel jschendel deleted the make-repeat-consistent branch December 23, 2018 23:44
@TomAugspurger
Copy link
Contributor

I think the signature should just be (self, repeats, axis=None). Opened #24413 for that.

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Clean Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants