-
-
Notifications
You must be signed in to change notification settings - Fork 18.8k
TST/CLN: correctly skip in indexes/common; add test for duplicated #21902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,7 +37,7 @@ def verify_pickle(self, indices): | |
def test_pickle_compat_construction(self): | ||
# this is testing for pickle compat | ||
if self._holder is None: | ||
return | ||
pytest.skip('Skip check for uncertain type') | ||
|
||
# need an object to create with | ||
pytest.raises(TypeError, self._holder) | ||
|
@@ -236,7 +236,7 @@ def test_set_name_methods(self, indices): | |
|
||
# don't tests a MultiIndex here (as its tested separated) | ||
if isinstance(indices, MultiIndex): | ||
return | ||
pytest.skip('Skip check for MultiIndex') | ||
original_name = indices.name | ||
new_ind = indices.set_names([new_name]) | ||
assert new_ind.name == new_name | ||
|
@@ -333,7 +333,8 @@ def test_copy_and_deepcopy(self, indices): | |
from copy import copy, deepcopy | ||
|
||
if isinstance(indices, MultiIndex): | ||
return | ||
pytest.skip('Skip check for MultiIndex') | ||
|
||
for func in (copy, deepcopy): | ||
idx_copy = func(indices) | ||
assert idx_copy is not indices | ||
|
@@ -344,18 +345,46 @@ def test_copy_and_deepcopy(self, indices): | |
|
||
def test_duplicates(self, indices): | ||
if type(indices) is not self._holder: | ||
return | ||
pytest.skip('Can only check if we have the correct type') | ||
if not len(indices) or isinstance(indices, MultiIndex): | ||
return | ||
# MultiIndex tested separately in: | ||
# tests/indexes/multi/test_unique_and_duplicates | ||
pytest.skip('Skip check for empty Index and MultiIndex') | ||
|
||
idx = self._holder([indices[0]] * 5) | ||
assert not idx.is_unique | ||
assert idx.has_duplicates | ||
|
||
@pytest.mark.parametrize('keep', ['first', 'last', False]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this a duplicate these (pun intended) or are not testing indices duplicated currently? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so these are not relevant?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback No, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you point to the coverage that shows this is NOT tested? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback I never said that it is not tested, just that it is only tested implicitly. Any call to |
||
def test_duplicated(self, indices, keep): | ||
if type(indices) is not self._holder: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you use isinstance here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. isinstance of self._holder, is this what we do elsewhere? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback If I change
to
I get 9 failures (instead of skips), all of which are from
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback Seem to have found something: in my normal workbook, I get
but for some reason, within
Same happens for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Opened a follow-up: #22211 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nevermind, I had gotten confused between instances and classes. So In any case, under these circumstances, I'm even more convinced that it's best to just stay with the |
||
pytest.skip('Can only check if we know the index type') | ||
if not len(indices) or isinstance(indices, MultiIndex): | ||
# MultiIndex tested separately in: | ||
# tests/indexes/multi/test_unique_and_duplicates | ||
pytest.skip('Skip check for empty Index and MultiIndex') | ||
|
||
idx = self._holder(indices) | ||
if idx.has_duplicates: | ||
# We need to be able to control creation of duplicates here | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this comment is a bit obtuse, can you reword |
||
# This is slightly circular, as drop_duplicates depends on | ||
# duplicated, but in the end, it all works out because we | ||
# cross-check with Series.duplicated | ||
idx = idx.drop_duplicates() | ||
|
||
n, k = len(idx), 10 | ||
duplicated_selection = np.random.choice(n, k * n) | ||
expected = pd.Series(duplicated_selection).duplicated(keep=keep).values | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I hate to check as a numpy array, much prefer to check the type and use assert_index_equal or assert_series_equal. is this how the other tests are? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Um, All the manual |
||
idx = self._holder(idx.values[duplicated_selection]) | ||
|
||
result = idx.duplicated(keep=keep) | ||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
def test_unique(self, indices): | ||
# don't test a MultiIndex here (as its tested separated) | ||
# don't test a CategoricalIndex because categories change (GH 18291) | ||
if isinstance(indices, (MultiIndex, CategoricalIndex)): | ||
return | ||
pytest.skip('Skip check for MultiIndex/CategoricalIndex') | ||
|
||
# GH 17896 | ||
expected = indices.drop_duplicates() | ||
|
@@ -375,7 +404,7 @@ def test_unique_na(self): | |
def test_get_unique_index(self, indices): | ||
# MultiIndex tested separately | ||
if not len(indices) or isinstance(indices, MultiIndex): | ||
return | ||
pytest.skip('Skip check for empty Index and MultiIndex') | ||
|
||
idx = indices[[0] * 5] | ||
idx_unique = indices[[0]] | ||
|
@@ -394,7 +423,7 @@ def test_get_unique_index(self, indices): | |
|
||
# nans: | ||
if not indices._can_hold_na: | ||
return | ||
pytest.skip('Skip na-check if index cannot hold na') | ||
|
||
if needs_i8_conversion(indices): | ||
vals = indices.asi8[[0] * 5] | ||
|
@@ -423,7 +452,7 @@ def test_sort(self, indices): | |
|
||
def test_mutability(self, indices): | ||
if not len(indices): | ||
return | ||
pytest.skip('Skip check for empty Index') | ||
pytest.raises(TypeError, indices.__setitem__, 0, indices[0]) | ||
|
||
def test_view(self, indices): | ||
|
@@ -761,7 +790,7 @@ def test_equals_op(self): | |
# GH9947, GH10637 | ||
index_a = self.create_index() | ||
if isinstance(index_a, PeriodIndex): | ||
return | ||
pytest.skip('Skip check for PeriodIndex') | ||
|
||
n = len(index_a) | ||
index_b = index_a[0:-1] | ||
|
@@ -989,11 +1018,11 @@ def test_searchsorted_monotonic(self, indices): | |
# not implemented for tuple searches in MultiIndex | ||
# or Intervals searches in IntervalIndex | ||
if isinstance(indices, (MultiIndex, IntervalIndex)): | ||
return | ||
pytest.skip('Skip check for MultiIndex/IntervalIndex') | ||
|
||
# nothing to test if the index is empty | ||
if indices.empty: | ||
return | ||
pytest.skip('Skip check for empty Index') | ||
value = indices[0] | ||
|
||
# determine the expected results (handle dupes for 'right') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this actually hit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I didn't write these tests, and the inner workings of
indexes/common.py
are not immediately apparent (which files call it, what do they fillindices
with, etc.).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to leave the bare return in there, if that's preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, put a breakpoint there and see if you would.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not able to do work on any of this for 2 weeks now. I'll have a look after.