Skip to content

TST/CLN: break up & parametrize tests for df.set_index #22236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Sep 15, 2018
Prev Previous commit
Next Next commit
Refactor box constructors
  • Loading branch information
h-vetinari committed Sep 15, 2018
commit 81fd3fdafcf774952a171ab68ca7681c7fa0aa13
32 changes: 24 additions & 8 deletions pandas/tests/frame/test_alter_axes.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,9 +22,6 @@

from pandas.tests.frame.common import TestData

key = lambda x: x.name
mi = lambda x: MultiIndex.from_arrays([x])


class TestDataFrameAlterAxes(TestData):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you shouldn't need TestData any longer


Expand Down Expand Up @@ -116,14 +113,18 @@ def test_set_index_after_mutation(self):
tm.assert_frame_equal(result, expected)

# also test index name if append=True (name is duplicate here for B)
@pytest.mark.parametrize('box', [Series, Index, np.array, mi])
@pytest.mark.parametrize('box', [Series, Index, np.array, 'MultiIndex'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont' use a string, just directly put the lambda here. ideally we have NO if/else in test functions, sometimes its unavoidable, but making tests as simple as possible is the key

@pytest.mark.parametrize('append, index_name', [(True, None),
(True, 'B'), (True, 'test'), (False, None)])
@pytest.mark.parametrize('drop', [True, False])
def test_set_index_pass_single_array(self, drop, append, index_name, box):
df = self.dummy.copy()
df.index.name = index_name

# update constructor in case of MultiIndex
box = ((lambda x: MultiIndex.from_arrays([x]))
if box == 'MultiIndex' else box)

key = box(df['B'])
# np.array and list "forget" the name of B
name = [None if box in [np.array, list] else 'B']
Expand All @@ -138,7 +139,8 @@ def test_set_index_pass_single_array(self, drop, append, index_name, box):
tm.assert_frame_equal(result, expected)

# also test index name if append=True (name is duplicate here for A & B)
@pytest.mark.parametrize('box', [Series, Index, np.array, list, mi])
@pytest.mark.parametrize('box', [Series, Index, np.array,
list, 'MultiIndex'])
@pytest.mark.parametrize('append, index_name',
[(True, None), (True, 'A'), (True, 'B'),
(True, 'test'), (False, None)])
Expand All @@ -147,6 +149,10 @@ def test_set_index_pass_arrays(self, drop, append, index_name, box):
df = self.dummy.copy()
df.index.name = index_name

# update constructor in case of MultiIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment

box = ((lambda x: MultiIndex.from_arrays([x]))
if box == 'MultiIndex' else box)

keys = ['A', box(df['B'])]
# np.array and list "forget" the name of B
names = ['A', None if box in [np.array, list] else 'B']
Expand All @@ -162,8 +168,10 @@ def test_set_index_pass_arrays(self, drop, append, index_name, box):
tm.assert_frame_equal(result, expected)

# also test index name if append=True (name is duplicate here for A)
@pytest.mark.parametrize('box1', [key, Series, Index, np.array, list, mi])
@pytest.mark.parametrize('box2', [key, Series, Index, np.array, list, mi])
@pytest.mark.parametrize('box1', ['label', Series, Index, np.array,
list, 'MultiIndex'])
@pytest.mark.parametrize('box2', ['label', Series, Index, np.array,
list, 'MultiIndex'])
@pytest.mark.parametrize('append, index_name', [(True, None),
(True, 'A'), (True, 'test'), (False, None)])
@pytest.mark.parametrize('drop', [True, False])
Expand All @@ -172,7 +180,15 @@ def test_set_index_pass_arrays_duplicate(self, drop, append, index_name,
df = self.dummy.copy()
df.index.name = index_name

keys = [box1(df['A']), box2(df['A'])]
# transform strings to correct box constructor
def rebox(x):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not good at all, pls don't do this inside the test function, just use a lambda in the box itself

if x == 'label':
return lambda x: x.name
elif x == 'MultiIndex':
return lambda x: MultiIndex.from_arrays([x])
return x

keys = [rebox(box1)(df['A']), rebox(box2)(df['A'])]

# == gives ambiguous Boolean for Series
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? was this here before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I said

strongly parametrized test_set_index_pass_arrays and test_set_index_duplicate_names, including several combinations and cases that were not tested before

This is (essentially) a very beefed-up version of test_set_index_duplicate_names. It test appending duplicate arrays in various forms (and with various kwargs, e.g. against drop), and tests for the error of passing duplicate column keys directly.

if keys[0] is 'A' and keys[1] is 'A':
Expand Down