Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes SparseSeries initiated with dictionary raising AttributeError #16960

Merged
merged 8 commits into from
Jul 19, 2017
Merged
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.21.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,7 @@ Groupby/Resample/Rolling

Sparse
^^^^^^

- Bug in SparseSeries raises AttributeError when a dicitonary is passed as data (:issue:`16777`)
Copy link
Member

@gfyoung gfyoung Jul 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dicitonary --> dictionary
"passed as" --> "passed in as"



Reshaping
Expand Down
2 changes: 1 addition & 1 deletion pandas/core/sparse/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -146,10 +146,10 @@ def __init__(self, data=None, index=None, sparse_index=None, kind='block',
data = data._data

elif isinstance(data, (Series, dict)):
data = Series(data)
if index is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I think you might be able to remove this if index is None bizeness (and simply use data.index if needed below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see what if anything breaks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the full test suite with this change and everything is working. I also manually ran a bunch of SparseSeries with different inputs (Series, dicts, with/without indexes) and it does work as intended.

index = data.index.view()

data = Series(data)
res = make_sparse(data, kind=kind, fill_value=fill_value)
data, sparse_index, fill_value = res

Expand Down
9 changes: 9 additions & 0 deletions pandas/tests/sparse/test_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,12 @@ def setup_method(self, method):
fill_value=0)
self.ziseries2 = SparseSeries(arr, index=index, kind='integer',
fill_value=0)
def test_constructor_data_input(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to test_constructor_dict_input and move your "see" comment to right below the definition.

arr = SparseSeries({1: 1})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reference the issue right above this line (e.g. "# see gh-16905)"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice how you use {1: 1} multiple times. Why don't we refactor that as a variable and then use that variable instead throughout the tests. The same goes for your index=[0, 1, 2]. Set that as a variable.

assert arr.count() == len({1: 1})

arr = SparseSeries(pd.Series({1: 1}, index=[0, 1, 2]))
assert arr.count() == pd.Series({1: 1}, index=[0, 1, 2]).count()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would separate out your pd.Series({1: 1}, index=[0, 1, 2]).count() as a variable like "expected" so that it's a little clearer what the expected output is.


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay : in light of what you just said, I think you can condense this test a bit and rename as follows:

def test_constructor_dict_input(self):
...
series = pd.Series(constructor_dict, index=index)
expected = SparseSeries(series, index=index)

result = SparseSeries(constructor_dict)
tm.assert_sp_series_equal(result, expected)

And delete everything else below it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. I added this test and did find a case that was not handled correctly. I fixed the code and added tests for all options of SparseSeries creation.

def test_constructor_dtype(self):
arr = SparseSeries([np.nan, 1, 2, np.nan])
Expand All @@ -109,6 +115,9 @@ def test_constructor_dtype(self):
assert arr.dtype == np.int64
assert arr.fill_value == 0

arr = SparseSeries({1: 1})
assert arr.dtype == np.int64

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second look, I don't think you need this. You can remove.

def test_iteration_and_str(self):
[x for x in self.bseries]
str(self.bseries)
Expand Down