-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
Fixes SparseSeries initiated with dictionary raising AttributeError #16960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
metllord
commented
Jul 15, 2017
- closes BUG: SparseSeries from dict inconsistency #16905
- tests added / passed
- whatsnew entry
|
Hello @metllord! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 17, 2017 at 14:18 Hours UTC |
doc/source/whatsnew/v0.21.0.txt
Outdated
| Sparse | ||
| ^^^^^^ | ||
|
|
||
| - Bug in SparseSeries raises AttributeError when a dicitonary is passed as data (:issue:`16777`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dicitonary --> dictionary
"passed as" --> "passed in as"
pandas/tests/sparse/test_series.py
Outdated
| self.ziseries2 = SparseSeries(arr, index=index, kind='integer', | ||
| fill_value=0) | ||
| def test_constructor_data_input(self): | ||
| arr = SparseSeries({1: 1}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reference the issue right above this line (e.g. "# see gh-16905)"?
pandas/tests/sparse/test_series.py
Outdated
| self.ziseries2 = SparseSeries(arr, index=index, kind='integer', | ||
| fill_value=0) | ||
| def test_constructor_data_input(self): | ||
| arr = SparseSeries({1: 1}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice how you use {1: 1} multiple times. Why don't we refactor that as a variable and then use that variable instead throughout the tests. The same goes for your index=[0, 1, 2]. Set that as a variable.
pandas/tests/sparse/test_series.py
Outdated
| assert arr.count() == len({1: 1}) | ||
|
|
||
| arr = SparseSeries(pd.Series({1: 1}, index=[0, 1, 2])) | ||
| assert arr.count() == pd.Series({1: 1}, index=[0, 1, 2]).count() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would separate out your pd.Series({1: 1}, index=[0, 1, 2]).count() as a variable like "expected" so that it's a little clearer what the expected output is.
Codecov Report
@@ Coverage Diff @@
## master #16960 +/- ##
==========================================
- Coverage 90.98% 90.96% -0.02%
==========================================
Files 161 161
Lines 49288 49288
==========================================
- Hits 44846 44837 -9
- Misses 4442 4451 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #16960 +/- ##
==========================================
- Coverage 90.98% 90.97% -0.02%
==========================================
Files 161 161
Lines 49288 49292 +4
==========================================
- Hits 44846 44844 -2
- Misses 4442 4448 +6
Continue to review full report at Codecov.
|
pandas/tests/sparse/test_series.py
Outdated
| # see gh-16905 | ||
| test_dict = {1: 1} | ||
| test_index = [0, 1, 2] | ||
| test_series = pd.Series({1: 1}, index=test_index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace the {1: 1} with test_dict.
doc/source/whatsnew/v0.21.0.txt
Outdated
| Sparse | ||
| ^^^^^^ | ||
|
|
||
| - Bug in SparseSeries raises AttributeError when a dictionary is passed in as data (:issue:`16777`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add double ticks ("``") around both SparseSeries and `AttributeError`?
pandas/tests/sparse/test_series.py
Outdated
| arr = SparseSeries(test_dict) | ||
| assert arr.count() == len(test_dict) | ||
|
|
||
| arr = SparseSeries(test_series, index=test_index)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops: got an extra right parenthesis there
|
@metllord : Ping when green, and we will merge this! |
pandas/tests/sparse/test_series.py
Outdated
| assert arr.count() == len(test_dict) | ||
|
|
||
| arr = SparseSeries(test_series, index=test_index) | ||
| assert arr.count() == test_series.count() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use assert_sp_series_equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to use assert_series_equal since assert_sp_series_equal requires two sparse series, even if check_series_type=False.
This appears to result from line 1523 in util.testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you should do is construct your expected result via instantiation as a Series (i.e. pass the dict into the Series constructor). Then pass the Series object into the SparseSeries constructor. That becomes your expected output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second test is testing the creation of a SparseSeries from a Series. The SparseSeries constructor takes either a dict or a Series. Aside from class, the SparseSeries should be equal to the original Series.
pandas/tests/sparse/test_series.py
Outdated
| def test_constructor_data_input(self): | ||
| # see gh-16905 | ||
| test_dict = {1: 1} | ||
| test_index = [0, 1, 2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't name things with test_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would _test_dict, etc work? I saw that higher up in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @jreback is saying to come up with different names that don't use "test" at all.
pandas/tests/sparse/test_series.py
Outdated
| test_index = [0, 1, 2] | ||
| test_series = pd.Series(test_dict, index=test_index) | ||
|
|
||
| arr = SparseSeries(test_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result =
pandas/tests/sparse/test_series.py
Outdated
|
|
||
| arr = SparseSeries({1: 1}) | ||
| assert arr.dtype == np.int64 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second look, I don't think you need this. You can remove.
|
|
||
| result = SparseSeries(series, index=index) | ||
| tm.assert_series_equal(result, series, check_series_type=False) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay : in light of what you just said, I think you can condense this test a bit and rename as follows:
def test_constructor_dict_input(self):
...
series = pd.Series(constructor_dict, index=index)
expected = SparseSeries(series, index=index)
result = SparseSeries(constructor_dict)
tm.assert_sp_series_equal(result, expected)And delete everything else below it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I added this test and did find a case that was not handled correctly. I fixed the code and added tests for all options of SparseSeries creation.
pandas/core/sparse/series.py
Outdated
|
|
||
| elif isinstance(data, (Series, dict)): | ||
| data = Series(data, index=index) | ||
| if index is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I think you might be able to remove this if index is None bizeness (and simply use data.index if needed below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see what if anything breaks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the full test suite with this change and everything is working. I also manually ran a bunch of SparseSeries with different inputs (Series, dicts, with/without indexes) and it does work as intended.
pandas/tests/sparse/test_series.py
Outdated
| self.ziseries2 = SparseSeries(arr, index=index, kind='integer', | ||
| fill_value=0) | ||
|
|
||
| def test_constructor_data_input(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to test_constructor_dict_input and move your "see" comment to right below the definition.
| result = SparseSeries(constructor_dict) | ||
| tm.assert_sp_series_equal(result, expected) | ||
|
|
||
| # Series with index and dictionary with no index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this being tested? The result variable is the same as in line 98.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was testing input with a series, but that is test turned out to be redundant. I removed it from the most recent commit.
|
@jreback if you have anything else to add here. Otherwise, LGTM! |
|
this duplicates #16906 |
|
thanks @metllord |