-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: _flex_binary_moment() doesn't preserve column order or handle multiple columns with the same label #7738
BUG: _flex_binary_moment() doesn't preserve column order or handle multiple columns with the same label #7738
Conversation
I left the following unchanged for two distinct DataFrames with pairwise=False, as it's not obvious to me what the desired behavior is in this case.
|
results[i][j] = f(*_prep_binary(arg1[k1], arg2[k2])) | ||
p = Panel.from_dict(results).swapaxes('items', 'major') | ||
p.major_axis = arg1.columns[p.major_axis] | ||
p.minor_axis = arg2.columns[p.minor_axis] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reindex instead here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it correctly, reindex() will rearrange existing rows/columns (labels and values), whereas what I'm doing here is renaming the row/columns without changing the values. Note that I construct results
to be indexed by column index (rather than label), so that the order is maintained, and then I simply apply the labels without touching the values. I think if I use reindex() here I will just end up with all NaNs.
I updated the code to properly deal with multiple columns with the same label (except for the case of two distinct DataFrames and pairwise=False, where it's not obvious to me what the desired behavior would be). |
@@ -873,6 +873,122 @@ def test_expanding_corr_pairwise_diff_length(self): | |||
assert_frame_equal(result3, expected) | |||
assert_frame_equal(result4, expected) | |||
|
|||
def test_pairwise_stats_column_names_order(self): | |||
# GH 7542 | |||
df1_columns_ordered = DataFrame([[2,4],[1,2],[5,2],[8,1]], columns=[0,1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the ordering (in the current impl) has to do with the dtype mix. So can you do tests with say mixed int/floats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added additional tests. Let me know if that's not what you had in mind.
need a release note for v0.15.0 in bug fixes |
@@ -278,7 +278,10 @@ Bug Fixes | |||
- Bug in ``DataFrame.plot`` with ``subplots=True`` may draw unnecessary minor xticks and yticks (:issue:`7801`) | |||
- Bug in ``StataReader`` which did not read variable labels in 117 files due to difference between Stata documentation and implementation (:issue:`7816`) | |||
|
|||
|
|||
- Bug in ``expanding_cov``, ``expanding_corr``, ``rolling_cov``, ``rolling_cov``, ``ewmcov``, and ``ewmcorr`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
say fixes the issue where...... and non-unique columns (instead of multiple columsn with the same name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Better now?
can't use a context manager under. 2.6 like that iirc just create a function |
How do I test the message in the ValueError? The following works for me on 3.4, but per https://docs.python.org/2.6/search.html?q=assertRaisesRegex&check_keywords=yes&area=default I suspect it doesn't work on 2.6.
|
their are lots of examples in the code, e.g. : https://github.com/pydata/pandas/blob/master/pandas/tests/test_index.py#L52 (and you can use a context manger). don't use the hmm...maybe should fix that |
OK, updated to use |
@seth-p I added them to the |
... |
ok, looks good....ping when green |
One of the Travis builds failed for no discernible-to-me reason: https://travis-ci.org/pydata/pandas/jobs/30798802.
|
sometimes this happens np (Travis is flaky ) |
Anything else I need to do, then? |
BUG: _flex_binary_moment() doesn't preserve column order or handle multiple columns with the same label
thanks! |
Closes #7542.