ERR: warning on merging on unequal levels for an Index #13094

Open
l736x opened this issue May 5, 2016 · 6 comments
Labels
Bug Error Reporting Incorrect or improved errors from pandas Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@l736x
Contributor

l736x commented May 5, 2016

I found behavior that seems wrong to me when joining a DataFrame with a simple index to one with a MultiIndex.
I'm admittedly not an SQL expert, and I'm not sure this is really a bug.
Consider the case:

import pandas as pd

X = pd.DataFrame([[2, 3], [5, 7]], columns=['a', 'p']).set_index('a')
Y = pd.DataFrame([[1, 2, 3], [3, 4, 8], [5, 6, 9]],
                 columns=['a', 'b', 'c']).set_index(['a', 'b'])

X
#    p
# a
# 2  3
# 5  7

Y
#      c
# a b
# 1 2  3
# 3 4  8
# 5 6  9

X.join(Y, how='left')
#      p  c
# a b
# 5 6  7  9

I see the rationale for not returning the row indexed by a=2 in X: since there is no value for level b to associate with this row, it is discarded.
But I'm wondering whether this output would also be reasonable:

       p    c
a b
2 NaN  3  NaN
5 6    7    9

In SQL (where I replace X with X.reset_index(), and similarly for Y), the result of

select *
from X, Y
where X.a = Y.a (+)

would contain the line

a   b   p   c
2 NaN   3 NaN

Any thoughts?

For completeness, the same issue is present for how='outer'.

@jreback
Contributor

jreback commented May 5, 2016

Join only merges on matching index levels, and you only have a single level that matches. This is as expected and well defined. I suppose it's possible

In [12]: pd.merge(X.reset_index(), Y.reset_index(), on='a').set_index(['a','b'])
Out[12]: 
     p  c
a b      
5 6  7  9

I think this should actually trigger the warning that was added in 0.18.1, see here
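
For comparison, here is a minimal sketch of the same reset_index/merge approach with how='left' added, which is one way to get the SQL-style result the original post asks about. It is only an illustration under that assumption, not a change to what join itself does; the name left_joined is made up for this example.

```python
import pandas as pd

X = pd.DataFrame([[2, 3], [5, 7]], columns=['a', 'p']).set_index('a')
Y = pd.DataFrame([[1, 2, 3], [3, 4, 8], [5, 6, 9]],
                 columns=['a', 'b', 'c']).set_index(['a', 'b'])

# Flatten both indexes, left-merge on the shared key 'a', then rebuild
# the two-level index. Rows of X with no match in Y get NaN for b and c.
left_joined = (
    pd.merge(X.reset_index(), Y.reset_index(), on='a', how='left')
    .set_index(['a', 'b'])
)
print(left_joined)
```

The a=2 row from X should survive here with missing values for b and c, alongside the matched a=5 row.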

@jreback
Contributor

jreback commented May 5, 2016

cc @nbonnotte

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label May 5, 2016
@jreback
Contributor

jreback commented May 5, 2016

Actually the warning IS buggy. Want to do a PR to fix it? (The behavior IS correct, but users should be warned that they are not merging on all levels.)
https://github.com/pydata/pandas/pull/12219/files#r62213837

@jreback jreback added Bug Difficulty Novice Error Reporting Incorrect or improved errors from pandas labels May 5, 2016
@jreback jreback added this to the 0.18.2 milestone May 5, 2016
@jreback jreback changed the title Is join working as expected with multiindexes ERR: warning on merging on unequal levels for an Index May 5, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 18, 2016
@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@SeeminSyed

take

@SeeminSyed

@jreback I know this issue is a few years old but could I get some clarification on when the warning is supposed to trigger or what the bug in said warning refers to?

When you say the levels are unequal, what do you mean? To my understanding, the cases above should not trigger the warning at all.

@SeeminSyed

We have assumed that the issue relates to index.nlevels and have added a check that compares the left and right index nlevels and triggers the warning when they differ.
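
A rough sketch of what such an nlevels comparison could look like, written as a standalone function rather than as the code in the referenced commit. The helper name warn_on_unequal_index_levels and the warning text are hypothetical and are not part of pandas.

```python
import warnings

import pandas as pd


def warn_on_unequal_index_levels(left, right):
    """Hypothetical helper (not a pandas API): warn when the two
    frames' indexes have different numbers of levels."""
    left_n, right_n = left.index.nlevels, right.index.nlevels
    if left_n != right_n:
        warnings.warn(
            "joining on indexes with different numbers of levels "
            f"({left_n} on the left, {right_n} on the right)",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


X = pd.DataFrame([[2, 3], [5, 7]], columns=['a', 'p']).set_index('a')
Y = pd.DataFrame([[1, 2, 3], [3, 4, 8], [5, 6, 9]],
                 columns=['a', 'b', 'c']).set_index(['a', 'b'])

# X has a one-level index and Y a two-level index, so this call warns.
levels_differ = warn_on_unequal_index_levels(X, Y)
```

With the X and Y from the original report, the one-level index on the left and the two-level index on the right differ, so the check fires; two frames with the same number of index levels pass silently.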

abeldeb added a commit to CSCD01-team01/pandas that referenced this issue Mar 11, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022