Skip to content

ERR: warning on merging on unequal levels for an Index #13094

Open
@l736x

Description

@l736x

I found what seems to me a behavior that might be wrong when joining a simple index df with a multiindex df.
I'm admittedly not an SQL expert and I'm not sure this is really a bug.
Consider the case:

X = pd.DataFrame([[2, 3], [5, 7]], columns=['a','p']).set_index('a')
Y = pd.DataFrame([[1, 2, 3], [3, 4, 8], [5, 6, 9]],
                 columns=['a','b','c']).set_index(['a','b'])

X
#    p
# a
#2  3
#5  7

Y
#     c
# a b
#1 2  3
#3 4  8
#5 6  9

X.join(Y, how='left')
#      p  c
# a b
#5 6  7  9

I see the rational for not returning the line indexed by a=2 in X: since there is no value for the level b to associate to this line it is discarded.
But I'm wondering if this output would not be also reasonable:

      p  c
a b
2 Nan 3 Nan
5 6   7    9

In SQL the result of (where I replace X by X.reset_index() and similarly for Y):

select *
from X, Y
where Y.a = X.a (+)

would contain the line

a   b   p   c 
2 Nan   3 Nan

Any thought?

For completeness, the same issue is present for how='outer'.

Metadata

Metadata

Assignees

Labels

BugError ReportingIncorrect or improved errors from pandasReshapingConcat, Merge/Join, Stack/Unstack, Explode

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions