Description
Is your feature request related to a problem?
I want to be able to merge two DataFrames, but keep the index of the left one in the final result:
>>> import pandas as pd
>>> import string
>>> df1 = pd.DataFrame({"a": range(5), "b": range(10, 15)}, index=list(string.ascii_lowercase[:5]))
>>> df2 = pd.DataFrame({"a": range(5), "c": list(string.ascii_uppercase[:5])})
>>> df1
   a   b
a  0  10
b  1  11
c  2  12
d  3  13
e  4  14
>>> df2
   a  c
0  0  A
1  1  B
2  2  C
3  3  D
4  4  E
The current merge behaviour is to drop the index entirely and return a default integer index:
>>> df1.merge(df2, on="a")
   a   b  c
0  0  10  A
1  1  11  B
2  2  12  C
3  3  13  D
4  4  14  E
Describe the solution you'd like
We add a new parameter preserve_index to merge, which takes either "left", "right", or None:
DataFrame.merge(preserve_index="left")
In my above example, this would work like:
>>> df1.merge(df2, on="a", preserve_index="left")
   a   b  c
a  0  10  A
b  1  11  B
c  2  12  C
d  3  13  D
e  4  14  E
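To make the intended semantics concrete, here is a rough sketch of a wrapper that emulates the proposed parameter on top of the existing API. The function name merge_preserving_index and the temporary column name are hypothetical and not part of pandas. The sketch assumes a single-level index on the preserved side and no existing column with the temporary name.

```python
import string

import pandas as pd


def merge_preserving_index(left, right, preserve_index=None, **merge_kwargs):
    """Hypothetical helper emulating the proposed ``preserve_index`` option."""
    if preserve_index is None:
        # Same as today's behaviour.
        return left.merge(right, **merge_kwargs)
    if preserve_index not in ("left", "right"):
        raise ValueError('preserve_index must be "left", "right", or None')

    source = left if preserve_index == "left" else right
    tmp = "__preserved_index__"  # assumed not to clash with a real column
    original_name = source.index.name

    # Carry the index through the merge as an ordinary column.
    carried = source.rename_axis(tmp).reset_index()
    if preserve_index == "left":
        merged = carried.merge(right, **merge_kwargs)
    else:
        merged = left.merge(carried, **merge_kwargs)

    # Turn the carried column back into the index and restore its name.
    merged = merged.set_index(tmp)
    merged.index.name = original_name
    return merged


df1 = pd.DataFrame(
    {"a": range(5), "b": range(10, 15)},
    index=list(string.ascii_lowercase[:5]),
)
df2 = pd.DataFrame({"a": range(5), "c": list(string.ascii_uppercase[:5])})

result = merge_preserving_index(df1, df2, preserve_index="left", on="a")
print(result)
```

With a one-to-many match, an index label would simply repeat for each matching row, which seems like the natural behaviour for the proposed parameter too.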
API breaking implications
None. This is a new parameter, and if it is not provided the API is identical.
Describe alternatives you've considered
It is already possible to work around this by resetting the index, merging, and then setting the original index again, as described here, but this is:
- More verbose
- Not intuitive or clear to users (hence the StackOverflow question's popularity)
- Probably less efficient
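For reference, the workaround on the example frames looks roughly like this. It relies on reset_index naming the new column "index" when the index is unnamed, so it would need adjusting if df1 already had a column called "index" or a named index.

```python
import string

import pandas as pd

df1 = pd.DataFrame(
    {"a": range(5), "b": range(10, 15)},
    index=list(string.ascii_lowercase[:5]),
)
df2 = pd.DataFrame({"a": range(5), "c": list(string.ascii_uppercase[:5])})

# Move the index into a column, merge, then move it back.
merged = df1.reset_index().merge(df2, on="a").set_index("index")
# set_index leaves the name "index" on the axis; clear it to match df1.
merged.index.name = None
print(merged)
```

That is three chained calls plus a name fix-up for what is conceptually a single merge.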