Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
As a pandas user, I'd like merge_asof to behave as follows.
Right now it requires the entire DataFrame to be sorted on the key, but that looks unnecessary when the by option is used.
Let me explain with an example.
Suppose we have two DataFrames:
main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})
  id  tracking
0  1         1
1  1         4
2  1         7
3  2         1
4  2         5
measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})
  id  position  value
0  1         2    100
1  1         5    150
And we want to use merge_asof to join them:
pd.merge_asof(
left=main_df,
right=measurements,
by='id',
left_on='tracking',
right_on='position',
direction='nearest')
Since the left DataFrame is not sorted on tracking, we get an error:
ValueError: left keys must be sorted
So we need to sort the left DataFrame first:
pd.merge_asof(
left=main_df.sort_values('tracking'),
right=measurements,
by='id',
left_on='tracking',
right_on='position',
direction='nearest'
)
But the row order of the result is less obvious than the original order, in which each segment with a given id was sorted independently:
  id  tracking  position  value
0  1         1       2.0  100.0
1  2         1       NaN    NaN
2  1         4       5.0  150.0
3  2         5       NaN    NaN
4  1         7       5.0  150.0
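For reference, the original row order can be recovered today by remembering each row's position before sorting and sorting back after the merge. This is only a rough sketch of a workaround, not the requested feature; the helper column name _order is made up for illustration.

```python
import pandas as pd

main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})
measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})

# '_order' is a hypothetical helper column that remembers the original row position.
left = main_df.assign(_order=range(len(main_df))).sort_values('tracking', kind='stable')

merged = pd.merge_asof(
    left=left,
    right=measurements,
    by='id',
    left_on='tracking',
    right_on='position',
    direction='nearest',
)

# Put the rows back in their original order and drop the helper column.
restored = merged.sort_values('_order').drop(columns='_order').reset_index(drop=True)
print(restored)
```

This works, but it costs two extra sorts and an extra column that the by option should make unnecessary.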
Feature Description
It would be nice if pandas required the keys to be sorted only within each segment that the by argument of merge_asof defines.
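To make the request concrete, here is a minimal user-side sketch of the behavior I have in mind. The function merge_asof_by_segment is hypothetical (it is not part of pandas), it assumes by is a single column name, and it assumes the right DataFrame is already sorted on right_on. It calls merge_asof once per segment, so the sortedness check applies to each segment separately.

```python
import pandas as pd

def merge_asof_by_segment(left, right, by, left_on, right_on, **kwargs):
    """Hypothetical helper: run merge_asof separately on each `by` segment."""
    pieces = []
    # sort=False keeps the segments in order of first appearance in `left`.
    for _, segment in left.groupby(by, sort=False):
        # merge_asof still raises if this one segment is not sorted on left_on.
        pieces.append(
            pd.merge_asof(segment, right, by=by,
                          left_on=left_on, right_on=right_on, **kwargs)
        )
    return pd.concat(pieces, ignore_index=True)

main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})
measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})

# No global sort of main_df is needed: each id segment is already sorted.
result = merge_asof_by_segment(
    main_df, measurements,
    by='id', left_on='tracking', right_on='position',
    direction='nearest',
)
print(result)
```

A Python-level loop like this is likely slow when there are many segments, which is why built-in support would be preferable.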
Alternative Solutions
Haven't seen any alternatives.
Additional Context
No response