ENH: Do not require to sort entire DF if by option used in merge_asof #49816

Open
@filippzorin

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

As a pandas user, I would like merge_asof to behave as follows.
Right now it requires the entire DataFrame to be sorted by the join key, but that looks unnecessary when the by option is used.
Let me explain with an example.
Suppose we have two DataFrames:

import pandas as pd

main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})

  id  tracking
0  1         1
1  1         4
2  1         7
3  2         1
4  2         5

measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})

  id  position  value
0  1         2    100
1  1         5    150

We want to use merge_asof to join them:

pd.merge_asof(
    left=main_df, 
    right=measurements, 
    by='id', 
    left_on='tracking', 
    right_on='position', 
    direction='nearest')

Since the left DataFrame is not sorted by tracking as a whole, this raises an error:
ValueError: left keys must be sorted
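
To make the situation concrete, here is a minimal sketch (not part of pandas' own checks) showing that tracking is unsorted across the whole frame but already sorted within each id group. The variable names globally_sorted and sorted_within_groups are only illustrative.

```python
import pandas as pd

main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})

# The key column as a whole is not sorted, which is what merge_asof checks.
globally_sorted = bool(main_df['tracking'].is_monotonic_increasing)

# Within every `id` group, however, the key is already in increasing order.
sorted_within_groups = bool(
    main_df.groupby('id')['tracking']
    .apply(lambda s: s.is_monotonic_increasing)
    .all()
)

print(globally_sorted, sorted_within_groups)
```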

So we need to sort the left DataFrame first:

pd.merge_asof(
    left=main_df.sort_values('tracking'),
    right=measurements,
    by='id',
    left_on='tracking',
    right_on='position',
    direction='nearest'
)

But the row order of the result is less obvious than the original order, in which each segment with a given id was sorted independently:

  id  tracking  position  value
0  1         1       2.0  100.0
1  2         1       NaN    NaN
2  1         4       5.0  150.0
3  2         5       NaN    NaN
4  1         7       5.0  150.0
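
One possible way to get the original row order back today is to record each row's position before sorting and re-sort on it after the merge. This is only a sketch of a workaround, not the requested feature; the helper column name _order is made up for illustration.

```python
import pandas as pd

main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})
measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})

# Remember the original row position in a temporary (hypothetical) column.
left = main_df.assign(_order=range(len(main_df)))

merged = pd.merge_asof(
    left=left.sort_values('tracking', kind='stable'),
    right=measurements.sort_values('position'),
    by='id',
    left_on='tracking',
    right_on='position',
    direction='nearest',
)

# Restore the original order and drop the temporary column.
restored = (
    merged.sort_values('_order')
    .drop(columns='_order')
    .reset_index(drop=True)
)
print(restored)
```

The cost is one extra full sort of the left frame plus a second sort of the result, which is exactly the overhead this issue would like to avoid.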

Feature Description

It would be nice if pandas required sorting only within each segment defined by the by argument of merge_asof.
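
The requested behavior can be roughly emulated in user code by calling merge_asof once per by group. The function merge_asof_per_group below is a hypothetical helper, not a pandas API, and it assumes by is a single column name and that the left key is already sorted within each group.

```python
import pandas as pd


def merge_asof_per_group(left, right, by, left_on, right_on, **kwargs):
    """Hypothetical helper: run merge_asof separately for each `by` group.

    Assumes `by` is a single column name and that `left_on` is already
    sorted within each group; merge_asof raises otherwise.
    """
    right_sorted = right.sort_values(right_on)
    pieces = []
    # sort=False keeps the groups in order of first appearance.
    for _, left_group in left.groupby(by, sort=False):
        pieces.append(
            pd.merge_asof(
                left=left_group,
                right=right_sorted,
                by=by,
                left_on=left_on,
                right_on=right_on,
                **kwargs,
            )
        )
    return pd.concat(pieces, ignore_index=True)


main_df = pd.DataFrame({'id': ['1', '1', '1', '2', '2'], 'tracking': [1, 4, 7, 1, 5]})
measurements = pd.DataFrame({'id': ['1', '1'], 'position': [2, 5], 'value': [100, 150]})

result = merge_asof_per_group(
    main_df, measurements,
    by='id', left_on='tracking', right_on='position', direction='nearest',
)
print(result)
```

This keeps rows grouped by id in their original order, but it performs one merge per group, so it is likely to be slow when there are many groups.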

Alternative Solutions

Haven't seen any alternatives.

Additional Context

No response

Labels: Enhancement; Needs Discussion (requires discussion from core team before further action); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)