This repository was archived by the owner on Apr 10, 2024. It is now read-only.

Unified merge API #31

Open
@chrisaycock

Description


We have merge() and merge_asof(). There may even come a time when we perform functions on overlapping columns. As someone who wants to join two tables together, I just want a single mechanism to do so.

I wonder if it's possible to have a single API like:

merge(
    left,     # DataFrame or Table
    right,    # DataFrame or Table
    on,       # one or more columns
    asof,     # one or more columns
    how,      # 'left', 'right', 'inner', 'outer'
    overlap,  # optional function to apply to overlapping column names
)

Users must specify at least one of on or asof. There can also be left_on/right_on and left_asof/right_asof. We could even have left_index/right_index for the poor souls who still have indexed data (#17).
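To make the on/asof rule concrete, here is a rough sketch of how such a dispatcher could sit on top of the existing pandas functions. The name `unified_merge` is hypothetical, and the sketch makes simplifying assumptions: a single `asof` column (since `pd.merge_asof` accepts only one `on` key), exact-match columns passed through as `by`, and only left-style joins on the asof path. It is an illustration of the idea, not a proposed implementation.

```python
import pandas as pd


def unified_merge(left, right, on=None, asof=None, how=None):
    """Hypothetical single entry point dispatching to pd.merge / pd.merge_asof."""
    if on is None and asof is None:
        raise ValueError("specify at least one of 'on' or 'asof'")

    if asof is None:
        # Exact-match join only: defer to the ordinary merge.
        return pd.merge(left, right, on=on, how=how or "inner")

    # Assumption: pd.merge_asof performs a left-style join, so other
    # values of `how` are rejected in this sketch.
    if how not in (None, "left"):
        raise NotImplementedError("asof joins are left joins in this sketch")

    # Exact-match columns (if any) become `by`; the asof column becomes `on`.
    # Both frames must already be sorted by the asof column.
    return pd.merge_asof(left, right, on=asof, by=on)


trades = pd.DataFrame(
    {"time": [1, 5, 10], "ticker": ["A", "A", "A"], "price": [10.0, 11.0, 12.0]}
)
quotes = pd.DataFrame(
    {"time": [2, 3, 7], "ticker": ["A", "A", "A"], "bid": [9.5, 9.6, 10.5]}
)

# Each trade picks up the most recent quote at or before its time.
joined = unified_merge(trades, quotes, on="ticker", asof="time")
```

The left_on/right_on and left_asof/right_asof variants would presumably map onto the corresponding `left_on`/`right_on` and `left_by`/`right_by` arguments in the same way.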

The overlap is for when the same column name appears in both tables. Currently those columns are renamed with a suffix (though I'd be in favor of just raising an error). But there are times when I want to apply a function instead. There are ways to do this with arithmetic operations (#30), though I think any function with two arguments would be nice, including overwriting the left with the right (for handling cases of missing data with a "fill" result).
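As a sketch of what the overlap argument might do, the following emulates it with today's `pd.merge`: merge with temporary suffixes, then collapse each pair of overlapping columns with a two-argument function. The helper name `merge_with_overlap` and the `_left`/`_right` suffixes are made up for illustration, and the sketch assumes those suffixed names don't collide with existing columns.

```python
import pandas as pd


def merge_with_overlap(left, right, on, overlap, how="inner"):
    """Hypothetical helper: combine same-named non-key columns with `overlap`."""
    keys = [on] if isinstance(on, str) else list(on)
    # Columns present in both tables that are not join keys.
    shared = [c for c in left.columns if c in right.columns and c not in keys]

    merged = pd.merge(left, right, on=keys, how=how, suffixes=("_left", "_right"))
    for col in shared:
        # Replace the suffixed pair with a single combined column.
        left_col = merged.pop(f"{col}_left")
        right_col = merged.pop(f"{col}_right")
        merged[col] = overlap(left_col, right_col)
    return merged


a = pd.DataFrame({"id": [1, 2, 3], "v": [1.0, None, 3.0]})
b = pd.DataFrame({"id": [1, 2, 3], "v": [10.0, 20.0, None]})

# "Fill" behaviour: the right value wins, falling back to the left when missing.
filled = merge_with_overlap(a, b, on="id", overlap=lambda l, r: r.combine_first(l))

# Arithmetic behaviour: add the two columns, treating a missing value as 0.
summed = merge_with_overlap(a, b, on="id", overlap=lambda l, r: l.add(r, fill_value=0))
```

Raising an error on overlap, as suggested above, would just be another choice of `overlap` function (or the default when none is given).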

Note that this doesn't handle my proposed merge_window() (pandas-dev/pandas#13959). The semantics there are very specific and I'm not sure how to fit that into a unified structure like the one above, though I'd love to hear any ideas.
