New feature/enhancement request: Merge dataframes where one value is between two others #10309
Description
In contrast to SQL, it is not currently possible to merge dataframes with inequality constraints. Could this be added?
As an example (which I previously posted on StackOverflow), I need to merge two pandas dataframes on an identifier and a condition where a date in one dataframe is between two dates in the other dataframe.
Dataframe A has a date ("fdate") and an ID ("cusip"):
I need to merge this with this dataframe B:
In SQL this would be trivial, but the only way I can see how to do this in pandas is to first merge unconditionally on the identifier, and then filter on the date condition:
df = pd.merge(A, B, how='inner', left_on='cusip', right_on='ncusip')
df = df[(df['fdate']>=df['namedt']) & (df['fdate']<=df['nameenddt'])]
The problem with this approach is that after the merge, but before the filtering step, df
can become unnecessarily large. Presumably this could be avoided if the filtering were done as part of the merge process.