
ENH: Allow join based on conditional statement #34543

Open

wpdonders opened this issue Jun 3, 2020 · 5 comments
Labels
Enhancement, Reshaping

Comments

@wpdonders

In pandas, it is only possible to perform joins based on conditional statements that check whether the provided keys are equal. It would be great if pandas provided a way to perform joins based on some other logical condition. This is possible in many SQL engines and allows for elegant solutions to specific use cases.

Examples

Setup

df1 = pd.DataFrame({'col_a': [1,2,3], 'col_b': ["A", "B", "C"]})
print(df1)
   col_a col_b
0      1     A
1      2     B
2      3     C
df2 = pd.DataFrame({'col_a': [0, 2, 3], 'col_c': ["Z", "X", "Y"]})
print(df2)
   col_a col_c
0      0     Z
1      2     X
2      3     Y

Current behavior (equal joins)

Based on the conditional df1.col_a == df2.col_a

df_equal = pd.merge(left=df1, right=df2, on="col_a")
print(df_equal)
   col_a col_b col_c
0      2     B     X
1      3     C     Y

Desired behavior

Not-equal join

Based on the conditional df1.col_a != df2.col_a

   col_a col_b col_c
0      1     A     Z
1      1     A     X
2      1     A     Y
3      2     B     Z
4      2     B     Y
5      3     C     Z
6      3     C     X

Greater-than join

Based on the conditional df1.col_a > df2.col_a

   col_a col_b col_c
0      1     A     Z
1      2     B     Z
2      3     C     Z
3      3     C     X

And any other conditional based on logical operators.
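
For reference, these results can currently be emulated (inefficiently) with a cross join followed by a boolean filter. A minimal sketch using only the existing pandas API (the _right suffix is just a disambiguating choice):

# materialize the full Cartesian product, then filter
# (only viable for small frames)
crossed = pd.merge(left=df1, right=df2, how="cross", suffixes=("", "_right"))
df_not_equal = crossed[crossed["col_a"] != crossed["col_a_right"]].drop(columns="col_a_right")
df_greater = crossed[crossed["col_a"] > crossed["col_a_right"]].drop(columns="col_a_right")

This reproduces the tables above (up to row order); a native implementation could avoid building the full product.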

Some considerations

Currently, it is possible with pandas to join on multiple keys. It would be nice if one could provide multiple logical conditions when joining on multiple keys, and allow "mixing" of the logical conditions (i.e. join on df1.col_a != df2.col_a & df1.col_b == df2.col_c). Note also that the second conditional is applied to columns of different names (i.e. col_b of df1 and col_c of df2).

Real-world scenario

Such joins can be very useful for dataframes that store datetime-like objects. Consider a dataframe df_process that contains process_id, process_start_date and process_end_date columns for some business process, and a second dataframe df_events that contains event_id and event_date columns for particular events. If you want to find all events that occur during some business process, you can easily obtain them by applying the conditional join statement:
df_events.event_date >= df_process.process_start_date & df_events.event_date <= df_process.process_end_date
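
A workaround that exists today is to build a pd.IntervalIndex over the process windows and look each event up in it. A minimal sketch with made-up data, assuming the process windows do not overlap (IntervalIndex.get_indexer requires unique, non-overlapping intervals):

import pandas as pd

df_process = pd.DataFrame({
    "process_id": [1, 2],
    "process_start_date": pd.to_datetime(["2020-01-01", "2020-02-01"]),
    "process_end_date": pd.to_datetime(["2020-01-10", "2020-02-10"]),
})
df_events = pd.DataFrame({
    "event_id": [10, 11, 12],
    "event_date": pd.to_datetime(["2020-01-05", "2020-02-03", "2020-03-01"]),
})

# one interval per process window, inclusive on both ends
windows = pd.IntervalIndex.from_arrays(
    df_process["process_start_date"], df_process["process_end_date"], closed="both"
)
# position of the window containing each event, or -1 if none
pos = windows.get_indexer(df_events["event_date"])
matched = df_events[pos != -1].assign(
    process_id=df_process["process_id"].to_numpy()[pos[pos != -1]]
)

This only covers the containment case with at most one match per event; a general conditional join in pd.merge would cover the rest.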

Implementation suggestion

I'm not familiar with how joins are performed in pandas at the moment, but it seems to me that some equality check must be performed to decide whether to join particular records from the dataframes involved. That means (if this is correct) that there is already some conditional built in that evaluates to True or False based on the equality (==) comparison. My request would amount to allowing a different conditional to be passed to this part of the pandas join machinery.

Hopefully it's clear what I am requesting!

@wpdonders added the Enhancement and Needs Triage labels Jun 3, 2020
@jreback
Contributor

jreback commented Jun 3, 2020

this is a duplicate of #7480

though I like your examples more

PRs to implement features are always welcome

@jreback added the Reshaping label and removed the Needs Triage label Jun 3, 2020
@jreback added this to the Contributions Welcome milestone Jun 3, 2020
@samukweku
Contributor

@wpdonders pyjanitor has a conditional_join function that covers non-equi joins. There is also a pending PR, implemented with numba, that brings a significant performance improvement; some benchmark results are listed in the PR.
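
A minimal sketch of the API on the OP's frames, where each condition is a (left_column, right_column, operator) tuple (see the benchmarks later in this thread for the full signature):

import pandas as pd
import janitor  # registers the conditional_join DataFrame method

df1 = pd.DataFrame({'col_a': [1, 2, 3], 'col_b': ["A", "B", "C"]})
df2 = pd.DataFrame({'col_a': [0, 2, 3], 'col_c': ["Z", "X", "Y"]})

# all row pairs where df1.col_a > df2.col_a
out = df1.conditional_join(df2, ('col_a', 'col_a', '>'))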

@mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@erfannariman
Member

erfannariman commented Jan 22, 2023

I don't think these examples make sense, since in real-world scenarios they would always be combined with an actual match on some key (id) and additionally with a condition.

To state in SQL terms:

select
    A.id,
    A.date_from,
    A.date_to,
    B.date_registry,
    B.value
from
    A
join
    B
    on A.id = B.id
    and B.date_registry >= A.date_from
    and B.date_registry < A.date_to;

Notice the join clause.

The example in the OP would only work on small datasets; otherwise the result would quickly explode. So I think for this issue (and potentially a PR), it still makes sense to have an on clause, with an additional condition parameter maybe?
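
For comparison, the closest equivalent with today's pandas API is an equi-join on the key followed by a filter (A and B here are hypothetical frames matching the SQL above):

out = (
    A.merge(B, on="id")                              # on A.id = B.id
     .query("date_from <= date_registry < date_to")  # the extra range conditions
)

The equi-join on id keeps the intermediate result bounded; only the range conditions are applied as a post-filter.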

@samukweku
Contributor

@erfannariman not all joins have an equality condition; there are scenarios where you don't have one, especially range joins. Admittedly, it would be faster if there was an equality condition. In fact, the OP provided a real-world scenario of range joins where there is no equality clause. This article from DuckDB goes a lot further to highlight the use cases, especially for time-series joins. So an implementation that is strictly non-equi is valuable as well.

I implemented the conditional_join function in pyjanitor, which covers non-equi joins and is more efficient than a naive Cartesian join. Depending on the data size, you can get more performance with numba. It would be great if this was implemented directly in pandas, with an API that integrates better with pd.merge, and an even faster implementation from the pandas developers.

Performance-wise it doesn't do badly compared with DuckDB:

In [1]: import pandas as pd; import numpy as np; import janitor as jn; import duckdb

# adapted from DuckDB's repo
In [2]: url = 'https://github.com/samukweku/data-wrangling-blog/raw/master/_notebooks/Data_files/results.csv'
   ...: events = pd.read_csv(url, parse_dates=['start', 'end']).iloc[:, 1:]
   ...: events.head()
Out[2]: 
   id     name  audience               start     sponsor                 end
0   1  Event 1      1178 2022-11-19 10:00:00   Sponsor 2 2022-11-19 10:15:00
1   2  Event 2      1446 2015-09-27 15:00:00  Sponsor 11 2015-09-27 15:11:00
2   3  Event 3      2261 2019-11-12 18:00:00  Sponsor 10 2019-11-12 18:53:00
3   4  Event 4      1471 2019-12-24 22:00:00   Sponsor 6 2019-12-24 22:11:00
4   5  Event 5      2605 2028-06-20 12:00:00   Sponsor 8 2028-06-20 12:31:00

In [3]: out = (events
   ...: .conditional_join(
   ...:     events,
   ...:     ('start', 'end', '<='),
   ...:     ('end', 'start', '>='),
   ...:     ('id', 'id', '!='),
   ...:     use_numba = False,
   ...:     df_columns = ['id', 'start', 'end'],
   ...:     right_columns = ['id', 'start', 'end'])
   ...: )

In [4]: out.head()
Out[4]: 
  left                                          right                                        
    id               start                 end     id               start                 end
0   10 1993-11-27 12:00:00 1993-11-27 12:37:00   2345 1993-11-27 10:00:00 1993-11-27 12:00:00
1   15 1993-04-04 16:00:00 1993-04-04 18:00:00  11178 1993-04-04 17:00:00 1993-04-04 17:22:00
2   17 2030-10-25 07:00:00 2030-10-25 07:27:00  19605 2030-10-25 06:00:00 2030-10-25 08:00:00
3   26 2005-10-04 17:00:00 2005-10-04 17:18:00   8218 2005-10-04 17:00:00 2005-10-04 17:27:00
4   35 2024-05-02 15:00:00 2024-05-02 15:35:00   6916 2024-05-02 15:00:00 2024-05-02 15:36:00

In [5]: %%timeit
   ...: (events
   ...: .conditional_join(
   ...:     events,
   ...:     ('start', 'end', '<='),
   ...:     ('end', 'start', '>='),
   ...:     ('id', 'id', '!='),
   ...:     use_numba = False,
   ...:     df_columns = ['id', 'start', 'end'],
   ...:     right_columns = ['id', 'start', 'end'])
   ...: )
   ...: 
   ...: 
843 ms ± 25.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %%timeit
   ...: (events
   ...: .conditional_join(
   ...:     events,
   ...:     ('start', 'end', '<='),
   ...:     ('end', 'start', '>='),
   ...:     ('id', 'id', '!='),
   ...:     use_numba = True,
   ...:     df_columns = ['id', 'start', 'end'],
   ...:     right_columns = ['id', 'start', 'end'])
   ...: )
   ...: 
   ...: 
16.5 ms ± 2.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [8]: con = duckdb.connect()
   ...: 

In [9]: event_count = """SELECT COUNT(*) FROM (
   ...:     SELECT r.id, s.id
   ...:     FROM events r, events s
   ...:     WHERE r.start <= s.end AND r.end >= s.start
   ...:       AND r.id <> s.id
   ...: ) q2;"""

In [10]: %timeit con.execute(event_count)
    ...: 
17.2 ms ± 531 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [11]: con.execute(event_count)
Out[11]: <duckdb.DuckDBPyConnection at 0x7ffb884539b0>

In [12]: con.execute(event_count).fetchone()
Out[12]: (3702,)

In [13]: out.shape
Out[13]: (3702, 6)

We can go larger with a 10-million-row table, again adapted from DuckDB:

In [18]: ##### Tax audits
    ...: query = """CREATE TYPE surname_t AS ENUM (
    ...:     'Smith',
    ...:     'Johnson',
    ...:     'Williams',
    ...:     'Jones',
    ...:     'Brown',
    ...:     'Davis',
    ...:     'Miller',
    ...:     'Wilson',
    ...:     'Moore',
    ...:     'Taylor',
    ...:     'Anderson',
    ...:     'Thomas',
    ...:     'Jackson',
    ...:     'White',
    ...:     'Harris',
    ...:     'Martin',
    ...:     'Thompson',
    ...:     'Garcia',
    ...:     'Martinez',
    ...:     'Robinson'
    ...: );
    ...: SELECT SETSEED(0.8675309);
    ...: CREATE TABLE employees AS
    ...: SELECT
    ...: facts.id AS id,
    ...: surname AS "name",
    ...: dept,
    ...: salary,
    ...: (salary / 10 - CASE WHEN random() <= 0.01 THEN (10 + 1) ELSE 0 END)::INTEGER AS tax
    ...: FROM (
    ...: SELECT
    ...: id,
    ...: enum_range(NULL::surname_t)[(round(random() * 19))::INTEGER] AS surname,
    ...: round(random() * 5)::INTEGER  AS dept,
    ...: 100 * id AS salary
    ...: FROM (SELECT UNNEST(range(1, 10000000))) tbl(id)
    ...: ) facts
    ...: ;
    ...: """

In [19]: con = duckdb.connect()
    ...: con.execute(query)
    ...: counts = """SELECT COUNT(*) FROM (
    ...:     SELECT r.id, s.id
    ...:     FROM employees r, employees s
    ...:     WHERE r.salary < s.salary AND r.tax > s.tax
    ...: ) q1;"""

In [20]: %timeit con.execute(counts)
6.02 s ± 716 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [21]: # create a pandas DataFrame
    ...: employees = con.execute("select * from employees").df()

In [22]: # long-running process, using numba here

In [23]: emp = (employees
    ...: .conditional_join(
    ...:     employees,
    ...:     ('salary', 'salary', '<'),
    ...:     ('tax', 'tax', '>'),
    ...:     use_numba = True)
    ...: )

In [24]: %%timeit
    ...: (employees
    ...: .conditional_join(
    ...:     employees,
    ...:     ('salary', 'salary', '<'),
    ...:     ('tax', 'tax', '>'),
    ...:     use_numba = True)
    ...: )
    ...: 
    ...: 
3.08 s ± 31.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [25]: con.execute(counts).fetchone()
Out[25]: (98805,)

In [26]: emp.shape
Out[26]: (98805, 10)

Non-equi joins might not be as widely used as equi-joins, but they still have their place. Implementing this within pandas might not be too hard; we just need to figure out a proper way to integrate it with the current merge API.

@LakshmanKishore
Contributor

I fully support the proposal to introduce conditional joins in pandas, as it aligns with SQL's ability to join on arbitrary conditions. This enhancement would greatly benefit users dealing with complex join scenarios.

My suggestion for the syntax is as follows:

Example Usage:

result = pd.merge(
    dfa,
    dfb,
    how="inner",
    conditions=[
        [('cola1', '>', 'colb1'), ('cola2', '<', 'colb2')],
        ('cola3', '=', 'colb3')
    ]
)

Description:

Condition Structure:

Each condition is represented by a tuple (column, operator, column); top-level entries in the conditions list are combined with AND, while entries inside a nested list are combined with OR. For example:

[('cola1', '>', 'colb1'), ('cola2', '<', 'colb2')],

This indicates an OR condition where either cola1 in dfa should be greater than colb1 in dfb, or cola2 in dfa should be less than colb2 in dfb.
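
A naive reference implementation of these semantics might look like the sketch below (merge_conditional and _OPS are hypothetical names; it builds the full cross join and filters, so it illustrates the semantics of an inner join rather than an efficient implementation, and it assumes the two frames have no overlapping column names):

import operator

import pandas as pd

_OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
        "<=": operator.le, "=": operator.eq, "!=": operator.ne}

def merge_conditional(dfa, dfb, conditions):
    # Cartesian product of the two frames (inner-join semantics only)
    out = dfa.merge(dfb, how="cross")
    mask = pd.Series(True, index=out.index)
    for cond in conditions:
        if isinstance(cond, list):  # nested list: OR of its tuples
            sub = pd.Series(False, index=out.index)
            for left, op, right in cond:
                sub |= _OPS[op](out[left], out[right])
            mask &= sub
        else:  # bare tuple: combined with AND at the top level
            left, op, right = cond
            mask &= _OPS[op](out[left], out[right])
    return out[mask]

A real implementation would avoid materializing the full product (e.g. with sorting or interval techniques), but this pins down what the conditions argument would mean.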

@mroeschke mentioned this issue Sep 3, 2024