ENH: Native conditionals (if, elif, elif, else like np.select) #51093

Closed
1 of 3 tasks
mattharrison opened this issue Jan 31, 2023 · 2 comments
Labels: Enhancement, Needs Triage

Comments

@mattharrison

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

If statements are a pain in pandas. The .where method is confusing, and most folks resort to writing a function and calling .apply so they can use old-school Python-style conditionals.

Also, I'm not a fan of using sequences of .loc assignment because it breaks chaining.

I've resorted to using np.select. It would be great to have this as native functionality.
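To make the .where complaint concrete, here is a minimal sketch on a made-up Series (the name `s` and its values are illustrative, not from the issue). `Series.where` keeps values where the condition is True and substitutes `other` elsewhere, which reads backwards compared with an if/else; `np.where` takes the True-branch and False-branch values in order.

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the issue.
s = pd.Series([1, 2, 3, 4])

# Series.where KEEPS values where the condition is True
# and substitutes `other` where it is False.
kept = s.where(s > 2, other=0)

# np.where reads like if/else: first value where True, second where False.
chosen = np.where(s > 2, s, 0)

# Both produce the same values; only the reading order differs.
print(kept.tolist())
print(chosen.tolist())
```

Neither form extends cleanly to three or more branches, which is where np.select comes in.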

Consider calculating On-balance Volume (OBV)
[Screenshot]
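For reference, the recurrence that the code in this issue implements can be written as follows. This is read off the loop-based implementation in the issue (starting value 0), not transcribed from the screenshot.

```latex
\mathrm{OBV}_0 = 0, \qquad
\mathrm{OBV}_t = \mathrm{OBV}_{t-1} +
\begin{cases}
  \mathrm{Volume}_t  & \text{if } \mathrm{Close}_t > \mathrm{Close}_{t-1} \\
  0                  & \text{if } \mathrm{Close}_t = \mathrm{Close}_{t-1} \\
  -\mathrm{Volume}_t & \text{if } \mathrm{Close}_t < \mathrm{Close}_{t-1}
\end{cases}
```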

Many resort to this:

# naive

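def calc_obv(df):
    df = df.copy()
    df["OBV"] = 0.0
    obv_col = df.columns.get_loc("OBV")

    # Loop through the data and calculate OBV.
    # Positional .iloc access avoids chained assignment such as
    # df["OBV"][i] = ..., which may silently fail to update df.
    for i in range(1, len(df)):
        close, prev_close = df["Close"].iloc[i], df["Close"].iloc[i - 1]
        prev_obv = df.iloc[i - 1, obv_col]
        if close > prev_close:
            df.iloc[i, obv_col] = prev_obv + df["Volume"].iloc[i]
        elif close < prev_close:
            df.iloc[i, obv_col] = prev_obv - df["Volume"].iloc[i]
        else:
            df.iloc[i, obv_col] = prev_obv
    return df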

calc_obv(aapl)

Here's my attempt with .where:

# This is painful
(aapl
 .assign(close_prev=aapl.Close.shift(1),
         vol=0,
         obv=lambda adf: adf.vol.where(cond=adf.Close == adf.close_prev, 
                                       other=adf.Volume.where(cond=adf.Close > adf.close_prev, 
                                                  other=-adf.Volume.where(cond=adf.Close < adf.close_prev, other=0)
                                        )).cumsum()
        )
)

Here's my np.select version:

(aapl
 .assign(vol=np.select([aapl.Close > aapl.Close.shift(1), 
                        aapl.Close == aapl.Close.shift(1), 
                        aapl.Close < aapl.Close.shift(1)],
                       [aapl.Volume, 0, -aapl.Volume]),
         obv=lambda df_:df_.vol.cumsum(),
        )
)
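A self-contained way to try the np.select version is sketched below. The `prices` frame and its values are made up to stand in for `aapl`; only the `Close` and `Volume` column names come from the issue.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for `aapl`.
prices = pd.DataFrame(
    {"Close": [10.0, 11.0, 11.0, 9.0, 12.0],
     "Volume": [100, 200, 300, 400, 500]}
)

prev_close = prices.Close.shift(1)
result = prices.assign(
    # Signed volume: +Volume on an up day, 0 when flat, -Volume on a down day.
    # The first row compares against NaN, so it falls through to the default 0.
    vol=np.select(
        [prices.Close > prev_close,
         prices.Close == prev_close,
         prices.Close < prev_close],
        [prices.Volume, 0, -prices.Volume],
    ),
    obv=lambda df_: df_.vol.cumsum(),
)
print(result.obv.tolist())  # [0, 200, 200, -200, 300]
```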

Feature Description

New method (adapted from NumPy):

def select(self, condlist, choicelist, default=0):
    """
    Return an array drawn from elements in choicelist, depending on conditions.

    Parameters
    ----------
    condlist : list of bool ndarrays
        The list of conditions which determine from which array in `choicelist`
        the output elements are taken. When multiple conditions are satisfied,
        the first one encountered in `condlist` is used.
    choicelist : list of ndarrays
        The list of arrays from which the output elements are taken. It has
        to be of the same length as `condlist`.
    default : scalar, optional
        The element inserted in `output` when all conditions evaluate to False.

    Returns
    -------
    output : DataFrame/Series
        The output at position m is the m-th element of the array in
        `choicelist` where the m-th element of the corresponding array in
        `condlist` is True.
    """
    return np.select(condlist, choicelist, default)
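Note that np.select returns a plain ndarray, so the body above would not by itself return a DataFrame/Series as the docstring describes. One possible way to close that gap for the Series case is sketched below; `select_series` is a hypothetical standalone helper written for illustration, not a pandas API.

```python
import numpy as np
import pandas as pd


def select_series(ser, condlist, choicelist, default=0):
    """Hypothetical helper: np.select, re-wrapped with the index of `ser`."""
    values = np.select(condlist, choicelist, default)
    return pd.Series(values, index=ser.index, name=ser.name)


# Illustrative data with a non-default index to show it is preserved.
close = pd.Series([10.0, 11.0, 11.0, 9.0], index=list("abcd"), name="Close")
prev = close.shift(1)

# +1 on an up move, -1 on a down move, default 0 otherwise.
signal = select_series(close, [close > prev, close < prev], [1, -1])
print(signal.tolist())        # [0, 1, 0, -1]
print(signal.index.tolist())  # ['a', 'b', 'c', 'd']
```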

Alternative Solutions

Additional Context

No response

mattharrison added the Enhancement and Needs Triage labels Jan 31, 2023
@erfannariman
Member

Duplicate of: #39154

@phofl
Member

phofl commented Feb 2, 2023

Thx, let's keep the discussion focused there

phofl closed this as completed Feb 2, 2023