BUG/Internals: maybe_upcast_putmask #23823

Closed
@h-vetinari

In the context of #23192 (and #23604 / #23606), I want to use pandas.core.dtypes.cast.maybe_upcast_putmask, because it solves exactly the problem I need it to solve.

Unfortunately, it does not work as advertised (and I already found the culprit).
The docstring says:

def maybe_upcast_putmask(result, mask, other):
    """
    A safe version of putmask that potentially upcasts the result

    Parameters
    ----------
    result : ndarray
        The destination array. This will be mutated in-place if no upcasting is
        necessary.
    mask : boolean ndarray
    other : ndarray or scalar
        The source array or value

In other words, it expects result to be an ndarray and other to be an ndarray or scalar. Curiously enough, in some branches it only works for Series and produces wrong results for ndarrays, e.g.

>>> import pandas as pd
>>> import numpy as np
>>> s = pd.Series([10, 11, 12])
>>> t = pd.Series([np.nan, 61, np.nan])
>>> from pandas.core.dtypes.cast import maybe_upcast_putmask
>>> result, _ = maybe_upcast_putmask(s, np.array([False, True, False]), t)
>>> result  # correct
0    10
1    61
2    12
dtype: int64
>>> result, _ = maybe_upcast_putmask(s.values, np.array([False, True, False]), t.values)
>>> result  # incorrect
array([10., nan, 12.])

This is because the code does

try:
    [...]
    new_result = result.values.copy()
    [...]
    return [...]
except: 
    # do something else

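which actually expects a Series: an ndarray has no .values attribute, so for ndarray input the resulting AttributeError is swallowed by the except and the fallback branch runs instead.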
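To make the failure mode concrete, here is a minimal sketch of the pattern using only NumPy. It is not the pandas implementation: the function names `putmask_sketch_buggy` and `putmask_sketch_fixed` are hypothetical, and everything around the `.values` call (the comparison step and the `np.place` fallback) is an assumption made for illustration, since those parts are elided in the snippet above. It only shows how a broad except can hide an AttributeError, and how coercing with `np.asarray` would avoid it.

```python
import numpy as np


def putmask_sketch_buggy(result, mask, other):
    """Hypothetical stand-in for the quoted try/except pattern."""
    try:
        om = other[mask]
        om_at = om.astype(result.dtype)
        if (om == om_at).all():
            # ndarray has no .values, so this raises AttributeError ...
            new_result = result.values.copy()
            new_result[mask] = om_at
            return new_result, False
    except Exception:
        # ... which is silently swallowed here
        pass
    # Assumed fallback: upcast, then fill with np.place, which consumes
    # the *first* N values of `other`, not the values at the masked positions.
    r = result.astype("float64")
    np.place(r, mask, other)
    return r, True


def putmask_sketch_fixed(result, mask, other):
    """Same idea, but np.asarray accepts both ndarray and Series input."""
    result = np.asarray(result)
    other = np.asarray(other)
    om = other[mask]
    om_at = om.astype(result.dtype)
    if (om == om_at).all():
        # values survive the round trip, so the original dtype can be kept
        new_result = result.copy()
        new_result[mask] = om_at
        return new_result, False
    # otherwise upcast and assign the masked values position by position
    r = result.astype("float64")
    r[mask] = om
    return r, True


arr = np.array([10, 11, 12])
mask = np.array([False, True, False])
other = np.array([np.nan, 61, np.nan])

print(putmask_sketch_buggy(arr, mask, other))  # float array with nan at index 1
print(putmask_sketch_fixed(arr, mask, other))  # integer array with 61 at index 1
```

Under these assumptions, the buggy sketch yields the same kind of wrong result as the ndarray call above, while the fixed one keeps the integer dtype. Catching only the exceptions that are actually expected would have surfaced the AttributeError instead of hiding it.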

Labels: Algos, Bug, Dtype Conversions
