Description
import pandas as pd
df = pd.DataFrame(
    {
        "foo": [pd.Timestamp("2019"), pd.Timestamp("2020")],
        "bar": [pd.Timestamp("2018"), pd.Timestamp("2021")],
    }
)
df2 = df[["foo"]]
print(df - df2)
Problem description
The above snippet raises the following exception:
Traceback (most recent call last):
File ".venv/lib/python3.6/site-packages/pandas/core/ops/array_ops.py", line 149, in na_arithmetic_op
result = expressions.evaluate(op, str_rep, left, right)
File ".venv/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 208, in evaluate
return _evaluate(op, op_str, a, b)
File ".venv/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 70, in _evaluate_standard
return op(a, b)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/common.py", line 64, in new_method
return method(self, other)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/__init__.py", line 500, in wrapper
result = arithmetic_op(lvalues, rvalues, op, str_rep)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/array_ops.py", line 192, in arithmetic_op
res_values = dispatch_to_extension_op(op, lvalues, rvalues)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/dispatch.py", line 125, in dispatch_to_extension_op
res_values = op(left, right)
File ".venv/lib/python3.6/site-packages/pandas/core/arrays/datetimelike.py", line 1390, in __rsub__
f"cannot subtract {type(self).__name__} from {type(other).__name__}"
TypeError: cannot subtract DatetimeArray from ndarray
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas_bug.py", line 36, in <module>
print(df2 - df)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/__init__.py", line 703, in f
new_data = left._combine_frame(right, pass_op, fill_value)
File ".venv/lib/python3.6/site-packages/pandas/core/frame.py", line 5297, in _combine_frame
new_data = ops.dispatch_to_series(self, other, _arith_op)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/__init__.py", line 416, in dispatch_to_series
new_data = expressions.evaluate(column_op, str_rep, left, right)
File ".venv/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 208, in evaluate
return _evaluate(op, op_str, a, b)
File ".venv/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 70, in _evaluate_standard
return op(a, b)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/__init__.py", line 385, in column_op
return {i: func(a.iloc[:, i], b.iloc[:, i]) for i in range(len(a.columns))}
File ".venv/lib/python3.6/site-packages/pandas/core/ops/__init__.py", line 385, in <dictcomp>
return {i: func(a.iloc[:, i], b.iloc[:, i]) for i in range(len(a.columns))}
File ".venv/lib/python3.6/site-packages/pandas/core/ops/array_ops.py", line 121, in na_op
return na_arithmetic_op(x, y, op, str_rep)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/array_ops.py", line 151, in na_arithmetic_op
result = masked_arith_op(left, right, op)
File ".venv/lib/python3.6/site-packages/pandas/core/ops/array_ops.py", line 75, in masked_arith_op
assert isinstance(x, np.ndarray), type(x)
This is a 1.0.0 regression; in 0.25.3, the operation succeeds and the unmatched bar column is filled with NaN in the output.
The same error occurs with:
- Any combination of incompatible columns (strict subset, strict superset, overlapping, disjoint)
- Calling the subtract method instead of using the subtraction operator
- Timezone-aware Timestamps as well as timezone-naive
It does not seem to occur with:
- Mismatches on the row index; transposing the dataframes in the above example prevents the error from occurring.
- pd.Series objects with mismatched indexes (e.g. calling the above on the first row of each dataframe works fine)
- Other dtypes; bool, float, and int seem to work fine. Similarly, if the dataframes are explicitly cast to dtype object, the operation succeeds.
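For reference, two of the non-failing cases listed above can be checked with a short script. This is only a sketch built on the frames from the original snippet; the variable names row_diff and obj_diff are mine and not part of the report.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "foo": [pd.Timestamp("2019"), pd.Timestamp("2020")],
        "bar": [pd.Timestamp("2018"), pd.Timestamp("2021")],
    }
)
df2 = df[["foo"]]

# Series case: the first row of each frame, so the indexes are mismatched
# ("foo" and "bar" on the left, only "foo" on the right).
row_diff = df.iloc[0] - df2.iloc[0]

# Object-dtype case: cast both frames explicitly before subtracting.
obj_diff = df.astype(object) - df2.astype(object)

print(row_diff)
print(obj_diff)
```

In both cases the unmatched bar entries come back as missing values and the foo entries are zero-length timedeltas.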
Expected Output
bar foo
0 NaN 0 days
1 NaN 0 days
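Until this is fixed, a result with the same shape as the expected output can probably be built by hand: subtract only the columns that both frames share, then reindex to the union of columns. This is a workaround sketch of my own, not something pandas documents for this case, and the names shared, diff, and result are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "foo": [pd.Timestamp("2019"), pd.Timestamp("2020")],
        "bar": [pd.Timestamp("2018"), pd.Timestamp("2021")],
    }
)
df2 = df[["foo"]]

# Subtract only the columns present in both frames, so the operation
# never sees a datetime column paired with an all-missing one.
shared = df.columns.intersection(df2.columns)
diff = df[shared] - df2[shared]

# Reindex to the union of columns; unmatched columns become all-missing.
result = diff.reindex(columns=df.columns.union(df2.columns))

print(result)
```

The dtype of the filled-in bar column may differ from what 0.25.3 produced, so treat this as an approximation of the old behaviour rather than an exact match.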
Output of pd.show_versions()
pandas : 1.0.0
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 41.6.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None