API / BUG: How do we differentiate between -9223372036854775808 and iNaT? #16674

From #3707 (at <a href="https://github.com/pandas-dev/pandas/commit/0281886fbdf0d1837c2af08af15949c1d98bf612">028188</a>):

~~~python
from datetime import datetime

import numpy as np
from pandas import DataFrame

max_int = np.iinfo(np.int64).max
min_int = np.iinfo(np.int64).min

# Both rows fall in the same month, so the monthly sum should be max_int + min_int == -1.
df = DataFrame([max_int, min_int], index=[datetime(2013, 1, 1), datetime(2013, 1, 1)])
assert df.resample("M").apply(np.sum)[0][0] == -1
# ...
# AssertionError
~~~

The assertion fails because `pandas` checks whether the data contains any "missing" integer values (assuming the data is `integer`), both before and after the aggregation. The check happens in `cython_operation` in `core/groupby.py`, via `_is_cython_func` from `core/base.py`, and "missing" is defined as `iNaT = -9223372036854775808`. If any such values are found, the data is automatically cast to `float`.
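To make the failure mode concrete, here is a minimal pure-Python sketch of the check-then-cast behaviour described above. It is not the pandas implementation: `sentinel_aware_sum` is a hypothetical helper, and it assumes that values cast to `NaN` are skipped by the sum.

```python
import math

# iNaT as given in this issue: the minimum int64 value.
INAT = -(2**63)


def sentinel_aware_sum(values):
    """Hypothetical stand-in for the described logic, not pandas code."""
    # Any integer equal to the sentinel is treated as "missing".
    if any(v == INAT for v in values):
        # Cast to float, mapping the sentinel to NaN.
        as_float = [math.nan if v == INAT else float(v) for v in values]
        # Assumption: missing values are skipped when summing.
        return sum(v for v in as_float if not math.isnan(v))
    # No sentinel present: plain integer arithmetic.
    return sum(values)


max_int = 2**63 - 1
min_int = -(2**63)

plain = max_int + min_int                         # exact integer sum
checked = sentinel_aware_sum([max_int, min_int])  # min_int is read as missing

print(plain)    # -1
print(checked)  # a large positive float, not -1
```

Under these assumptions the legitimate `min_int` is silently dropped, so the result is a large positive `float` rather than `-1`.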

This logic is quite prevalent in the codebase, but it seems fraught with pitfalls. For example, what if a computation legitimately produces the value `-9223372036854775808`? And what if the user intends to use `-9223372036854775808` as a legitimate data point?

Unlikely, sure.  But reasonable, absolutely.
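The first question, a computed result landing exactly on the sentinel, can be illustrated with a short sketch. `looks_missing` is a hypothetical helper that mirrors an equality check against `iNaT`. It is not a pandas function.

```python
# iNaT as given in this issue: the minimum int64 value.
INAT = -(2**63)


def looks_missing(value):
    """Hypothetical check: a value counts as missing if it equals the sentinel."""
    return value == INAT


# Neither input is the sentinel...
a = -(2**62)
b = -(2**62)
inputs_flagged = looks_missing(a) or looks_missing(b)

# ...but their sum is a valid int64 result that equals it exactly.
result = a + b
result_flagged = looks_missing(result)

print(inputs_flagged)  # False
print(result_flagged)  # True
```

A check applied after the aggregation cannot distinguish this legitimate result from a missing value, because both are the same bit pattern.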
