BUG: Series.plot.hist segfault for timedelta64 #43941

Open
2 of 3 tasks
cdeil opened this issue Oct 9, 2021 · 4 comments
Labels: Needs Tests (unit test(s) needed to prevent regressions)

@cdeil
Contributor

cdeil commented Oct 9, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

t = pd.to_datetime(["2021-01-01", "2021-01-03", "2021-01-04"])
print(t)

dt = t.to_series().diff()
print(dt)

dt.plot.hist()

Issue Description

I get this crash from the dt.plot.hist() call:

% python crash.py
DatetimeIndex(['2021-01-01', '2021-01-03', '2021-01-04'], dtype='datetime64[ns]', freq=None)
2021-01-01      NaT
2021-01-03   2 days
2021-01-04   1 days
dtype: timedelta64[ns]
zsh: segmentation fault  python crash.py

Expected Behavior

No segfault. Give a plot or some informative exception.

Installed Versions

This is with Python 3.9 on macOS, installed via conda-forge.

% cat environment.yml
name: cement

channels:
  - conda-forge
  - nodefaults

dependencies:
  - python==3.9
  - jupyterlab==3.1
  - pandas==1.3
  - scikit-learn==1.0
  - missingno
  - matplotlib
  - seaborn==0.11
  - dtale==1.56
  - pandas-profiling
  - sweetviz==2.1
  - hcrystalball==0.1.10
  - pip
  - pip:
    - flaml

pd.show_versions()

INSTALLED VERSIONS

commit : f00ed8f
python : 3.9.0.final.0
python-bits : 64
OS : Darwin
OS-release : 20.6.0
Version : Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.3.0
numpy : 1.21.2
pytz : 2021.1
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.28.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.19.0
xlrd : 2.0.1
xlwt : None
numba : 0.53.1

cdeil added the Bug and Needs Triage labels on Oct 9, 2021
@mzeitlin11
Member

Thanks for reporting this @cdeil! Not sure if there are issues in the pandas logic somewhere as well, but I can reproduce the segfault using only numpy:

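import numpy as np

# Empty timedelta64 array; the result_type call below is what crashed on the reporter's NumPy 1.21.2
td_arr = np.array([], dtype="timedelta64[ns]")
np.result_type(0, td_arr)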

mzeitlin11 added the Segfault, Timedelta, Upstream issue, and Visualization labels and removed the Needs Triage label on Oct 9, 2021
@mzeitlin11
Member

Issue fixed upstream (very quickly!) - we should probably add a test, but need to wait until NumPy 1.21.3 is released.

mzeitlin11 added the Needs Tests label and removed the Bug label on Oct 11, 2021
@simonjayhawkins
Member

Issue fixed upstream (very quickly!) - we should probably add a test, but need to wait until NumPy 1.21.3 is released.

just needs a test to close
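
A regression test for a crash like this has to survive the crash itself. One possible approach, sketched here under assumptions, is to run the reproducer in a child interpreter and inspect its return code. The helper names `run_isolated` and `killed_by_signal` are hypothetical and not part of pandas or NumPy, and the negative-return-code check relies on POSIX behavior of `subprocess`, so it would not detect a crash on Windows.

```python
import subprocess
import sys

# The NumPy-only reproducer from this thread, run as a separate program.
REPRODUCER = """
import numpy as np
td_arr = np.array([], dtype="timedelta64[ns]")
np.result_type(0, td_arr)
"""


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Hypothetical helper: run `snippet` in a fresh interpreter.

    A segfault then kills only the child process, not the test runner.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


def killed_by_signal(proc: subprocess.CompletedProcess) -> bool:
    """Hypothetical helper: True if the child died from a signal.

    Assumption: POSIX, where subprocess reports death by signal N
    as return code -N (for example -11 for SIGSEGV).
    """
    return proc.returncode < 0


if __name__ == "__main__":
    proc = run_isolated(REPRODUCER)
    print("return code:", proc.returncode)
    print("killed by signal:", killed_by_signal(proc))
```

An ordinary exception in the child gives return code 1, which this check treats as acceptable, in line with the expected behavior of "a plot or some informative exception".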

simonjayhawkins removed the Visualization, Timedelta, Segfault, and Upstream issue labels on Aug 5, 2022
DriesSchaumont self-assigned this on Sep 4, 2022
@DriesSchaumont
Member

@simonjayhawkins unfortunately, the provided example does not work:

>>> import pandas as pd
>>> t = pd.to_datetime(["2021-01-01", "2021-01-03", "2021-01-04"])
>>> dt = t.to_series().diff()
>>> print(dt)
2021-01-01      NaT
2021-01-03   2 days
2021-01-04   1 days
dtype: timedelta64[ns]
>>> dt.plot.hist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/plotting/_core.py", line 1375, in hist
    return self(kind="hist", by=by, bins=bins, **kwargs)
  File "pandas/plotting/_core.py", line 1001, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "pandas/plotting/_matplotlib/__init__.py", line 71, in plot
    plot_obj.generate()
  File "pandas/plotting/_matplotlib/core.py", line 448, in generate
    self._args_adjust()
  File "pandas/plotting/_matplotlib/hist.py", line 74, in _args_adjust
    self.bins = self._calculate_bins(self.data)
  File "pandas/plotting/_matplotlib/hist.py", line 85, in _calculate_bins
    hist, bins = np.histogram(
  File "<__array_function__ internals>", line 180, in histogram
  File "/numpy/lib/histograms.py", line 793, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "numpy/lib/histograms.py", line 443, in _get_bin_edges
    bin_type = np.result_type(bin_type, float)
  File "<__array_function__ internals>", line 180, in result_type
TypeError: The DType <class 'numpy.dtype[timedelta64]'> could not be promoted by <class 'numpy.dtype[float64]'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (<class 'numpy.dtype[timedelta64]'>, <class 'numpy.dtype[float64]'>)

Does this work as intended? Or should a separate issue be created?
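
The traceback above ends in `np.result_type(bin_type, float)`, where NumPy refuses to promote timedelta64 together with float64. One possible workaround, sketched here with NumPy only, is to convert the timedeltas to plain floats before histogramming. The helper names are hypothetical, the choice of days as the unit is an assumption, and dropping NaT is a choice made for this sketch rather than something the thread prescribes.

```python
import numpy as np


def can_promote_with_float(dtype) -> bool:
    """Hypothetical helper: does NumPy find a common dtype with float?"""
    try:
        np.result_type(dtype, float)
    except TypeError:
        return False
    return True


def timedelta_to_days(values: np.ndarray) -> np.ndarray:
    """Hypothetical helper: timedelta64 array -> float days, NaT dropped."""
    # Dividing by a one-day timedelta yields float64; NaT becomes NaN.
    days = values / np.timedelta64(1, "D")
    return days[~np.isnan(days)]


if __name__ == "__main__":
    # Same values as the reproducer: NaT, 2 days, 1 day.
    td = np.array(
        [
            np.timedelta64("NaT", "ns"),
            np.timedelta64(2, "D"),
            np.timedelta64(1, "D"),
        ],
        dtype="timedelta64[ns]",
    )
    print("promotable with float:", can_promote_with_float(td.dtype))

    days = timedelta_to_days(td)
    counts, edges = np.histogram(days, bins=2)
    print("days:", days)
    print("counts:", counts)
    print("edges:", edges)
```

With a pandas Series, an analogous conversion would be `dt.dt.total_seconds()` followed by `dropna()` before calling `.plot.hist()`.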
