A regression in numpy-to-series conversion #16522

Closed
2 tasks done
KDruzhkin opened this issue May 27, 2024 · 3 comments
Labels
A-input-parsing Area: parsing input arguments bug Something isn't working P-medium Priority: medium python Related to Python Polars regression Issue introduced by a new release

Comments

@KDruzhkin
Contributor

KDruzhkin commented May 27, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import numpy as np
import polars as pl

empty_array = np.array([], dtype=np.int32)

pl.Series(
    name="word_id",
    values=[empty_array, empty_array],
    dtype=pl.List(pl.Int32),
)

Log output

Traceback (most recent call last):
  File "tmp.py", line 6, in <module>
    pl.Series(
  File ".venv/lib/python3.11/site-packages/polars/series/series.py", line 316, in __init__
    self._s = sequence_to_pyseries(
              ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/polars/_utils/construction/series.py", line 251, in sequence_to_pyseries
    return numpy_to_pyseries(
           ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/polars/_utils/construction/series.py", line 514, in numpy_to_pyseries
    return wrap_s(py_s).reshape(original_shape)._s
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/polars/series/series.py", line 7005, in reshape
    return self._from_pyseries(self._s.reshape(dimensions, is_list))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ComputeError: cannot reshape len 0 into shape [2, 0]

Issue description

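Releases 0.20.27 through 0.20.29 were yanked because of a bug in NumPy processing.
Release 0.20.30 still has a bug in the NumPy-related code: building a List Series from a sequence of empty arrays raises the ComputeError shown in the log output.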
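
To make the shape in the error message concrete, here is a minimal NumPy-only sketch. It assumes (this is a guess from the traceback, not confirmed from the Polars source) that the sequence of arrays is combined into one 2-D array whose shape is later passed to `reshape`. The name `original_shape` is borrowed from the traceback purely for illustration.

```python
import numpy as np

empty_array = np.array([], dtype=np.int32)

# Stacking two length-0 arrays yields a 2-D array whose second axis has length 0.
stacked = np.stack([empty_array, empty_array])
original_shape = stacked.shape

# Flattened, the data holds no elements at all, so a reshape has to handle
# a zero-length buffer with a target shape of (2, 0).
flat = stacked.ravel()

print(original_shape, flat.size)  # (2, 0) 0
```

This only shows where a shape like `[2, 0]` can come from on the NumPy side; it does not reproduce the Polars code path itself.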

Expected behavior

As in 0.20.26: a Series of empty Lists.

Installed versions

--------Version info---------
Polars:               0.20.30
Index type:           UInt32
Platform:             Linux-5.15.0-107-generic-x86_64-with-glibc2.31
Python:               3.11.9 (main, Apr  6 2024, 17:59:24) [GCC 9.4.0]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.3.1
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.9.0
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             <not installed>
pandas:               2.2.2
pyarrow:              16.1.0
pydantic:             2.7.1
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
torch:                2.2.2+cu121
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@KDruzhkin KDruzhkin added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels May 27, 2024
@stinodego stinodego added regression Issue introduced by a new release P-medium Priority: medium A-input-parsing Area: parsing input arguments and removed needs triage Awaiting prioritization by a maintainer labels May 27, 2024
@KDruzhkin
Contributor Author

KDruzhkin commented Jul 17, 2024

With polars 1.2.0 (and python 3.12, numpy 2.0) the error message is more meaningful:
cannot reshape array into shape containing a zero dimension after the first: (2, 0)

Traceback (most recent call last):
  File "src/example.py", line 6, in <module>
    pl.Series(
  File ".venv/lib/python3.12/site-packages/polars/series/series.py", line 288, in __init__
    self._s = sequence_to_pyseries(
              ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/polars/_utils/construction/series.py", line 227, in sequence_to_pyseries
    return numpy_to_pyseries(
           ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/polars/_utils/construction/series.py", line 478, in numpy_to_pyseries
    return wrap_s(py_s).reshape(original_shape)._s
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/polars/series/series.py", line 6779, in reshape
    return self._from_pyseries(self._s.reshape(dimensions))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.InvalidOperationError:
cannot reshape array into shape containing a zero dimension after the first: (2, 0)

@ADraginda

FWIW v1.2.1 (+py39 +numpy1.26.4) - issue persists

pl.DataFrame({'empty_arrays': [np.array([]), np.array([])]})
polars.exceptions.InvalidOperationError: cannot reshape array into shape containing a zero dimension after the first: (2, 0)

@hexane360

As of #18940 this is fixed:

>>> pl.DataFrame({'empty_arrays': [np.array([]), np.array([])]})
shape: (2, 1)
┌───────────────┐
│ empty_arrays  │
│ ---           │
│ array[f64, 0] │
╞═══════════════╡
│ []            │
│ []            │
└───────────────┘
