Starting with 1.7.0, calling Expr.str.replace on an Int64 column raises #19047

Closed
2 tasks done
BartSchuurmans opened this issue Oct 1, 2024 · 2 comments
Labels
bug (Something isn't working), python (Related to Python Polars)

Comments

@BartSchuurmans
Contributor

BartSchuurmans commented Oct 1, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
pl.select(pl.lit(1, dtype=pl.Int64).str.replace(",", "."))

Polars 1.6.0:

shape: (1, 1)
┌─────────┐
│ literal │
│ ---     │
│ str     │
╞═════════╡
│ 1       │
└─────────┘

Polars 1.7.0:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bart/src/proj/.venv/lib/python3.12/site-packages/polars/functions/lazy.py", line 1908, in select
    return pl.DataFrame().select(*exprs, **named_exprs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bart/src/proj/.venv/lib/python3.12/site-packages/polars/dataframe/frame.py", line 8981, in select
    return self.lazy().select(*exprs, **named_exprs).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bart/src/proj/.venv/lib/python3.12/site-packages/polars/lazyframe/frame.py", line 2053, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.InvalidOperationError: expected String type, got: i64

Log output

No response

Issue description

Starting with 1.7.0, calling Expr.str.replace on an Int64 column raises an InvalidOperationError instead of silently casting to String.

The column's i64 type was inferred by pl.read_csv(), which is why I expected the column to be a str rather than an i64.

Expected behavior

The current behavior might be correct; I just can't find the change in the release notes.

Installed versions

--------Version info---------
Polars:              1.8.2
Index type:          UInt32
Platform:            Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python:              3.12.5 (main, Aug 17 2024, 16:46:05) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         1.6.0
numpy                2.1.0
openpyxl             <not installed>
pandas               2.2.2
pyarrow              16.1.0
pydantic             2.8.2
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>

BartSchuurmans added the bug, needs triage, and python labels on Oct 1, 2024
@stinodego
Member

This is indeed expected behavior. You would have to cast your column to a String before using str.replace, or provide schema_overrides to read_csv.

Not sure which PR introduced the change in behavior that affected you, but we make correctness improvements all the time. This would fall into that category.

stinodego closed this as not planned on Oct 1, 2024
@cmdlineluser
Contributor

The PR, if interested:

stinodego removed the needs triage label on Oct 2, 2024