map_elements sometimes returns None instead of null #17936

Open
2 tasks done
anergictcell opened this issue Jul 30, 2024 · 2 comments
Labels
A-panic Area: code that results in panic exceptions bug Something isn't working P-low Priority: low python Related to Python Polars

Comments

@anergictcell
Contributor

anergictcell commented Jul 30, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

class Custom:
    value: int
    def __init__(self, value: int):
        self.value = value
    def __str__(self):
        return f"Custom[{self.value}]"
    def __repr__(self):
        return str(self)

def mapper(value: int) -> Custom | None:
    if value == 2:
        return None
    return Custom(value)

df = pl.DataFrame({"a": [1,2,3]})

df.with_columns(
    pl.col("a").map_elements(mapper, return_dtype=pl.Object).alias("with_dtype"),
    pl.col("a").map_elements(mapper).alias("without_dtype"),
)



shape: (3, 3)
┌─────┬────────────┬───────────────┐
│ a   ┆ with_dtype ┆ without_dtype │
│ --- ┆ ---        ┆ ---           │
│ i64 ┆ object     ┆ object        │
╞═════╪════════════╪═══════════════╡
│ 1   ┆ Custom[1]  ┆ Custom[1]     │
│ 2   ┆ None       ┆ null          │
│ 3   ┆ Custom[3]  ┆ Custom[3]     │
└─────┴────────────┴───────────────┘

Log output

-

Issue description

map_elements does not always convert None to null when the function returns a custom Python object.

When a Python function returns either a custom Python object or None, map_elements does not convert the returned None to a Polars null; it stores None as an object instead.

This only happens when return_dtype=pl.Object is passed. When Polars infers the return type, it handles the None -> null conversion correctly.
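
To make the difference between a null and a stored None concrete, here is a small pure-Python toy model. It is only an illustration and not how Polars is implemented: ToyColumn, map_none_as_null and map_none_as_object are hypothetical names, and the mapper returns a plain string instead of a Custom instance to keep the sketch short.

```python
class ToyColumn:
    """Toy stand-in for a nullable column: values plus a validity mask.

    Hypothetical helper for illustration only; not a Polars type.
    """

    def __init__(self, values, valid):
        self.values = values
        self.valid = valid  # valid[i] is False when slot i is null

    def null_count(self):
        return self.valid.count(False)


def map_none_as_null(data, func):
    """Expected behavior: a returned None marks the slot as null."""
    results = [func(v) for v in data]
    return ToyColumn(results, [r is not None for r in results])


def map_none_as_object(data, func):
    """Reported behavior with return_dtype=pl.Object: None is kept as a value."""
    results = [func(v) for v in data]
    return ToyColumn(results, [True] * len(results))


def mapper(value):
    # Same branching as the mapper in the reproducible example.
    return None if value == 2 else f"Custom[{value}]"


expected = map_none_as_null([1, 2, 3], mapper)
reported = map_none_as_object([1, 2, 3], mapper)

print(expected.null_count())  # 1: the None became a null
print(reported.null_count())  # 0: the None is stored as an ordinary object
```

In this model both columns hold the same values; only the validity mask differs, which is why the two output columns print None and null respectively.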

Expected behavior

The None values should be stored in the DataFrame as null.

Installed versions

-------Version info---------
Polars: 1.3.0
Index type: UInt32
Platform: macOS-14.5-arm64-arm-64bit
Python: 3.12.4 (main, Jun 6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fastexcel:
fsspec:
gevent:
great_tables:
hvplot:
matplotlib: 3.9.1
nest_asyncio: 1.6.0
numpy: 2.0.1
openpyxl:
pandas:
pyarrow:
pydantic:
pyiceberg:
sqlalchemy:
torch:
xlsx2csv:
xlsxwriter:

@anergictcell anergictcell added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Jul 30, 2024
@deanm0000 deanm0000 added P-low Priority: low A-panic Area: code that results in panic exceptions and removed needs triage Awaiting prioritization by a maintainer labels Jul 31, 2024
@deanm0000
Collaborator

I added the A-panic label because if you do

df.with_columns(
    with_dtype=pl.when(pl.col('with_dtype').map_elements(lambda x: x is not None, return_dtype=pl.Boolean)).then('with_dtype')
)

then it panics.

@philiporlando

What happens if you set skip_nulls=False within map_elements()? I have been encountering similar issues, and setting this parameter seems to work. I'm still trying to make sense of it all, though.
