fix(python): DataFrame descending sorting by single list element #19233

Merged
8 commits merged on Oct 18, 2024

Conversation

khalidmammadov
Contributor

@khalidmammadov khalidmammadov commented Oct 14, 2024

Calling sort with a single str column and a one-element list for the descending parameter fails with the following error:

import polars as pl

ldf = pl.DataFrame({
    "age": [2, 5, 1],
    "names": ["Alice", "Cob", "Bob"]
}).lazy()

# Single str column combined with a one-element list for descending
sorted_df = ldf.sort("age", descending=[False])
print(sorted_df)
  File "..../site-packages/polars/lazyframe/frame.py", line 1378, in sort
    self._ldf.sort(
TypeError: argument 'descending': 'list' object cannot be converted to 'PyBool'

The fix checks the type of descending so that a list is no longer passed to the code path that expects a single bool.

@github-actions github-actions bot added labels fix (Bug fix), python (Related to Python Polars), rust (Related to Rust Polars) on Oct 14, 2024

codecov bot commented Oct 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.96%. Comparing base (d52e13e) to head (e7ee26f).
Report is 14 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19233      +/-   ##
==========================================
- Coverage   80.00%   79.96%   -0.04%     
==========================================
  Files        1527     1529       +2     
  Lines      209203   209820     +617     
  Branches     2415     2416       +1     
==========================================
+ Hits       167371   167783     +412     
- Misses      41284    41488     +204     
- Partials      548      549       +1     


@@ -1370,6 +1370,8 @@ def sort(
        """
        # Fast path for sorting by a single existing column
        if isinstance(by, str) and not more_by:
            if isinstance(descending, list):
                descending = descending[0]
Collaborator

We should check the list length and raise an error if it is not 1.
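
A minimal sketch of the kind of check suggested here, assuming a hypothetical helper named unwrap_single_descending (this name and the error message are illustrative, not part of Polars or of this PR):

```python
from __future__ import annotations

from collections.abc import Sequence


def unwrap_single_descending(descending: bool | Sequence[bool]) -> bool:
    """Return a scalar flag for a single-column sort (hypothetical helper)."""
    # A plain bool is already in the form the single-column path expects.
    if isinstance(descending, bool):
        return descending
    # Anything else is treated as a sequence and must hold exactly one flag.
    if len(descending) != 1:
        msg = (
            "sorting by one column requires exactly one `descending` value, "
            f"got {len(descending)}"
        )
        raise ValueError(msg)
    return descending[0]
```

With this shape, [False] unwraps to False, while [] or [True, False] raises instead of silently using the first element.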

@ritchie46
Member

I am not sure here. This will be a huge combinatorial nightmare checking this for the whole API. If you pass a single &str value for column, you should also pass a single bool value for sortedness.

So it's either all lists or all scalars.

@orlp
Collaborator

orlp commented Oct 16, 2024

Can't we fix this in some central location?

@khalidmammadov
Contributor Author

khalidmammadov commented Oct 16, 2024

The way I see it, there are two branches in the function body, and the signature does not convey which combinations are allowed and which are not.
This PR aims to be more permissive toward the user and to honor the signature.
I have simplified the code to avoid isinstance checks and follow more Pythonic principles. Please take a look and let me know what you think.

@@ -1370,6 +1371,8 @@ def sort(
        """
        # Fast path for sorting by a single existing column
        if isinstance(by, str) and not more_by:
Member

I think we should just not hit the fast path if the user passes a list:

if isinstance(by, str) and isinstance(descending, bool) and not more_by:
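
To illustrate how that guard behaves, here is a standalone sketch. The function name takes_fast_path is hypothetical and only mirrors the condition above; it is not code from Polars:

```python
from __future__ import annotations

from typing import Any


def takes_fast_path(by: Any, descending: Any, more_by: tuple[Any, ...]) -> bool:
    """Mirror the suggested guard: scalar column, scalar flag, no extra columns."""
    return isinstance(by, str) and isinstance(descending, bool) and not more_by


# Scalar column with a scalar flag: the fast path applies.
scalar_case = takes_fast_path("age", False, ())
# Scalar column with a list flag: falls through to the general path.
list_case = takes_fast_path("age", [False], ())
# Extra columns also rule out the fast path.
multi_case = takes_fast_path("age", False, ("names",))
```

Under this condition a list for descending never reaches the branch that expects a single bool, so no unwrapping is needed there.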

@@ -629,3 +629,10 @@ def re_escape(s: str) -> str:
    # escapes _only_ those metachars with meaning to the rust regex crate
    re_rust_metachars = r"\\?()|\[\]{}^$#&~.+*-"
    return re.sub(f"([{re_rust_metachars}])", r"\\\1", s)


def try_head(seq: Sequence[Any] | Any, default: Any) -> Any:
Member

This isn't correct per se, as the length is not checked.

@khalidmammadov
Contributor Author

fixed
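
The body of try_head is not shown in this thread. As a rough guess at what a length-checked variant could look like, assuming it should return the only element of a one-element sequence and the default otherwise (the name try_head_checked and these semantics are assumptions):

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Any


def try_head_checked(seq: Sequence[Any] | Any, default: Any) -> Any:
    """Return seq[0] only for a one-element, non-string sequence (hypothetical)."""
    is_plain_sequence = isinstance(seq, Sequence) and not isinstance(seq, (str, bytes))
    # Only unwrap when the length is exactly 1, so longer lists are not truncated.
    if is_plain_sequence and len(seq) == 1:
        return seq[0]
    return default
```

Strings are excluded on purpose, since a str is itself a Sequence and would otherwise be unwrapped to its first character.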

Contributor

@eitsupi eitsupi left a comment

I think the title should be fix(python): ...

@khalidmammadov khalidmammadov changed the title fix: DataFrame descending sorting by single list element fix(python): DataFrame descending sorting by single list element Oct 18, 2024
@ritchie46 ritchie46 merged commit da8e37a into pola-rs:main Oct 18, 2024
14 of 15 checks passed
Labels
fix (Bug fix), python (Related to Python Polars), rust (Related to Rust Polars)
5 participants