Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support IGNORE NULLS for NTH_VALUE window function #9625

Merged
merged 1 commit into from
Mar 18, 2024

Conversation

huaxingao
Copy link
Contributor

Which issue does this PR close?

Closes #.
Related #9055

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt) labels Mar 15, 2024
@@ -249,7 +249,6 @@ fn create_built_in_window_expr(
.clone()
.try_into()
.map_err(|e| DataFusionError::Execution(format!("{e:?}")))?;
let n: u32 = n as u32;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we want to support negative index in NTH_VALUE or not. If we want to support, then this line needs to be removed. Otherwise, we probably want to add a check to explicitly block negative index.

Copy link
Contributor

@mustafasrepo mustafasrepo Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently negative index is not supported from an sql query. This means that we reject following kind of queries

SELECT nth_VALUE(column1, -2) OVER(ORDER BY column2 DESC  rows between unbounded preceding and current row) FROM t;

This behaviour is similar to Postgres behaviour. However, internally we support negative index. Internal support enables us to rewrite some window expressions by taking their reverse if it helps to remove a SortEXec from the plan.
Consider following query

SELECT nth_VALUE(column1, 2) OVER(ORDER BY column2 DESC  rows between unbounded preceding and current row) FROM t;

where it is known that table t is ordered by column2 ASC. We can treat query above as below internally

SELECT nth_VALUE(column1, -2) OVER(ORDER BY column2 ASC  rows between current row and unbounded following) FROM t;

which is the same in terms of end result.
In the second version, the ordering requirement for the window function is already satisfied.
In short, we have internal mechanism, to handle negative index. However, it is not exposed to outside users since it is not standart. However, I think there is no harm in having this support. If others agree, we can continue with the changes in this PR.

Copy link
Contributor

@mustafasrepo mustafasrepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is LGTM!. Thanks @huaxingao for this PR.

@Dandandan Dandandan merged commit 40bf0ea into apache:main Mar 18, 2024
23 checks passed
@huaxingao
Copy link
Contributor Author

Thanks a lot @Dandandan @mustafasrepo for reviewing and merging the PR!

@huaxingao huaxingao deleted the nth_value branch March 18, 2024 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants