[SPARK-28962][SQL][FOLLOW-UP] Add the parameter description for the Scala function API filter #27336

Closed
wants to merge 3 commits into from
15 changes: 15 additions & 0 deletions sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -3455,6 +3455,13 @@ object functions {

/**
* Returns an array of elements for which a predicate holds in a given array.
* {{{
* df.select(filter(col("s"), x => x % 2 === 0))
* df.selectExpr("filter(col, x -> x % 2 == 0)")
* }}}
*
* @param column: the input array column
* @param f: col => predicate, the boolean predicate to filter the input column
*
* @group collection_funcs
* @since 3.0.0
@@ -3465,6 +3472,14 @@ object functions {

/**
* Returns an array of elements for which a predicate holds in a given array.
* {{{
* df.select(filter(col("s"), (x, i) => i % 2 === 0))
* df.selectExpr("filter(col, (x, i) -> i % 2 == 0)")
* }}}
*
* @param column: the input array column
* @param f: (col, index) => predicate, the boolean predicate to filter the input column
* given the index. Indices start at 0.
Contributor

Is it consistent within Spark that the index parameter starts at 0 in higher-order functions? @ueshin @HyukjinKwon

Member

@ueshin ueshin Feb 5, 2020

ArrayTransform with an index argument starts at 0.
We might need to change it to start from 1 (with a legacy config and a migration guide)?

Contributor

What's the behavior of Presto?

Member

Actually, Presto's transform and filter don't take an index argument.

I remember we had a discussion about the index argument in the PR for zip_with_index (#21121 (comment)).
For filter, the index argument was added later in a separate change, but the context seems similar: https://issues.apache.org/jira/browse/SPARK-28962.

Contributor

It's too late to change this now; let's keep using 0.

*
* @group collection_funcs
* @since 3.0.0
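
For readers following the index discussion, here is a minimal plain-Scala sketch of what the two documented overloads are meant to do, with the index starting at 0 as agreed in the thread. It models the semantics on ordinary `Seq` values and is not Spark's implementation; `FilterIndexSketch`, `filterElems`, and `filterWithIndex` are hypothetical names used only for illustration.

```scala
// Plain-Scala model of the two `filter` overloads described in the Scaladoc.
// Assumption: these helpers only imitate the documented semantics; they are
// not part of Spark and do not operate on Columns.
object FilterIndexSketch {

  // Mirrors filter(column, x => predicate): keep elements whose value matches.
  def filterElems[A](xs: Seq[A])(p: A => Boolean): Seq[A] =
    xs.filter(p)

  // Mirrors filter(column, (x, i) => predicate): the predicate also receives
  // the element's position, and positions start at 0.
  def filterWithIndex[A](xs: Seq[A])(p: (A, Int) => Boolean): Seq[A] =
    xs.zipWithIndex.collect { case (x, i) if p(x, i) => x }

  def main(args: Array[String]): Unit = {
    val s = Seq(1, 2, 3, 4, 5)
    // Value-based predicate: keeps the even values 2 and 4.
    println(filterElems(s)(x => x % 2 == 0))
    // Index-based predicate: keeps positions 0, 2, 4, i.e. the values 1, 3, 5.
    println(filterWithIndex(s)((_, i) => i % 2 == 0))
  }
}
```

With a 1-based index, the second call would instead keep the values 2 and 4, which is why the Scaladoc states the starting index explicitly.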