Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Implement Column.{when, otherwise} and Function when

Why are the changes needed?

For API coverage

Does this PR introduce any user-facing change?

yes

How was this patch tested?

added UT

@zhengruifeng zhengruifeng force-pushed the connect_column_case_when branch 2 times, most recently from 0f0b5c9 to d210b4c Compare December 6, 2022 12:22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[error] /home/runner/work/spark/spark/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:594:13: type mismatch;
[error] found : scala.collection.mutable.Buffer[(org.apache.spark.sql.catalyst.expressions.Expression, org.apache.spark.sql.catalyst.expressions.Expression)]
[error] required: Seq[(org.apache.spark.sql.catalyst.expressions.Expression, org.apache.spark.sql.catalyst.expressions.Expression)]
[error] .map(b => (transformExpression(b.getCondition), transformExpression(b.getValue))),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guessing you need a .toSeq?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this an unresolved_function? All it takes is that we add a constructor to CaseWhen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you can rewrite this into a series of ifs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm for unresolved_function, how to deal with the optional else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the the number of expressions is uneven, the last expression is the else expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use this way then have to document it clear for even/odd assumption on the server side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we do. I am just trying to see if we can keep the number of expressions in the proto to a bare minimum. In this case you can get away with it because it sort of makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether unresolved_function when can support otherwise, let me take a deeper look

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise is the elseValue in CaseWhen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, let me have a try, thanks

@zhengruifeng zhengruifeng force-pushed the connect_column_case_when branch from d210b4c to 110d7e7 Compare December 8, 2022 09:34
@zhengruifeng
Copy link
Contributor Author

close this one in favor of #38956

@zhengruifeng zhengruifeng deleted the connect_column_case_when branch December 16, 2022 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants