Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

@zhengruifeng zhengruifeng commented Dec 7, 2022

What changes were proposed in this pull request?

1, Implement Column.{when, otherwise} and Function when

Why are the changes needed?

For API coverage

Does this PR introduce any user-facing change?

yes

How was this patch tested?

added UT

@zhengruifeng zhengruifeng force-pushed the connect_column_case_when_with_function branch from 6d6b17c to 467b524 Compare December 7, 2022 10:15
@zhengruifeng
Copy link
Contributor Author

it keep failing in the invocation of constructor, I print the constructor as well as expressions in the error message, and they looks fine

grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "[FAILED_FUNCTION_CALL] Failed preparing of the function `when_1, constructor:public org_apache_spark_sql_catalyst_expressions_CaseWhen(scala_collection_Seq), exprs:class org_apache_spark_sql_catalyst_expressions_EqualTo:(cast(a#6 as bigint) = 0) , class org_apache_spark_sql_catalyst_expressions_Literal:1_0 , class org_apache_spark_sql_catalyst_expressions_Literal:2_0, error:null` for call. Please, double check function's arguments."
        debug_error_string = "{"created":"@1670408082.289473000","description":"Error received from peer ipv6:[::1]:15002","file":"src/core/lib/surface/call.cc","file_line":1064,"grpc_message":"[FAILED_FUNCTION_CALL] Failed preparing of the function `when_1, constructor:public org_apache_spark_sql_catalyst_expressions_CaseWhen(scala_collection_Seq), exprs:class org_apache_spark_sql_catalyst_expressions_EqualTo:(cast(a#6 as bigint) = 0) , class org_apache_spark_sql_catalyst_expressions_Literal:1_0 , class org_apache_spark_sql_catalyst_expressions_Literal:2_0, error:null` for call. Please, double check function's arguments.","grpc_status":2}"

I also check in the shell:

scala> import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions._

scala> val casewhen = CaseWhen(Seq.empty, None)
casewhen: org.apache.spark.sql.catalyst.expressions.CaseWhen = CASE END

scala> val varargCtor = casewhen.getClass.getConstructors().find(_.getParameterTypes.toSeq == Seq(classOf[Seq[_]]))
varargCtor: Option[java.lang.reflect.Constructor[_]] = Some(public org.apache.spark.sql.catalyst.expressions.CaseWhen(scala.collection.Seq))

scala> val expressions = Seq((col("a").cast("long") === 0).expr, lit(1.0).expr, lit(2.0).expr)
expressions: Seq[org.apache.spark.sql.catalyst.expressions.Expression] = List((cast('a as bigint) = 0), 1.0, 2.0)

scala> varargCtor.get.newInstance(expressions)
res10: Any = CASE WHEN (cast('a as bigint) = 0) THEN 1.0 ELSE 2.0 END

@zhengruifeng
Copy link
Contributor Author

@hvanhovell I add the new constructor, but it kept failing in the invocation of constructor. I manually check the constructor and it seems fine. Do you have any ideas about it?

@hvanhovell
Copy link
Contributor

@zhengruifeng let's check how the constructor resolution works in the FunctionRegistry. Alternatively we can use ExpressionBuilder for when.

@zhengruifeng zhengruifeng force-pushed the connect_column_case_when_with_function branch from 467b524 to 8317aa0 Compare December 13, 2022 11:08
@zhengruifeng zhengruifeng changed the title [DO_NOT_MERGE] Implement Column.{when, otherwise} and Function when with UnresolvedFunction [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction Dec 13, 2022
@zhengruifeng zhengruifeng marked this pull request as ready for review December 13, 2022 11:10
@zhengruifeng zhengruifeng changed the title [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction [SPARK-41319][CONNECT][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction Dec 13, 2022
@zhengruifeng
Copy link
Contributor Author

Fixed, the root cause is the previous constructor copied from https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L388-L395 failed due to MatchError ...

@zhengruifeng
Copy link
Contributor Author

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need an additional test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me see

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the CaseWhen.createFromParser? You will need to register an ExpressionBuilder for this. The upside of this is that it is less messy in the constructor, and that there is ample testing for this from the parser.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvanhovell what about we create CaseWhen in the connect planner , then we don't need to change sql side at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to keep the connect planner as generic as possible. If we can use an unresolved function then it is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I mean we still use an unresolved function, and we can call CaseWhen.createFromParser in connect planner

Copy link
Contributor

@grundprinzip grundprinzip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createFromParser was never used

make this change, since it causes MatchError

@zhengruifeng
Copy link
Contributor Author

Copy link
Contributor

@amaliujia amaliujia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

fix lint

try

addition constructor

debug

fix constructor

fix lint

build CaseWhen in connect planner
@zhengruifeng zhengruifeng force-pushed the connect_column_case_when_with_function branch from abe49da to 0e9ecb5 Compare December 16, 2022 01:52
@zhengruifeng zhengruifeng deleted the connect_column_case_when_with_function branch December 16, 2022 04:26
@zhengruifeng
Copy link
Contributor Author

merged into master, thanks for the reviews

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants