[SPARK-53273][CONNECT][SQL] Make RegisterUserDefinedFunction in SparkConnectPlanner side effect free #52026

Open
wants to merge 1 commit into
base: master

@heyihong heyihong commented Aug 14, 2025

What changes were proposed in this pull request?

This PR refactors the handling of RegisterUserDefinedFunction in SparkConnectPlanner to make it side-effect free. Instead of executing the registration directly, the planner now transforms the request into a logical plan. The key changes are:

  1. Created new command classes for UDF registration:

    • RegisterJavaUDAFCommand - for Java UDAF registration
    • RegisterJavaUDFCommand - for Java UDF registration
    • RegisterPythonUDFCommand - for Python UDF registration
    • RegisterScalaUDFCommand - for Scala UDF registration
  2. Refactored SparkConnectPlanner:

    • Modified transformCommand() to handle the REGISTER_FUNCTION case and return a logical plan transformer
    • Replaced handleRegisterUserDefinedFunction() with transformRegisterUserDefinedFunction(), which returns a LogicalPlan instead of executing side effects
    • Updated the processing flow to use the new command-based approach
    • Removed direct UDF registration calls that had side effects
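
For readers who want the shape of the change without opening the diff, here is a minimal sketch of the pattern described above. It is not the code in this PR. FunctionRegistry, Plan, RegisterUdfCommand, handleRegister and transformRegister are hypothetical, simplified stand-ins that do not depend on Spark. The sketch only assumes that the new command classes carry the registration details and perform the registration when they are executed, not when they are planned.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins; none of these are Spark's real classes.
object SideEffectFreePlanningSketch {

  // Stand-in for the session's function registry (the mutable state).
  final class FunctionRegistry {
    private val functions = mutable.Map.empty[String, Seq[Any] => Any]
    def register(name: String, f: Seq[Any] => Any): Unit = functions.update(name, f)
    def contains(name: String): Boolean = functions.contains(name)
  }

  // Stand-in for a logical plan node.
  sealed trait Plan

  // Stand-in for a registration command: it only describes the registration.
  // The registry is touched when run() is called, not when the plan is built.
  final case class RegisterUdfCommand(name: String, f: Seq[Any] => Any) extends Plan {
    def run(registry: FunctionRegistry): Unit = registry.register(name, f)
  }

  // "Before" style: the planner-side handler mutates the registry directly.
  def handleRegister(name: String, f: Seq[Any] => Any, registry: FunctionRegistry): Unit =
    registry.register(name, f)

  // "After" style: the planner returns a plan and has no side effects.
  def transformRegister(name: String, f: Seq[Any] => Any): Plan =
    RegisterUdfCommand(name, f)
}
```

In this sketch, calling transformRegister leaves the registry untouched until something later calls run on the returned command, so the planning step can be exercised in a test without any shared state.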

Why are the changes needed?

The current implementation of RegisterUserDefinedFunction in SparkConnectPlanner executes side effects directly during the planning phase, which violates the principle that planners should be side-effect free. This makes the code harder to test, reason about, and maintain.

Does this PR introduce any user-facing change?

No. This is an internal refactoring that preserves the existing external behavior; UDF registration works the same from the user's perspective.

How was this patch tested?

build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite"

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.4.3

@heyihong changed the title from "[SPARK-53273][CONNECT] Make RegisterUserDefinedFunction in SparkConnectPlanner side effect free" to "[SPARK-53273][CONNECT][SQL] Make RegisterUserDefinedFunction in SparkConnectPlanner side effect free" on Aug 14, 2025