[SPARK-49902][SQL] Catch underlying runtime errors in RegExpReplace #48379

Open
wants to merge 10 commits into
base: master

harshmotw-db
Contributor

What changes were proposed in this pull request?

Previously, the RegExpReplace expression did not catch runtime errors raised by underlying libraries, so they were thrown directly to the user. For example, it was not uncommon to see errors such as java.lang.IndexOutOfBoundsException: No group 3. This PR catches these underlying errors and throws a SparkException instead, which reports the input on which the expression failed. The new exception looks something like org.apache.spark.SparkException: Could not perform regexp_replace for source = <source>, pattern = <pattern>, replacement = <replacement> and position = <position>.
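
To illustrate the idea outside Spark, here is a minimal sketch in plain Java. It is not the code in this PR: the class `RegexReplaceSketch`, the helper `replaceOrExplain`, and the use of `IllegalStateException` as a stand-in for `SparkException` are all assumptions made for illustration, and the 1-based handling of `position` is a simplification. It shows how a replacement string that references a missing capture group fails inside `java.util.regex`, and how wrapping the call lets the error message carry the inputs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Spark's implementation.
final class RegexReplaceSketch {

    // Replaces every match of `pattern` in `source`, starting at the 1-based
    // `position`. Any runtime error from java.util.regex is wrapped in an
    // exception whose message records the inputs (stand-in for SparkException).
    static String replaceOrExplain(String source, String pattern,
                                   String replacement, int position) {
        try {
            Matcher m = Pattern.compile(pattern).matcher(source);
            // Restrict matching to the part of the input at or after `position`.
            m.region(position - 1, source.length());
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                // Fails at runtime if `replacement` references a missing group.
                m.appendReplacement(out, replacement);
            }
            m.appendTail(out);
            return out.toString();
        } catch (RuntimeException e) {
            throw new IllegalStateException(String.format(
                "Could not perform regexp_replace for source = %s, pattern = %s, "
                    + "replacement = %s and position = %d.",
                source, pattern, replacement, position), e);
        }
    }

    public static void main(String[] args) {
        // A well-formed call behaves like an ordinary replace.
        System.out.println(replaceOrExplain("abc", "b", "X", 1)); // prints "aXc"

        // "$3" refers to a third group, but the pattern defines only two.
        try {
            replaceOrExplain("ab", "(a)(b)", "$3", 1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
            System.out.println("caused by: " + e.getCause());
        }
    }
}
```

Keeping the original exception as the cause preserves the library's stack trace, while the outer message identifies the offending input.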

Why are the changes needed?

Two reasons. First, the new exception reports the input of the row on which the error occurred, which makes it easier for users to debug their queries and for Spark developers to identify bugs. Second, a SparkException is generally considered expected behavior, signaling a handled error rather than an unintended issue in the query's execution.

Does this PR introduce any user-facing change?

Yes. When RegExpReplace fails, a SparkException that details the failing input is thrown instead of the underlying library's exception.

How was this patch tested?

Unit tests in both codegen and interpreted modes.

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot removed the CONNECT label on Oct 8, 2024
harshmotw-db marked this pull request as ready for review on October 8, 2024 02:07
@harshmotw-db
Contributor Author

@cloud-fan Can you look at this PR? Thanks!
