Commit 07cbba6
[SPARK-48706][PYTHON] Python UDF in higher order functions should not throw internal error
### What changes were proposed in this pull request?
This PR fixes the error messages and classes when Python UDFs are used in higher order functions.
### Why are the changes needed?
To show the proper user-facing exceptions with error classes.
### Does this PR introduce _any_ user-facing change?
Yes, previously it threw internal error such as:
```python
from pyspark.sql.functions import transform, udf, col, array
spark.range(1).select(transform(array("id"), lambda x: udf(lambda y: y)(x))).collect()
```
Before:
```
py4j.protocol.Py4JJavaError: An error occurred while calling o74.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 15 in stage 0.0 failed 1 times, most recent failure: Lost task 15.0 in stage 0.0 (TID 15) (ip-192-168-123-103.ap-northeast-2.compute.internal executor driver): org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot evaluate expression: <lambda>(lambda x_0#3L)#2 SQLSTATE: XX000
at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
```
After:
```
pyspark.errors.exceptions.captured.AnalysisException: [INVALID_LAMBDA_FUNCTION_CALL.UNEVALUABLE] Invalid lambda function call. Python UDFs should be used in a lambda function at a higher order function. However, "<lambda>(lambda x_0#3L)" was a Python UDF. SQLSTATE: 42K0D;
Project [transform(array(id#0L), lambdafunction(<lambda>(lambda x_0#3L)#2, lambda x_0#3L, false)) AS transform(array(id), lambdafunction(<lambda>(lambda x_0#3L), namedlambdavariable()))#4]
+- Range (0, 1, step=1, splits=Some(16))
```
### How was this patch tested?
Unittest was added
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#47079 from HyukjinKwon/SPARK-48706.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>1 parent 169346c commit 07cbba6
File tree
3 files changed
+27
-2
lines changed- common/utils/src/main/resources/error
- sql
- catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis
- core/src/test/scala/org/apache/spark/sql/execution/python
3 files changed
+27
-2
lines changedLines changed: 5 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4482 | 4482 | | |
4483 | 4483 | | |
4484 | 4484 | | |
| 4485 | + | |
| 4486 | + | |
| 4487 | + | |
| 4488 | + | |
| 4489 | + | |
4485 | 4490 | | |
4486 | 4491 | | |
4487 | 4492 | | |
| |||
Lines changed: 8 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
254 | 254 | | |
255 | 255 | | |
256 | 256 | | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
257 | 265 | | |
258 | 266 | | |
259 | 267 | | |
| |||
Lines changed: 14 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | | - | |
21 | | - | |
| 20 | + | |
| 21 | + | |
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| |||
112 | 112 | | |
113 | 113 | | |
114 | 114 | | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
115 | 127 | | |
0 commit comments