Commit 37d191e

[MINOR][PYTHON] Specifying udf type in SCHEMA_MISMATCH_FOR_PANDAS_UDF error message
### What changes were proposed in this pull request?

This minor patch adds a `udf_type` parameter to the `SCHEMA_MISMATCH_FOR_PANDAS_UDF` error message.

### Why are the changes needed?

The `SCHEMA_MISMATCH_FOR_PANDAS_UDF` error is raised for both Pandas UDFs and Arrow batch UDFs, but the message always claims that `pandas_udf` produced the error. That can confuse users of Arrow batch UDFs when they see it.

### Does this PR introduce _any_ user-facing change?

Yes, the error message shown to the user changes.

### How was this patch tested?

Unit test.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50312 from viirya/fix_udf_error_message.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
1 parent ee3971f commit 37d191e

2 files changed: +3 -1 lines changed

python/pyspark/errors/error-conditions.json

Lines changed: 1 addition & 1 deletion
@@ -905,7 +905,7 @@
   },
   "SCHEMA_MISMATCH_FOR_PANDAS_UDF": {
     "message": [
-      "Result vector from pandas_udf was not the required length: expected <expected>, got <actual>."
+      "Result vector from <udf_type> was not the required length: expected <expected>, got <actual>."
     ]
   },
   "SESSION_ALREADY_EXIST": {
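
To illustrate what the new placeholder changes for the user, here is a minimal sketch of how the updated template could be filled in. The `render_message` helper is hypothetical and written only for this example; it is not PySpark's actual message formatting code, and the values `3` and `2` are made up.

```python
import re

# Template text copied from the updated error-conditions.json entry.
TEMPLATE = (
    "Result vector from <udf_type> was not the required length: "
    "expected <expected>, got <actual>."
)


def render_message(template, params):
    """Hypothetical helper: replace each <name> placeholder with params[name].

    Raises KeyError if the template names a parameter that was not supplied.
    """
    return re.sub(r"<(\w+)>", lambda match: params[match.group(1)], template)


# The same template now yields a message that names the UDF flavor involved.
pandas_msg = render_message(
    TEMPLATE, {"udf_type": "pandas_udf", "expected": "3", "actual": "2"}
)
arrow_msg = render_message(
    TEMPLATE, {"udf_type": "arrow_batch_udf", "expected": "3", "actual": "2"}
)
print(pandas_msg)
print(arrow_msg)
```

In this sketch, a caller that forgets to pass `udf_type` gets a `KeyError` instead of a message with an unfilled placeholder, which is why both call sites in `worker.py` need the new parameter.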

python/pyspark/worker.py

Lines changed: 2 additions & 0 deletions
@@ -141,6 +141,7 @@ def verify_result_length(result, length):
             raise PySparkRuntimeError(
                 errorClass="SCHEMA_MISMATCH_FOR_PANDAS_UDF",
                 messageParameters={
+                    "udf_type": "pandas_udf",
                     "expected": str(length),
                     "actual": str(len(result)),
                 },
@@ -213,6 +214,7 @@ def verify_result_length(result, length):
             raise PySparkRuntimeError(
                 errorClass="SCHEMA_MISMATCH_FOR_PANDAS_UDF",
                 messageParameters={
+                    "udf_type": "arrow_batch_udf",
                     "expected": str(length),
                     "actual": str(len(result)),
                 },
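
The two hunks above differ only in the `udf_type` value they pass. The sketch below shows that pattern in isolation so it runs without PySpark. `SketchRuntimeError` is a stand-in for `PySparkRuntimeError`, `make_length_verifier` is a hypothetical factory that is not part of this patch, and the `len(result) != length` condition is an assumption, since the diff does not show the surrounding check.

```python
class SketchRuntimeError(RuntimeError):
    """Stand-in for PySparkRuntimeError so the sketch runs without PySpark."""

    def __init__(self, errorClass, messageParameters):
        self.errorClass = errorClass
        self.messageParameters = messageParameters
        super().__init__(f"[{errorClass}] {messageParameters}")


def make_length_verifier(udf_type):
    """Hypothetical factory: build a length check that reports udf_type on failure."""

    def verify_result_length(result, length):
        # Assumed condition: the diff only shows the raise, not the check itself.
        if len(result) != length:
            raise SketchRuntimeError(
                errorClass="SCHEMA_MISMATCH_FOR_PANDAS_UDF",
                messageParameters={
                    "udf_type": udf_type,
                    "expected": str(length),
                    "actual": str(len(result)),
                },
            )
        return result

    return verify_result_length


# One verifier per UDF flavor, mirroring the two call sites in the patch.
verify_pandas = make_length_verifier("pandas_udf")
verify_arrow = make_length_verifier("arrow_batch_udf")
```

The patch itself keeps two separate `verify_result_length` definitions and hard-codes the string in each; the factory here is only a compact way to show that the error class stays the same while the reported UDF type varies.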
