
Commit 291647d

vladimirg-db authored and cloud-fan committed
[SPARK-49250][SQL] Improve error message for nested UnresolvedWindowExpression in CheckAnalysis
### What changes were proposed in this pull request?

When `CheckAnalysis` encounters `UnresolvedWindowExpression` in `Project` or `Aggregate`, it throws `QueryCompilationErrors.windowSpecificationNotDefinedError`:
- 4718d59c6c4
- 0b48d3f61b7

However, consider the following query:

`SELECT (SUM(col1) OVER(unspecified_window) / 1) FROM VALUES (1)`

Here `UnresolvedWindowExpression` is wrapped in the division expression, so `CheckAnalysis` throws a different, unrelated exception earlier, from this [case](https://github.com/apache/spark/blob/0b48d3f61b726209e96b0b967530534b5ad9101d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L308):

`org.apache.spark.sql.catalyst.analysis.UnresolvedException: [INTERNAL_ERROR] Invalid call to dataType on unresolved object SQLSTATE: XX000`

The solution is to match `UnresolvedWindowExpression` early.

### Why are the changes needed?

To improve the error message for incorrect window usage.

### Does this PR introduce _any_ user-facing change?

Yes, the error message is better now.

### How was this patch tested?

Added a unit test.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47775 from vladimirg-db/vladimirg-db/better-error-message-for-unresolved-window-expression-in-check-analysis.

Authored-by: Vladimir Golubev <vladimir.golubev@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent e63c398 commit 291647d

2 files changed: +27 -0 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala

Lines changed: 5 additions & 0 deletions
@@ -305,6 +305,11 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
               throw QueryCompilationErrors.invalidStarUsageError(operator.nodeName, Seq(s))
             }

+          // Should be before `e.checkInputDataTypes()` to produce the correct error for unknown
+          // window expressions nested inside other expressions
+          case UnresolvedWindowExpression(_, WindowSpecReference(windowName)) =>
+            throw QueryCompilationErrors.windowSpecificationNotDefinedError(windowName)
+
           case e: Expression if e.checkInputDataTypes().isFailure =>
             e.checkInputDataTypes() match {
               case checkRes: TypeCheckResult.DataTypeMismatch =>
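
The effect of the new case can be illustrated with a small standalone model. This is a sketch only: `Expr`, `Literal`, `UnresolvedWindow`, `Divide`, and `CheckOrderDemo` are hypothetical stand-ins rather than Spark classes, and it assumes the check visits child expressions before their parents. Without a dedicated case, the unresolved leaf passes silently and the parent's type check then fails with the internal error; with the case in place, the leaf is reported first.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
sealed trait Expr {
  def children: Seq[Expr]
  def dataType: String
  // By default a node has no input type constraints.
  def inputTypesOk: Boolean = true
}

final case class Literal(value: Int) extends Expr {
  def children: Seq[Expr] = Nil
  def dataType: String = "int"
}

// Stand-in for an unresolved window expression: asking for its type fails.
final case class UnresolvedWindow(windowName: String) extends Expr {
  def children: Seq[Expr] = Nil
  def dataType: String =
    throw new IllegalStateException("Invalid call to dataType on unresolved object")
}

final case class Divide(left: Expr, right: Expr) extends Expr {
  def children: Seq[Expr] = Seq(left, right)
  def dataType: String = left.dataType
  // The type check reads the children's types, so it throws on unresolved children.
  override def inputTypesOk: Boolean = left.dataType == right.dataType
}

object CheckOrderDemo {
  // Visits children first (an assumption of this sketch), then the node itself.
  def check(e: Expr, matchWindowFirst: Boolean): Unit = {
    e.children.foreach(check(_, matchWindowFirst))
    e match {
      // Dedicated case placed before the generic type check.
      case UnresolvedWindow(name) if matchWindowFirst =>
        throw new IllegalArgumentException(s"Window specification $name is not defined")
      case other if !other.inputTypesOk =>
        throw new IllegalArgumentException("data type mismatch")
      case _ => ()
    }
  }

  // Runs the check and renders the outcome as a string for easy comparison.
  def errorFor(e: Expr, matchWindowFirst: Boolean): String =
    try { check(e, matchWindowFirst); "ok" }
    catch { case ex: RuntimeException => s"${ex.getClass.getSimpleName}: ${ex.getMessage}" }

  def main(args: Array[String]): Unit = {
    val nested = Divide(UnresolvedWindow("unspecified_window"), Literal(1))
    println(errorFor(nested, matchWindowFirst = false))
    println(errorFor(nested, matchWindowFirst = true))
  }
}
```

In this model, disabling the dedicated case surfaces the `IllegalStateException` raised by `dataType` on the parent, while enabling it reports the missing window by name.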

sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

Lines changed: 22 additions & 0 deletions
@@ -4887,6 +4887,28 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       assert(relations.head.options == Map("key1" -> "1", "key2" -> "2"))
     }
   }
+
+  test(
+    "SPARK-49250: CheckAnalysis for UnresolvedWindowExpression must produce " +
+    "MISSING_WINDOW_SPECIFICATION error"
+  ) {
+    for (sqlText <- Seq(
+      "SELECT SUM(col1) OVER(unspecified_window) FROM VALUES (1)",
+      "SELECT SUM(col1) OVER(unspecified_window) FROM VALUES (1) GROUP BY col1",
+      "SELECT (SUM(col1) OVER(unspecified_window) / 1) FROM VALUES (1)"
+    )) {
+      checkError(
+        exception = intercept[AnalysisException](
+          sql(sqlText)
+        ),
+        errorClass = "MISSING_WINDOW_SPECIFICATION",
+        parameters = Map(
+          "windowName" -> "unspecified_window",
+          "docroot" -> SPARK_DOC_ROOT
+        )
+      )
+    }
+  }
 }

 case class Foo(bar: Option[String])
