Commit 1de272f

yangjie authored and maropu committed
[SPARK-32762][SQL][TEST] Enhance the verification of ExpressionsSchemaSuite to sql-expression-schema.md
### What changes were proposed in this pull request?

`sql-expression-schema.md` is automatically generated by `ExpressionsSchemaSuite`, but `ExpressionsSchemaSuite` only checks the expression entries. If the file is modified manually, the suite therefore does not always guarantee that its content is correct. For example, [Spark-24884](#27507) added support for the `regexp_extract_all` expression and manually modified `sql-expression-schema.md` without updating `Number of queries`, which left the file content inconsistent.

This PR adds the following checks to `ExpressionsSchemaSuite` to strengthen the correctness guarantee for `sql-expression-schema.md`:

- `Number of queries` should equal the size of `expressions entries` in `sql-expression-schema.md`
- `Number of expressions that missing example` should equal the size of `Expressions missing examples` in `sql-expression-schema.md`
- `MissExamples` from the test case should be the same as `expectedMissingExamples` from `sql-expression-schema.md`

### Why are the changes needed?

Ensure the correctness of the `sql-expression-schema.md` content.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Enhanced `ExpressionsSchemaSuite`.

Closes #29608 from LuciferYang/sql-expression-schema.

Authored-by: yangjie <yangjie@MacintoshdeMacBook-Pro.local>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
1 parent f1f7ae4 commit 1de272f

2 files changed: +27 -4 lines changed

sql/core/src/test/resources/sql-functions/sql-expression-schema.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 <!-- Automatically generated by ExpressionsSchemaSuite -->
 ## Summary
-- Number of queries: 338
+- Number of queries: 339
 - Number of expressions that missing example: 34
 - Expressions missing examples: and,string,tinyint,double,smallint,date,decimal,boolean,float,binary,bigint,int,timestamp,struct,cume_dist,dense_rank,input_file_block_length,input_file_block_start,input_file_name,lag,lead,monotonically_increasing_id,ntile,!,not,or,percent_rank,rank,row_number,spark_partition_id,version,window,positive,count_min_sketch
 ## Schema of Built-in Functions
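The new checks in `ExpressionsSchemaSuite` read the three summary lines of this header by position and take the text after the first colon. The following is a minimal sketch of that parsing against a shortened, made-up header; `HeaderParseSketch`, `sampleHeader`, and `valueOf` are hypothetical names used only for illustration and are not part of the suite.

```scala
object HeaderParseSketch {
  // Shortened stand-in for the first lines of sql-expression-schema.md
  // (the counts and names here are invented for the example).
  val sampleHeader: Seq[String] = Seq(
    "<!-- Automatically generated by ExpressionsSchemaSuite -->",
    "## Summary",
    "- Number of queries: 3",
    "- Number of expressions that missing example: 2",
    "- Expressions missing examples: and,string",
    "## Schema of Built-in Functions")

  // Text after the first colon, trimmed. This assumes the line contains
  // a colon; a line without one would throw on index 1.
  def valueOf(line: String): String = line.split(":")(1).trim

  def main(args: Array[String]): Unit = {
    val numberOfQueries = valueOf(sampleHeader(2)).toInt
    val numberOfMissingExamples = valueOf(sampleHeader(3)).toInt
    val expectedMissingExamples = valueOf(sampleHeader(4)).split(",")

    // The declared count must agree with the length of the listed names.
    assert(numberOfMissingExamples == expectedMissingExamples.length)
    println(s"queries=$numberOfQueries missing=${expectedMissingExamples.mkString(",")}")
  }
}
```

Because the lookup is positional, this approach depends on the header keeping the same line order; a manual edit that drops or reorders a summary line would surface as a parse failure rather than a count mismatch.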

sql/core/src/test/scala/org/apache/spark/sql/ExpressionsSchemaSuite.scala

Lines changed: 26 additions & 3 deletions
@@ -152,23 +152,38 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession {

     val outputSize = outputs.size
     val headerSize = header.size
-    val expectedOutputs: Seq[QueryOutput] = {
+    val (expectedMissingExamples, expectedOutputs) = {
       val expectedGoldenOutput = fileToString(resultFile)
       val lines = expectedGoldenOutput.split("\n")
       val expectedSize = lines.size

       assert(expectedSize == outputSize + headerSize,
         s"Expected $expectedSize blocks in result file but got " +
-          s"${outputSize + headerSize}. Try regenerate the result files.")
+          s"${outputSize + headerSize}. Try regenerating the result files.")

-      Seq.tabulate(outputSize) { i =>
+      val numberOfQueries = lines(2).split(":")(1).trim.toInt
+      val expectedOutputs = Seq.tabulate(outputSize) { i =>
         val segments = lines(i + headerSize).split('|')
         QueryOutput(
           className = segments(1).trim,
           funcName = segments(2).trim,
           sql = segments(3).trim,
           schema = segments(4).trim)
       }
+
+      assert(numberOfQueries == expectedOutputs.size,
+        s"expected outputs size: ${expectedOutputs.size} not same as numberOfQueries: " +
+          s"$numberOfQueries record in result file. Try regenerating the result files.")
+
+      val numberOfMissingExamples = lines(3).split(":")(1).trim.toInt
+      val expectedMissingExamples = lines(4).split(":")(1).trim.split(",")
+
+      assert(numberOfMissingExamples == expectedMissingExamples.size,
+        s"expected missing examples size: ${expectedMissingExamples.size} not same as " +
+          s"numberOfMissingExamples: $numberOfMissingExamples " +
+          "record in result file. Try regenerating the result files.")
+
+      (expectedMissingExamples, expectedOutputs)
     }

     // Compare results.
@@ -179,5 +194,13 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession {
       assert(expected.sql == output.sql, "SQL query did not match")
       assert(expected.schema == output.schema, s"Schema did not match for query ${expected.sql}")
     }
+
+    // Compare expressions missing examples
+    assert(expectedMissingExamples.length == missingExamples.size,
+      "The number of missing examples not equals the number of expected missing examples.")
+
+    missingExamples.zip(expectedMissingExamples).foreach { case (output, expected) =>
+      assert(expected == output, "Missing example expression not match")
+    }
   }
 }
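The last hunk asserts equal lengths before comparing pairs. One likely reason is that Scala's `zip` stops at the shorter collection, so a pairwise check alone would not notice a trailing extra entry. The sketch below illustrates this with a hypothetical helper, `ZipComparisonSketch.compare`, which is not part of the suite.

```scala
object ZipComparisonSketch {
  // Returns a description of the first problem found, or None when the
  // two lists agree in both length and order.
  def compare(actual: Seq[String], expected: Seq[String]): Option[String] =
    if (actual.size != expected.size) {
      Some(s"size mismatch: ${actual.size} vs ${expected.size}")
    } else {
      // Pairwise comparison is order-sensitive, like the zip in the diff.
      actual.zip(expected).collectFirst {
        case (a, e) if a != e => s"mismatch: got $a, expected $e"
      }
    }

  def main(args: Array[String]): Unit = {
    val actual = Seq("and", "string")
    val expected = Seq("and", "string", "tinyint")
    // zip alone yields only two pairs, both equal, so the extra entry is invisible.
    println(actual.zip(expected).forall { case (a, e) => a == e })
    // The explicit size check reports the discrepancy.
    println(compare(actual, expected))
  }
}
```

The same sketch shows that the comparison depends on ordering: two lists with identical members in a different order are reported as a mismatch.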
