Commit 11b07e0

[SPARK-28344][SQL] detect ambiguous self-join and fail the query
This is an alternative solution to apache#24442. It fails the query if an ambiguous self-join is detected, instead of trying to disambiguate it. The problem is that it's hard to come up with a reasonable rule to disambiguate; the rule proposed by apache#24442 is mostly a heuristic.

This is a long-standing bug and I've seen many people complaining about it on JIRA and the dev list. A typical example:

```
val df1 = …
val df2 = df1.filter(...)
df1.join(df2, df1("a") > df2("a")) // returns empty result
```

The root cause is that `Dataset.apply` is so powerful that users think it returns a column reference which can point to the column of the Dataset anywhere. This is not true in many cases. `Dataset.apply` returns an `AttributeReference`, and different Datasets may share the same `AttributeReference`. In the example above, `df2` adds a Filter operator above the logical plan of `df1`, and the Filter operator preserves the output `AttributeReference` of its child. This means `df1("a")` is exactly the same as `df2("a")`, so `df1("a") > df2("a")` always evaluates to false.

We can reuse the infra in apache#24442:
1. each Dataset has a globally unique id.
2. the `AttributeReference` returned by `Dataset.apply` carries the ID and column position (e.g. 3rd column of the Dataset) via metadata.
3. the logical plan of a `Dataset` carries the ID via `TreeNodeTag`.

When a self-join happens, the analyzer asks the right side plan of the join to re-generate its output attributes with new exprIds. Based on that, a simple rule to detect ambiguous self-joins is:
1. find all column references (i.e. `AttributeReference`s with Dataset ID and column position) in the root node of a query plan.
2. for each column reference, traverse the query plan tree and find a sub-plan that carries a Dataset ID equal to the one in the column reference.
3. get the corresponding output attribute of the sub-plan by the column position in the column reference.
4. if the corresponding output attribute has a different exprId than the column reference, then this sub-plan is on the right side of a self-join and has regenerated its output attributes. This is an ambiguous self-join, because the column reference points to a table being self-joined.

Tested with existing tests and new test cases.

Closes apache#25107 from cloud-fan/new-self-join.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 7cfac94 commit 11b07e0
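To make the new behavior concrete, here is a minimal sketch of the failure mode and the recommended workaround, assuming a local `SparkSession` (the exact exception text may differ):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val df1 = Seq(1, 2, 3).toDF("a")
val df2 = df1.filter($"a" > 1)

// Before this patch: silently returns an empty result, because df1("a")
// and df2("a") are the same AttributeReference, so a > a is never true.
// After this patch: throws an AnalysisException explaining the ambiguity.
// df1.join(df2, df1("a") > df2("a"))

// The recommended workaround: alias both sides and use qualified names.
df1.as("l").join(df2.as("r"), $"l.a" > $"r.a").show()
```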

File tree

12 files changed

+455 −101 lines changed

docs/sql-migration-guide-upgrade.md

Lines changed: 4 additions & 2 deletions
```diff
@@ -16,11 +16,13 @@ displayTitle: Spark SQL Upgrading Guide
 
 - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original permission and ACLs during re-creating the table/partition paths. To restore the behaviour of earlier versions, set `spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`.
 
-- Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+- Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+
+- Since Spark 2.4.5, Dataset query fails if it contains ambiguous column reference that is caused by self join. A typical example: `val df1 = ...; val df2 = df1.filter(...);`, then `df1.join(df2, df1("a") > df2("a"))` returns an empty result which is quite confusing. This is because Spark cannot resolve Dataset column references that point to tables being self joined, and `df1("a")` is exactly the same as `df2("a")` in Spark. To restore the behavior before Spark 3.0, you can set `spark.sql.analyzer.failAmbiguousSelfJoin` to `false`.
 
 ## Upgrading from Spark SQL 2.4.3 to 2.4.4
 
-- Since Spark 2.4.4, according to [MsSqlServer Guide](https://docs.microsoft.com/en-us/sql/connect/jdbc/using-basic-data-types?view=sql-server-2017), MsSQLServer JDBC Dialect uses ShortType and FloatType for SMALLINT and REAL, respectively. Previously, IntegerType and DoubleType is used.
+- Since Spark 2.4.4, according to [MsSqlServer Guide](https://docs.microsoft.com/en-us/sql/connect/jdbc/using-basic-data-types?view=sql-server-2017), MsSQLServer JDBC Dialect uses ShortType and FloatType for SMALLINT and REAL, respectively. Previously, IntegerType and DoubleType is used.
 
 ## Upgrading from Spark SQL 2.4 to 2.4.1
```
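For reference, restoring the pre-3.0 behavior described above is a one-line configuration change (the key is internal; `spark` here is assumed to be an active `SparkSession`):

```scala
// Opt out of the new check and fall back to the old (confusing) semantics.
spark.conf.set("spark.sql.analyzer.failAmbiguousSelfJoin", "false")
```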

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

Lines changed: 4 additions & 1 deletion
```diff
@@ -334,7 +334,8 @@ class Analyzer(
       gid: Expression): Expression = {
     expr transform {
       case e: GroupingID =>
-        if (e.groupByExprs.isEmpty || e.groupByExprs == groupByExprs) {
+        if (e.groupByExprs.isEmpty ||
+            e.groupByExprs.map(_.canonicalized) == groupByExprs.map(_.canonicalized)) {
           Alias(gid, toPrettySQL(e))()
         } else {
           throw new AnalysisException(
@@ -936,6 +937,8 @@ class Analyzer(
       // To resolve duplicate expression IDs for Join and Intersect
       case j @ Join(left, right, _, _) if !j.duplicateResolved =>
         j.copy(right = dedupRight(left, right))
+      // Intersect/Except will be rewritten to Join at the beginning of the optimizer. Here we
+      // need to deduplicate the right side plan, so that we won't produce an invalid self-join later.
       case i @ Intersect(left, right, _) if !i.duplicateResolved =>
         i.copy(right = dedupRight(left, right))
       case e @ Except(left, right, _) if !e.duplicateResolved =>
```
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 7 additions & 0 deletions
```diff
@@ -726,6 +726,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val FAIL_AMBIGUOUS_SELF_JOIN =
+    buildConf("spark.sql.analyzer.failAmbiguousSelfJoin")
+      .doc("When true, fail the Dataset query if it contains ambiguous self-join.")
+      .internal()
+      .booleanConf
+      .createWithDefault(true)
+
   // Whether to retain group by columns or not in GroupedData.agg.
   val DATAFRAME_RETAIN_GROUP_COLUMNS = buildConf("spark.sql.retainGroupColumns")
     .internal()
```

sql/core/src/main/scala/org/apache/spark/sql/Column.scala

Lines changed: 16 additions & 3 deletions
```diff
@@ -48,6 +48,15 @@ private[sql] object Column {
       case expr => toPrettySQL(expr)
     }
   }
+
+  private[sql] def stripColumnReferenceMetadata(a: AttributeReference): AttributeReference = {
+    val metadataWithoutId = new MetadataBuilder()
+      .withMetadata(a.metadata)
+      .remove(Dataset.DATASET_ID_KEY)
+      .remove(Dataset.COL_POS_KEY)
+      .build()
+    a.withMetadata(metadataWithoutId)
+  }
 }
@@ -141,11 +150,15 @@ class Column(val expr: Expression) extends Logging {
   override def toString: String = toPrettySQL(expr)
 
   override def equals(that: Any): Boolean = that match {
-    case that: Column => that.expr.equals(this.expr)
+    case that: Column => that.normalizedExpr() == this.normalizedExpr()
     case _ => false
   }
 
-  override def hashCode: Int = this.expr.hashCode()
+  override def hashCode: Int = this.normalizedExpr().hashCode()
+
+  private def normalizedExpr(): Expression = expr transform {
+    case a: AttributeReference => Column.stripColumnReferenceMetadata(a)
+  }
 
   /** Creates a column based on the given expression. */
   private def withExpr(newExpr: Expression): Column = new Column(newExpr)
@@ -1023,7 +1036,7 @@ class Column(val expr: Expression) extends Logging {
    * @since 2.0.0
    */
   def name(alias: String): Column = withExpr {
-    expr match {
+    normalizedExpr() match {
       case ne: NamedExpression => Alias(expr, alias)(explicitMetadata = Some(ne.metadata))
       case other => Alias(other, alias)()
     }
```
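Since `Dataset#col` now attaches per-Dataset metadata, two `Column`s over the same underlying attribute could otherwise stop comparing equal; `normalizedExpr()` strips that metadata before comparison so equality is unaffected. A small sketch of the intended behavior, assuming a local session with `spark.implicits._` imported:

```scala
val df = Seq(1, 2).toDF("a")
val df2 = df.filter($"a" > 0) // Filter preserves df's output AttributeReference

// The two references carry different __dataset_id metadata, but Column's
// equals/hashCode compare the metadata-stripped expressions, so they match.
assert(df("a") == df2("a"))
assert(df("a").hashCode == df2("a").hashCode)
```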

sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala

Lines changed: 36 additions & 4 deletions
```diff
@@ -45,12 +45,14 @@ import org.apache.spark.sql.catalyst.parser.{ParseException, ParserUtils}
 import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.catalyst.plans.logical._
 import org.apache.spark.sql.catalyst.plans.physical.{Partitioning, PartitioningCollection}
+import org.apache.spark.sql.catalyst.trees.TreeNodeTag
 import org.apache.spark.sql.execution._
 import org.apache.spark.sql.execution.arrow.{ArrowBatchStreamWriter, ArrowConverters}
 import org.apache.spark.sql.execution.command._
 import org.apache.spark.sql.execution.datasources.LogicalRelation
 import org.apache.spark.sql.execution.python.EvaluatePython
 import org.apache.spark.sql.execution.stat.StatFunctions
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.streaming.DataStreamWriter
 import org.apache.spark.sql.types._
 import org.apache.spark.sql.util.SchemaUtils
@@ -60,6 +62,11 @@ import org.apache.spark.unsafe.types.CalendarInterval
 import org.apache.spark.util.Utils
 
 private[sql] object Dataset {
+  val curId = new java.util.concurrent.atomic.AtomicLong()
+  val DATASET_ID_KEY = "__dataset_id"
+  val COL_POS_KEY = "__col_position"
+  val DATASET_ID_TAG = TreeNodeTag[Long]("dataset_id")
+
   def apply[T: Encoder](sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[T] = {
     val dataset = new Dataset(sparkSession, logicalPlan, implicitly[Encoder[T]])
     // Eagerly bind the encoder so we verify that the encoder matches the underlying
@@ -173,6 +180,9 @@ class Dataset[T] private[sql](
     encoder: Encoder[T])
   extends Serializable {
 
+  // A globally unique id of this Dataset.
+  private val id = Dataset.curId.getAndIncrement()
+
   queryExecution.assertAnalyzed()
 
   // Note for Spark contributors: if adding or updating any action in `Dataset`, please make sure
@@ -189,14 +199,18 @@ class Dataset[T] private[sql](
   @transient private[sql] val logicalPlan: LogicalPlan = {
     // For various commands (like DDL) and queries with side effects, we force query execution
     // to happen right away to let these side effects take place eagerly.
-    queryExecution.analyzed match {
+    val plan = queryExecution.analyzed match {
       case c: Command =>
         LocalRelation(c.output, withAction("command", queryExecution)(_.executeCollect()))
       case u @ Union(children) if children.forall(_.isInstanceOf[Command]) =>
         LocalRelation(u.output, withAction("command", queryExecution)(_.executeCollect()))
       case _ =>
         queryExecution.analyzed
     }
+    if (sparkSession.sessionState.conf.getConf(SQLConf.FAIL_AMBIGUOUS_SELF_JOIN)) {
+      plan.setTagValue(Dataset.DATASET_ID_TAG, id)
+    }
+    plan
   }
 
   /**
@@ -1271,11 +1285,29 @@ class Dataset[T] private[sql](
     if (sqlContext.conf.supportQuotedRegexColumnName) {
       colRegex(colName)
     } else {
-      val expr = resolve(colName)
-      Column(expr)
+      Column(addDataFrameIdToCol(resolve(colName)))
     }
   }
 
+  // Attach the dataset id and column position to the column reference, so that we can detect
+  // ambiguous self-joins correctly. See the rule `DetectAmbiguousSelfJoin`.
+  // This must be called before we return a `Column` that contains an `AttributeReference`.
+  // Note that the metadata added here is only available in the analyzer, as the analyzer rule
+  // `DetectAmbiguousSelfJoin` will remove it.
+  private def addDataFrameIdToCol(expr: NamedExpression): NamedExpression = {
+    val newExpr = expr transform {
+      case a: AttributeReference
+          if sparkSession.sessionState.conf.getConf(SQLConf.FAIL_AMBIGUOUS_SELF_JOIN) =>
+        val metadata = new MetadataBuilder()
+          .withMetadata(a.metadata)
+          .putLong(Dataset.DATASET_ID_KEY, id)
+          .putLong(Dataset.COL_POS_KEY, logicalPlan.output.indexWhere(a.semanticEquals))
+          .build()
+        a.withMetadata(metadata)
+    }
+    newExpr.asInstanceOf[NamedExpression]
+  }
+
   /**
    * Selects column based on the column name specified as a regex and returns it as [[Column]].
    * @group untypedrel
@@ -1289,7 +1321,7 @@ class Dataset[T] private[sql](
       case ParserUtils.qualifiedEscapedIdentifier(nameParts, columnNameRegex) =>
         Column(UnresolvedRegex(columnNameRegex, Some(nameParts), caseSensitive))
       case _ =>
-        Column(resolve(colName))
+        Column(addDataFrameIdToCol(resolve(colName)))
     }
   }
```
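The effect of `addDataFrameIdToCol` can be observed on the returned column's expression. A sketch for illustration only, since these metadata keys are internal and are removed again during analysis by `DetectAmbiguousSelfJoin`:

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference

val df = Seq((1, "x")).toDF("a", "b")
val ref = df("b").expr.asInstanceOf[AttributeReference]

// With spark.sql.analyzer.failAmbiguousSelfJoin enabled (the default), the
// reference records which Dataset produced it and the column's position.
println(ref.metadata.getLong("__dataset_id"))   // this Dataset's globally unique id
println(ref.metadata.getLong("__col_position")) // 1, since "b" is the 2nd column
```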

sql/core/src/main/scala/org/apache/spark/sql/execution/analysis/DetectAmbiguousSelfJoin.scala

Lines changed: 162 additions & 0 deletions (new file)

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.execution.analysis

import scala.collection.mutable

import org.apache.spark.sql.{AnalysisException, Column, Dataset}
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Cast, Equality, Expression, ExprId}
import org.apache.spark.sql.catalyst.plans.logical.{Join, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.internal.SQLConf

/**
 * Detects ambiguous self-joins, so that we can fail the query instead of returning confusing
 * results.
 *
 * A Dataset column reference is simply an [[AttributeReference]] that is returned by
 * `Dataset#col`. Most of the time we don't need to do anything special, as an
 * [[AttributeReference]] can point to the column precisely. However, in case of a self-join,
 * the analyzer generates [[AttributeReference]]s with new expr IDs for the right side plan of
 * the join. If a Dataset column reference points to a column in the right side plan of a
 * self-join, users will get unexpected results because the column reference can't match the
 * newly generated [[AttributeReference]]s.
 *
 * Note that this rule removes all the Dataset-id-related metadata from [[AttributeReference]]s,
 * so that it doesn't exist after the analyzer.
 */
class DetectAmbiguousSelfJoin(conf: SQLConf) extends Rule[LogicalPlan] {

  // A Dataset column reference is an `AttributeReference` with 2 special metadata entries.
  private def isColumnReference(a: AttributeReference): Boolean = {
    a.metadata.contains(Dataset.DATASET_ID_KEY) && a.metadata.contains(Dataset.COL_POS_KEY)
  }

  private case class ColumnReference(datasetId: Long, colPos: Int, exprId: ExprId)

  private def toColumnReference(a: AttributeReference): ColumnReference = {
    ColumnReference(
      a.metadata.getLong(Dataset.DATASET_ID_KEY),
      a.metadata.getLong(Dataset.COL_POS_KEY).toInt,
      a.exprId)
  }

  object LogicalPlanWithDatasetId {
    def unapply(p: LogicalPlan): Option[(LogicalPlan, Long)] = {
      p.getTagValue(Dataset.DATASET_ID_TAG).map(id => p -> id)
    }
  }

  object AttrWithCast {
    def unapply(expr: Expression): Option[AttributeReference] = expr match {
      case Cast(child, _, _) => unapply(child)
      case a: AttributeReference => Some(a)
      case _ => None
    }
  }

  override def apply(plan: LogicalPlan): LogicalPlan = {
    if (!conf.getConf(SQLConf.FAIL_AMBIGUOUS_SELF_JOIN)) return plan

    // We always remove the special metadata from `AttributeReference` at the end of this rule,
    // so Dataset column references only exist in the root node, via Dataset transformations
    // like `Dataset#select`.
    val colRefAttrs = plan.expressions.flatMap(_.collect {
      case a: AttributeReference if isColumnReference(a) => a
    })

    if (colRefAttrs.nonEmpty) {
      val colRefs = colRefAttrs.map(toColumnReference).distinct
      val ambiguousColRefs = mutable.HashSet.empty[ColumnReference]
      val dsIdSet = colRefs.map(_.datasetId).toSet

      plan.foreach {
        case LogicalPlanWithDatasetId(p, id) if dsIdSet.contains(id) =>
          colRefs.foreach { ref =>
            if (id == ref.datasetId) {
              if (ref.colPos < 0 || ref.colPos >= p.output.length) {
                throw new IllegalStateException("[BUG] Hit an invalid Dataset column reference: " +
                  s"$ref. Please open a JIRA ticket to report it.")
              } else {
                // When a self-join happens, the analyzer asks the right side plan to generate
                // attributes with new exprIds. If a plan of a Dataset outputs an attribute which
                // is referred to by a column reference, and this attribute has a different exprId
                // than the attribute of the column reference, then the column reference is
                // ambiguous, as it refers to a column that gets regenerated by self-join.
                val actualAttr = p.output(ref.colPos).asInstanceOf[AttributeReference]
                if (actualAttr.exprId != ref.exprId) {
                  ambiguousColRefs += ref
                }
              }
            }
          }

        case _ =>
      }

      val ambiguousAttrs: Seq[AttributeReference] = plan match {
        case Join(
            LogicalPlanWithDatasetId(_, leftId),
            LogicalPlanWithDatasetId(_, rightId),
            _, condition, _) =>
          // If we are dealing with a root Join node, we need to take care of SPARK-6231:
          // 1. We can disambiguate `df("col") === df("col")` in the join condition.
          // 2. There is no ambiguity in a direct self-join like
          //    `df.join(df, df("col") === 1)`, because it doesn't matter which side the
          //    column comes from.
          def getAmbiguousAttrs(expr: Expression): Seq[AttributeReference] = expr match {
            case Equality(AttrWithCast(a), AttrWithCast(b)) if a.sameRef(b) =>
              Nil
            case Equality(AttrWithCast(a), b) if leftId == rightId && b.foldable =>
              Nil
            case Equality(a, AttrWithCast(b)) if leftId == rightId && a.foldable =>
              Nil
            case a: AttributeReference =>
              if (isColumnReference(a)) {
                val colRef = toColumnReference(a)
                if (ambiguousColRefs.contains(colRef)) Seq(a) else Nil
              } else {
                Nil
              }
            case _ => expr.children.flatMap(getAmbiguousAttrs)
          }
          condition.toSeq.flatMap(getAmbiguousAttrs)

        case _ => ambiguousColRefs.toSeq.map { ref =>
          colRefAttrs.find(attr => toColumnReference(attr) == ref).get
        }
      }

      if (ambiguousAttrs.nonEmpty) {
        throw new AnalysisException(s"Column ${ambiguousAttrs.mkString(", ")} are ambiguous. " +
          "It's probably because you joined several Datasets together, and some of these " +
          "Datasets are the same. This column points to one of the Datasets but Spark is unable " +
          "to figure out which one. Please alias the Datasets with different names via " +
          "`Dataset.as` before joining them, and specify the column using qualified name, e.g. " +
          """`df.as("a").join(df.as("b"), $"a.id" > $"b.id")`. You can also set """ +
          s"${SQLConf.FAIL_AMBIGUOUS_SELF_JOIN.key} to false to disable this check.")
      }
    }

    plan.transformExpressions {
      case a: AttributeReference if isColumnReference(a) =>
        // Remove the special metadata from this `AttributeReference`, as the detection is done.
        Column.stripColumnReferenceMetadata(a)
    }
  }
}
```
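Conversely, the SPARK-6231 special cases above keep direct self-joins working. A hedged sketch of what the rule still permits, assuming an active `SparkSession` named `spark`:

```scala
val df = spark.range(5).toDF("id")

// Allowed: both sides of the equality are the same reference, so the rule
// treats it as an equi-join of the left and right copies (case 1 above).
df.join(df, df("id") === df("id")).show()

// Allowed: an equality against a foldable literal is unambiguous, because
// it doesn't matter which side of the direct self-join the column comes from.
df.join(df, df("id") === 1).show()
```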

sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala

Lines changed: 3 additions & 1 deletion
```diff
@@ -26,6 +26,7 @@ import org.apache.spark.sql.catalyst.parser.ParserInterface
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution.{QueryExecution, SparkOptimizer, SparkPlanner, SparkSqlParser}
+import org.apache.spark.sql.execution.analysis.DetectAmbiguousSelfJoin
 import org.apache.spark.sql.execution.datasources._
 import org.apache.spark.sql.streaming.StreamingQueryManager
 import org.apache.spark.sql.util.ExecutionListenerManager
@@ -161,7 +162,8 @@ abstract class BaseSessionStateBuilder(
     customResolutionRules
 
   override val postHocResolutionRules: Seq[Rule[LogicalPlan]] =
-    PreprocessTableCreation(session) +:
+    new DetectAmbiguousSelfJoin(conf) +:
+      PreprocessTableCreation(session) +:
       PreprocessTableInsertion(conf) +:
       DataSourceAnalysis(conf) +:
       customPostHocResolutionRules
```

sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala

Lines changed: 15 additions & 0 deletions
```diff
@@ -167,6 +167,21 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
       Row(null, null, 1, 1, 3) :: Nil
     )
 
+    // use column reference in `grouping_id` instead of column name
+    checkAnswer(
+      courseSales.cube("course", "year")
+        .agg(grouping_id(courseSales("course"), courseSales("year"))),
+      Row("Java", 2012, 0) ::
+        Row("Java", 2013, 0) ::
+        Row("Java", null, 1) ::
+        Row("dotNET", 2012, 0) ::
+        Row("dotNET", 2013, 0) ::
+        Row("dotNET", null, 1) ::
+        Row(null, 2012, 2) ::
+        Row(null, 2013, 2) ::
+        Row(null, null, 3) :: Nil
+    )
+
     intercept[AnalysisException] {
       courseSales.groupBy().agg(grouping("course")).explain()
     }
```