-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements on Temporary Views #15054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test build #65240 has finished for PR 15054 at commit
|
Test build #65299 has finished for PR 15054 at commit
|
cc @cloud-fan @yhuai |
@@ -91,6 +91,9 @@ case class AnalyzeTableCommand(tableName: String, noscan: Boolean = true) extend | |||
case logicalRel: LogicalRelation if logicalRel.catalogTable.isDefined => | |||
updateTableStats(logicalRel.catalogTable.get, logicalRel.relation.sizeInBytes) | |||
|
|||
case o if !o.isInstanceOf[LeafNode] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better fix is: when do loopupRelation
at L40, we always pass in a table identifier with database, so that it won't search the temp views.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah! I like it!
@@ -913,7 +913,7 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach { | |||
withTempView("show1a", "show2b") { | |||
sql( | |||
""" | |||
|CREATE TEMPORARY TABLE show1a | |||
|CREATE TEMPORARY VIEW show1a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep them unchanged, as CREATE TEMPORARY TABLE
is still supported(but deprecated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will roll it back
Will fix the remaining commands in |
Test build #65389 has finished for PR 15054 at commit
|
Test build #65381 has finished for PR 15054 at commit
|
* assume the table/view is in the current database. If the specified table/view is not found | ||
* in the database then a [[NoSuchTableException]] is thrown. | ||
*/ | ||
def getNonTempTableMetadata(name: TableIdentifier): CatalogTable = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan In your PR #14962, getTableMetadata
will not deal with temporary table. To simplify the potential conflict resolution, I just introduce a new function getNonTempTableMetadata
. Thus, you can just basically replace it by the new getTableMetadata
. Does it sound OK to you? Sorry for any inconvenience it might cause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think getTableMetadata
should handle temp view, maybe we can fix it in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do it soon.
}.getMessage | ||
assert(e.contains(s"Operation not allowed: TRUNCATE TABLE on temporary tables: `$viewName`")) | ||
private def assertNoSuchTable(query: String): Unit = { | ||
intercept[NoSuchTableException] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, the behaviors are consistent. For the DDL commands that are unable to handle temporary views, we issue the same exception NoSuchTableException
.
Test build #65413 has finished for PR 15054 at commit
|
@@ -360,6 +360,7 @@ trait CheckAnalysis extends PredicateHelper { | |||
|
|||
case InsertIntoTable(t, _, _, _, _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @yhuai do you remember why we have this check? InsertIntoTable
can only be used for table right? When will we hit this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spark.sql(
s"""CREATE TEMPORARY VIEW normal_orc_source
|USING org.apache.spark.sql.hive.orc
|OPTIONS (
| PATH '${new File(orcTableAsDir.getAbsolutePath).getCanonicalPath}'
|)
""".stripMargin)
sql("INSERT INTO TABLE normal_orc_source SELECT * FROM orc_temp_table WHERE intField > 5")
We allow users to insert rows into temporary views that is created using CREATE TEMPORARY VIEW
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the temporary view is created by createOrReplaceTempView
, users are unable to insert rows into the temporary view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this might be designed for JDBC data sources at the beginning. Just a guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably dates back to when we called these temp tables and when we didn't have persistent data source tables. Given that I think we probably have to support this for compatibility. I'm not sure why this isn't a whitelist though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the reason is LogicalRelation
is not in the package of sql\catalyst
. Otherwise, it should be like
case InsertIntoTable(t, _, _, _, _)
if !t.isInstanceOf[SimpleCatalogRelation] && !t.isInstanceOf[LogicalRelation]
Test build #65428 has finished for PR 15054 at commit
|
Test build #65426 has finished for PR 15054 at commit
|
retest this please |
Test build #65444 has finished for PR 15054 at commit
|
@@ -65,7 +64,11 @@ case class CreateTableLikeCommand( | |||
s"Source table in CREATE TABLE LIKE does not exist: '$sourceTable'") | |||
} | |||
|
|||
val sourceTableDesc = catalog.getTableMetadata(sourceTable) | |||
val sourceTableDesc = if (catalog.isTemporaryTable(sourceTable)) { | |||
catalog.getTempViewMetadata(sourceTable.table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not atomic, we should have a method that return temp view metadata or None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you are right. Let me fix it.
retest this please |
Test build #65574 has finished for PR 15054 at commit
|
Test build #65587 has finished for PR 15054 at commit
|
val table = formatTableName(name.table) | ||
val tid = TableIdentifier(table) | ||
if (isTemporaryTable(name)) { | ||
def getTempViewMetadata(name: String): CatalogTable = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move these 2 methods to the right section: handles only temp view
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do we need this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
: ) Just removed getTempViewMetadata
and moved getTempViewMetadataOption
to the position you want
@@ -37,7 +38,9 @@ case class AnalyzeTableCommand(tableName: String, noscan: Boolean = true) extend | |||
override def run(sparkSession: SparkSession): Seq[Row] = { | |||
val sessionState = sparkSession.sessionState | |||
val tableIdent = sessionState.sqlParser.parseTableIdentifier(tableName) | |||
val relation = EliminateSubqueryAliases(sessionState.catalog.lookupRelation(tableIdent)) | |||
val db = tableIdent.database.getOrElse(sessionState.catalog.getCurrentDatabase) | |||
val qualifiedName = TableIdentifier(tableIdent.table, Some(db)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's follow the existing convention and call it tableIdentWithDB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Clean all the related naming issues.
Test build #65610 has finished for PR 15054 at commit
|
* Retrieve the metadata of an existing temporary view. | ||
* If the temporary view does not exist, return None. | ||
*/ | ||
def getTempViewMetadataOption(name: String): Option[CatalogTable] = synchronized { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we always use it with the pattern getTempViewMetadataOption.getOrElse(getTableMetadata)
, maybe we should just rename it to getTempViewOrPermanentTableMetadata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can combine them. Let me do it. Thanks!
* If the specified table/view is not found in the database then a [[NoSuchTableException]] is | ||
* thrown. | ||
*/ | ||
def getTempViewOrPermanentTableMetadata(name: TableIdentifier): CatalogTable = synchronized { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should just take a string.
@@ -444,7 +444,7 @@ class SessionCatalogSuite extends SparkFunSuite { | |||
assert(!catalog.tableExists(TableIdentifier("view1", Some("default")))) | |||
} | |||
|
|||
test("getTableMetadata on temporary views") { | |||
test("getTableMetadata and getTempViewOrPermanentTableMetadata on temporary views") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it's unnecessary to test getTableMetadata
on temporary views, let's just test getTempViewOrPermanentTableMetadata
here.
Test build #65637 has finished for PR 15054 at commit
|
Test build #65638 has finished for PR 15054 at commit
|
retest this please |
Test build #65642 has finished for PR 15054 at commit
|
LGTM, merging to master! |
JIRA is down, I'll resolve the ticket later. |
@gatorsmile how hard is it to backport this PR to 2.0? |
It should be not hard. Will make a try today to backport to 2.0. Thanks! |
## What changes were proposed in this pull request? After apache#15054 , there is no place in Spark SQL that need `SessionCatalog.tableExists` to check temp views, so this PR makes `SessionCatalog.tableExists` only check permanent table/view and removes some hacks. This PR also improves the `getTempViewOrPermanentTableMetadata` that is introduced in apache#15054 , to make the code simpler. ## How was this patch tested? existing tests Author: Wenchen Fan <wenchen@databricks.com> Closes apache#15160 from cloud-fan/exists.
…tements on Temporary Views ### What changes were proposed in this pull request? This PR is to backport #15054 and #15160 to Spark 2.0. - When the permanent tables/views do not exist but the temporary view exists, the expected error should be `NoSuchTableException` for partition-related ALTER TABLE commands. However, it always reports a confusing error message. For example, ``` Partition spec is invalid. The spec (a, b) must match the partition spec () defined in table '`testview`'; ``` - When the permanent tables/views do not exist but the temporary view exists, the expected error should be `NoSuchTableException` for `ALTER TABLE ... UNSET TBLPROPERTIES`. However, it reports a missing table property. For example, ``` Attempted to unset non-existent property 'p' in table '`testView`'; ``` - When `ANALYZE TABLE` is called on a view or a temporary view, we should issue an error message. However, it reports a strange error: ``` ANALYZE TABLE is not supported for Project ``` - When inserting into a temporary view that is generated from `Range`, we will get the following error message: ``` assertion failed: No plan for 'InsertIntoTable Range (0, 10, step=1, splits=Some(1)), false, false +- Project [1 AS 1#20] +- OneRowRelation$ ``` This PR is to fix the above four issues. There is no place in Spark SQL that need `SessionCatalog.tableExists` to check temp views, so this PR makes `SessionCatalog.tableExists` only check permanent table/view and removes some hacks. ### How was this patch tested? Added multiple test cases Author: gatorsmile <gatorsmile@gmail.com> Closes #15174 from gatorsmile/PR15054Backport.
What changes were proposed in this pull request?
NoSuchTableException
for partition-related ALTER TABLE commands. However, it always reports a confusing error message. For example,NoSuchTableException
forALTER TABLE ... UNSET TBLPROPERTIES
. However, it reports a missing table property. For example,ANALYZE TABLE
is called on a view or a temporary view, we should issue an error message. However, it reports a strange error:Range
, we will get the following error message:This PR is to fix the above four issues.
How was this patch tested?
Added multiple test cases