
Commit 0706e64

[SPARK-30098][SQL] Add a configuration to use default datasource as provider for CREATE TABLE command
### What changes were proposed in this pull request?

For the CREATE TABLE [AS SELECT] command, create a native Parquet table if neither USING nor STORED AS is specified and `spark.sql.legacy.createHiveTableByDefault` is false.

This is a retry after we unified the CREATE TABLE syntax. It partially reverts d2bec5e.

This PR also allows `CREATE EXTERNAL TABLE` when `LOCATION` is present. This was not allowed for data source tables before, which was an unnecessary behavioral difference from Hive tables.

### Why are the changes needed?

Changing from a Hive text table to a native Parquet table has many benefits:
1. It is consistent with `DataFrameWriter.saveAsTable`.
2. Better performance.
3. Better support for nested types. Hive text tables don't work well with nested types: e.g. `insert into t values struct(null)` actually inserts a null value, not `struct(null)`, if `t` is a Hive text table, which leads to wrong results.
4. Better interoperability, as Parquet is a more popular open file format.

### Does this PR introduce _any_ user-facing change?

No by default. If the config is set to false, the behavior changes as described below.

Behavior-wise, the change is very small, as the native Parquet table is also Hive-compatible. All the Spark DDL commands that work for Hive tables also work for native Parquet tables, with two exceptions: `ALTER TABLE SET [SERDE | SERDEPROPERTIES]` and `LOAD DATA`.

char/varchar behavior has been taken care of by #30412, and there is no behavior difference between data source and Hive tables.

One potential issue is `CREATE TABLE ... LOCATION ...` when users want to directly access the files later. This is more of a corner case, and the legacy config should be good enough.

Another potential issue is that users may use Spark to create a table and then use Hive to add partitions with a different serde. This is not allowed for Spark native tables.

### How was this patch tested?

Re-enabled the tests.

Closes #30554 from cloud-fan/create-table.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
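To make the flag concrete, here is a minimal sketch of the user-visible difference, assuming a Hive-enabled `SparkSession` named `spark` (table names are hypothetical):

```scala
// Default (spark.sql.legacy.createHiveTableByDefault = true): legacy behavior,
// a bare CREATE TABLE produces a Hive text table.
spark.sql("CREATE TABLE t1 (i INT)")

// With the legacy flag off, the same statement produces a native Parquet table,
// consistent with DataFrameWriter.saveAsTable.
spark.conf.set("spark.sql.legacy.createHiveTableByDefault", "false")
spark.sql("CREATE TABLE t2 (i INT)")
spark.sql("DESC FORMATTED t2").show(truncate = false)  // the Provider row should show parquet

// Statements with an explicit provider or serde are unaffected either way.
spark.sql("CREATE TABLE t3 (i INT) USING json")
spark.sql("CREATE TABLE t4 (i INT) STORED AS ORC")
```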
1 parent 512fb32 commit 0706e64

15 files changed, +100 -45 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 9 additions & 0 deletions
@@ -2921,6 +2921,15 @@ object SQLConf {
     .stringConf
     .createWithDefault("")
 
+  val LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT =
+    buildConf("spark.sql.legacy.createHiveTableByDefault")
+      .internal()
+      .doc("When set to true, CREATE TABLE syntax without USING or STORED AS will use Hive " +
+        s"instead of the value of ${DEFAULT_DATA_SOURCE_NAME.key} as the table provider.")
+      .version("3.1.0")
+      .booleanConf
+      .createWithDefault(true)
+
   /**
    * Holds information about keys that have been deprecated.
    *
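The interpolated `${DEFAULT_DATA_SOURCE_NAME.key}` in the doc string renders as `spark.sql.sources.default`. A minimal sketch of reading both entries, assuming a running `SparkSession` named `spark`:

```scala
import org.apache.spark.sql.internal.SQLConf

val sqlConf = spark.sessionState.conf
// Typed read of the new entry; true (the default) keeps the legacy Hive behavior.
val useHiveByDefault: Boolean = sqlConf.getConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT)
// The provider a bare CREATE TABLE falls back to once the flag is false.
val defaultSource: String = sqlConf.defaultDataSourceName  // "parquet" unless overridden
```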

sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala

Lines changed: 9 additions & 4 deletions
@@ -27,7 +27,7 @@ import org.apache.spark.sql.connector.expressions.Transform
 import org.apache.spark.sql.execution.command._
 import org.apache.spark.sql.execution.datasources.{CreateTable, DataSource}
 import org.apache.spark.sql.execution.datasources.v2.FileDataSourceV2
-import org.apache.spark.sql.internal.HiveSerDe
+import org.apache.spark.sql.internal.{HiveSerDe, SQLConf}
 import org.apache.spark.sql.types.{MetadataBuilder, StructField, StructType}
 
 /**
@@ -636,11 +636,16 @@ class ResolveSessionCatalog(
       (storageFormat, DDLUtils.HIVE_PROVIDER)
     } else {
       // If neither USING nor STORED AS/ROW FORMAT is specified, we create native data source
-      // tables if it's a CTAS and `conf.convertCTAS` is true.
-      // TODO: create native data source table by default for non-CTAS.
-      if (ctas && conf.convertCTAS) {
+      // tables if:
+      // 1. `LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT` is false, or
+      // 2. It's a CTAS and `conf.convertCTAS` is true.
+      val createHiveTableByDefault = conf.getConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT)
+      if (!createHiveTableByDefault || (ctas && conf.convertCTAS)) {
         (nonHiveStorageFormat, conf.defaultDataSourceName)
       } else {
+        logWarning("A Hive serde table will be created as there is no table provider " +
+          s"specified. You can set ${SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT.key} to false " +
+          "so that native data source table will be created instead.")
         (defaultHiveStorage, DDLUtils.HIVE_PROVIDER)
       }
     }
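Pulled out of its surroundings, the decision reads roughly as follows (a condensed sketch, not the literal rule; `hasSerdeClause` stands in for the USING hive / STORED AS / ROW FORMAT checks above):

```scala
import org.apache.spark.sql.internal.SQLConf

// `ctas` is true for CREATE TABLE ... AS SELECT.
def chooseProvider(hasSerdeClause: Boolean, ctas: Boolean, conf: SQLConf): String = {
  if (hasSerdeClause) {
    "hive"  // explicit Hive syntax always wins
  } else if (!conf.getConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT) ||
      (ctas && conf.convertCTAS)) {
    conf.defaultDataSourceName  // native data source table, e.g. parquet
  } else {
    "hive"  // legacy default; the warning above is logged
  }
}
```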

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala

Lines changed: 17 additions & 16 deletions
@@ -266,22 +266,23 @@ class DataSourceV2SQLSuite
     checkAnswer(spark.internalCreateDataFrame(rdd, table.schema), Seq.empty)
   }
 
-  // TODO: ignored by SPARK-31707, restore the test after create table syntax unification
-  ignore("CreateTable: without USING clause") {
-    // unset this config to use the default v2 session catalog.
-    spark.conf.unset(V2_SESSION_CATALOG_IMPLEMENTATION.key)
-    val testCatalog = catalog("testcat").asTableCatalog
-
-    sql("CREATE TABLE testcat.t1 (id int)")
-    val t1 = testCatalog.loadTable(Identifier.of(Array(), "t1"))
-    // Spark shouldn't set the default provider for catalog plugins.
-    assert(!t1.properties.containsKey(TableCatalog.PROP_PROVIDER))
-
-    sql("CREATE TABLE t2 (id int)")
-    val t2 = spark.sessionState.catalogManager.v2SessionCatalog.asTableCatalog
-      .loadTable(Identifier.of(Array("default"), "t2")).asInstanceOf[V1Table]
-    // Spark should set the default provider as DEFAULT_DATA_SOURCE_NAME for the session catalog.
-    assert(t2.v1Table.provider == Some(conf.defaultDataSourceName))
+  test("CreateTable: without USING clause") {
+    withSQLConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT.key -> "false") {
+      // unset this config to use the default v2 session catalog.
+      spark.conf.unset(V2_SESSION_CATALOG_IMPLEMENTATION.key)
+      val testCatalog = catalog("testcat").asTableCatalog
+
+      sql("CREATE TABLE testcat.t1 (id int)")
+      val t1 = testCatalog.loadTable(Identifier.of(Array(), "t1"))
+      // Spark shouldn't set the default provider for catalog plugins.
+      assert(!t1.properties.containsKey(TableCatalog.PROP_PROVIDER))
+
+      sql("CREATE TABLE t2 (id int)")
+      val t2 = spark.sessionState.catalogManager.v2SessionCatalog.asTableCatalog
+        .loadTable(Identifier.of(Array("default"), "t2")).asInstanceOf[V1Table]
+      // Spark should set the default provider as DEFAULT_DATA_SOURCE_NAME for the session catalog.
+      assert(t2.v1Table.provider == Some(conf.defaultDataSourceName))
+    }
   }
 
   test("CreateTable/RepalceTable: invalid schema if has interval type") {

sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala

Lines changed: 3 additions & 3 deletions
@@ -1588,7 +1588,7 @@ class PlanResolutionSuite extends AnalysisTest {
           .add("b", StringType)
       )
     )
-    compare("CREATE TABLE my_tab(a INT COMMENT 'test', b STRING) " +
+    compare("CREATE TABLE my_tab(a INT COMMENT 'test', b STRING) STORED AS textfile " +
       "PARTITIONED BY (c INT, d STRING COMMENT 'test2')",
       createTable(
         table = "my_tab",
@@ -1616,7 +1616,7 @@ class PlanResolutionSuite extends AnalysisTest {
     )
     // Partitioned by a StructType should be accepted by `SparkSqlParser` but will fail an analyze
     // rule in `AnalyzeCreateTable`.
-    compare("CREATE TABLE my_tab(a INT COMMENT 'test', b STRING) " +
+    compare("CREATE TABLE my_tab(a INT COMMENT 'test', b STRING) STORED AS textfile " +
       "PARTITIONED BY (nested STRUCT<col1: STRING,col2: INT>)",
       createTable(
         table = "my_tab",
@@ -1890,7 +1890,7 @@ class PlanResolutionSuite extends AnalysisTest {
   }
 
   test("Test CTAS #3") {
-    val s3 = """CREATE TABLE page_view AS SELECT * FROM src"""
+    val s3 = """CREATE TABLE page_view STORED AS textfile AS SELECT * FROM src"""
     val (desc, exists) = extractTableDesc(s3)
     assert(exists == false)
     assert(desc.identifier.database == Some("default"))

sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala

Lines changed: 4 additions & 0 deletions
@@ -40,6 +40,8 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
   private val originalInMemoryPartitionPruning = TestHive.conf.inMemoryPartitionPruning
   private val originalCrossJoinEnabled = TestHive.conf.crossJoinEnabled
   private val originalSessionLocalTimeZone = TestHive.conf.sessionLocalTimeZone
+  private val originalCreateHiveTable =
+    TestHive.conf.getConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT)
 
   def testCases: Seq[(String, File)] = {
     hiveQueryDir.listFiles.map(f => f.getName.stripSuffix(".q") -> f)
@@ -59,6 +61,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     // Fix session local timezone to America/Los_Angeles for those timezone sensitive tests
     // (timestamp_*)
     TestHive.setConf(SQLConf.SESSION_LOCAL_TIMEZONE, "America/Los_Angeles")
+    TestHive.setConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT, true)
     RuleExecutor.resetMetrics()
   }
 
@@ -69,6 +72,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     TestHive.setConf(SQLConf.IN_MEMORY_PARTITION_PRUNING, originalInMemoryPartitionPruning)
     TestHive.setConf(SQLConf.CROSS_JOINS_ENABLED, originalCrossJoinEnabled)
     TestHive.setConf(SQLConf.SESSION_LOCAL_TIMEZONE, originalSessionLocalTimeZone)
+    TestHive.setConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT, originalCreateHiveTable)
 
     // For debugging dump some statistics about how much time was spent in various optimizer rules
     logWarning(RuleExecutor.dumpTimeSpent())
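Since this suite replays a fixed corpus of Hive query files, it pins the legacy value for the whole run in its setup/teardown hooks rather than per test. The save-and-restore idiom in isolation (a sketch; hook names follow the suite structure above):

```scala
private val originalCreateHiveTable =
  TestHive.conf.getConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT)

override def beforeAll(): Unit = {
  super.beforeAll()
  // The Hive-compat corpus assumes the legacy CREATE TABLE default.
  TestHive.setConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT, true)
}

override def afterAll(): Unit = {
  TestHive.setConf(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT, originalCreateHiveTable)
  super.afterAll()
}
```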

sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveShowCreateTableSuite.scala

Lines changed: 17 additions & 1 deletion
@@ -21,10 +21,26 @@ import org.apache.spark.sql.{AnalysisException, ShowCreateTableSuite}
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.catalog.CatalogTable
 import org.apache.spark.sql.hive.test.TestHiveSingleton
-import org.apache.spark.sql.internal.HiveSerDe
+import org.apache.spark.sql.internal.{HiveSerDe, SQLConf}
 
 class HiveShowCreateTableSuite extends ShowCreateTableSuite with TestHiveSingleton {
 
+  private var origCreateHiveTableConfig = false
+
+  protected override def beforeAll(): Unit = {
+    super.beforeAll()
+    origCreateHiveTableConfig =
+      spark.conf.get(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT)
+    spark.conf.set(SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT.key, true)
+  }
+
+  protected override def afterAll(): Unit = {
+    spark.conf.set(
+      SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT.key,
+      origCreateHiveTableConfig)
+    super.afterAll()
+  }
+
   test("view") {
     Seq(true, false).foreach { serde =>
       withView("v1") {

sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala

Lines changed: 2 additions & 1 deletion
@@ -277,7 +277,8 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter
   test("Test partition mode = strict") {
     withSQLConf(("hive.exec.dynamic.partition.mode", "strict")) {
       withTable("partitioned") {
-        sql("CREATE TABLE partitioned (id bigint, data string) PARTITIONED BY (part string)")
+        sql("CREATE TABLE partitioned (id bigint, data string) USING hive " +
+          "PARTITIONED BY (part string)")
         val data = (1 to 10).map(i => (i, s"data-$i", if ((i % 2) == 0) "even" else "odd"))
           .toDF("id", "data", "part")

sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala

Lines changed: 3 additions & 2 deletions
@@ -38,7 +38,7 @@ class QueryPartitionSuite extends QueryTest with SQLTestUtils with TestHiveSingl
     testData.createOrReplaceTempView("testData")
 
     // create the table for test
-    sql(s"CREATE TABLE table_with_partition(key int,value string) " +
+    sql(s"CREATE TABLE table_with_partition(key int,value string) USING hive " +
       s"PARTITIONED by (ds string) location '${tmpDir.toURI}' ")
     sql("INSERT OVERWRITE TABLE table_with_partition partition (ds='1') " +
       "SELECT key,value FROM testData")
@@ -81,7 +81,8 @@ class QueryPartitionSuite extends QueryTest with SQLTestUtils with TestHiveSingl
 
   test("SPARK-21739: Cast expression should initialize timezoneId") {
     withTable("table_with_timestamp_partition") {
-      sql("CREATE TABLE table_with_timestamp_partition(value int) PARTITIONED BY (ts TIMESTAMP)")
+      sql("CREATE TABLE table_with_timestamp_partition(value int) USING hive " +
+        "PARTITIONED BY (ts TIMESTAMP)")
       sql("INSERT OVERWRITE TABLE table_with_timestamp_partition " +
         "PARTITION (ts = '2010-01-01 00:00:00.000') VALUES (1)")

sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala

Lines changed: 19 additions & 8 deletions
@@ -165,7 +165,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     // Partitioned table
     val partTable = "part_table"
     withTable(partTable) {
-      sql(s"CREATE TABLE $partTable (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+      sql(s"CREATE TABLE $partTable (key STRING, value STRING) USING hive " +
+        "PARTITIONED BY (ds STRING)")
       sql(s"INSERT INTO TABLE $partTable PARTITION (ds='2010-01-01') SELECT * FROM src")
       sql(s"INSERT INTO TABLE $partTable PARTITION (ds='2010-01-02') SELECT * FROM src")
       sql(s"INSERT INTO TABLE $partTable PARTITION (ds='2010-01-03') SELECT * FROM src")
@@ -191,7 +192,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
       SQLConf.PARALLEL_FILE_LISTING_IN_STATS_COMPUTATION.key -> "True") {
       val checkSizeTable = "checkSizeTable"
       withTable(checkSizeTable) {
-        sql(s"CREATE TABLE $checkSizeTable (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+        sql(s"CREATE TABLE $checkSizeTable (key STRING, value STRING) USING hive " +
+          "PARTITIONED BY (ds STRING)")
         sql(s"INSERT INTO TABLE $checkSizeTable PARTITION (ds='2010-01-01') SELECT * FROM src")
         sql(s"INSERT INTO TABLE $checkSizeTable PARTITION (ds='2010-01-02') SELECT * FROM src")
         sql(s"INSERT INTO TABLE $checkSizeTable PARTITION (ds='2010-01-03') SELECT * FROM src")
@@ -274,7 +276,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
   test("SPARK-22745 - read Hive's statistics for partition") {
     val tableName = "hive_stats_part_table"
     withTable(tableName) {
-      sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+      sql(s"CREATE TABLE $tableName (key STRING, value STRING) USING hive " +
+        "PARTITIONED BY (ds STRING)")
       sql(s"INSERT INTO TABLE $tableName PARTITION (ds='2017-01-01') SELECT * FROM src")
       var partition = spark.sessionState.catalog
         .getPartition(TableIdentifier(tableName), Map("ds" -> "2017-01-01"))
@@ -296,7 +299,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     val tableName = "analyzeTable_part"
     withTable(tableName) {
       withTempPath { path =>
-        sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+        sql(s"CREATE TABLE $tableName (key STRING, value STRING) USING hive " +
+          "PARTITIONED BY (ds STRING)")
 
         val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
         partitionDates.foreach { ds =>
@@ -321,6 +325,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         sql(
           s"""
             |CREATE TABLE $sourceTableName (key STRING, value STRING)
+            |USING hive
            |PARTITIONED BY (ds STRING)
            |LOCATION '${path.toURI}'
          """.stripMargin)
@@ -338,6 +343,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         sql(
           s"""
             |CREATE TABLE $tableName (key STRING, value STRING)
+            |USING hive
            |PARTITIONED BY (ds STRING)
            |LOCATION '${path.toURI}'
          """.stripMargin)
@@ -371,7 +377,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     }
 
     withTable(tableName) {
-      sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+      sql(s"CREATE TABLE $tableName (key STRING, value STRING) USING hive " +
+        "PARTITIONED BY (ds STRING)")
 
       createPartition("2010-01-01", "SELECT '1', 'A' from src")
       createPartition("2010-01-02", "SELECT '1', 'A' from src UNION ALL SELECT '1', 'A' from src")
@@ -424,7 +431,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     }
 
     withTable(tableName) {
-      sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING, hr INT)")
+      sql(s"CREATE TABLE $tableName (key STRING, value STRING) USING hive " +
+        "PARTITIONED BY (ds STRING, hr INT)")
 
       createPartition("2010-01-01", 10, "SELECT '1', 'A' from src")
       createPartition("2010-01-01", 11, "SELECT '1', 'A' from src")
@@ -472,7 +480,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     }
 
     withTable(tableName) {
-      sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING, hr INT)")
+      sql(s"CREATE TABLE $tableName (key STRING, value STRING) USING hive " +
+        "PARTITIONED BY (ds STRING, hr INT)")
 
      createPartition("2010-01-01", 10, "SELECT '1', 'A' from src")
      createPartition("2010-01-01", 11, "SELECT '1', 'A' from src")
@@ -961,7 +970,8 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
     Seq(false, true).foreach { autoUpdate =>
       withSQLConf(SQLConf.AUTO_SIZE_UPDATE_ENABLED.key -> autoUpdate.toString) {
         withTable(table) {
-          sql(s"CREATE TABLE $table (i INT, j STRING) PARTITIONED BY (ds STRING, hr STRING)")
+          sql(s"CREATE TABLE $table (i INT, j STRING) USING hive " +
+            "PARTITIONED BY (ds STRING, hr STRING)")
           // table has two partitions initially
           for (ds <- Seq("2008-04-08"); hr <- Seq("11", "12")) {
             sql(s"INSERT OVERWRITE TABLE $table PARTITION (ds='$ds',hr='$hr') SELECT 1, 'a'")
@@ -1034,6 +1044,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         sql(
           s"""
             |CREATE TABLE $managedTable (key INT, value STRING)
+            |USING hive
            |PARTITIONED BY (ds STRING, hr STRING)
          """.stripMargin)

sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala

Lines changed: 1 addition & 0 deletions
@@ -798,6 +798,7 @@ class VersionsSuite extends SparkFunSuite with Logging {
       versionSpark.sql(
         """
           |CREATE TABLE tbl(c1 string)
+          |USING hive
          |PARTITIONED BY (ds STRING)
        """.stripMargin)
       versionSpark.sql("INSERT OVERWRITE TABLE tbl partition (ds='2') SELECT '1'")

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -983,7 +983,7 @@ class HiveDDLSuite
   }
 
   test("alter table partition - storage information") {
-    sql("CREATE TABLE boxes (height INT, length INT) PARTITIONED BY (width INT)")
+    sql("CREATE TABLE boxes (height INT, length INT) STORED AS textfile PARTITIONED BY (width INT)")
     sql("INSERT OVERWRITE TABLE boxes PARTITION (width=4) SELECT 4, 4")
     val catalog = spark.sessionState.catalog
     val expectedSerde = "com.sparkbricks.serde.ColumnarSerDe"

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala

Lines changed: 3 additions & 2 deletions
@@ -88,15 +88,16 @@ class HiveSerDeSuite extends HiveComparisonTest with PlanTest with BeforeAndAfte
   test("Test the default fileformat for Hive-serde tables") {
     withSQLConf("hive.default.fileformat" -> "orc") {
       val (desc, exists) = extractTableDesc(
-        "CREATE TABLE IF NOT EXISTS fileformat_test (id int)")
+        "CREATE TABLE IF NOT EXISTS fileformat_test (id int) USING hive")
       assert(exists)
       assert(desc.storage.inputFormat == Some("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"))
       assert(desc.storage.outputFormat == Some("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"))
       assert(desc.storage.serde == Some("org.apache.hadoop.hive.ql.io.orc.OrcSerde"))
     }
 
     withSQLConf("hive.default.fileformat" -> "parquet") {
-      val (desc, exists) = extractTableDesc("CREATE TABLE IF NOT EXISTS fileformat_test (id int)")
+      val (desc, exists) = extractTableDesc(
+        "CREATE TABLE IF NOT EXISTS fileformat_test (id int) USING hive")
       assert(exists)
       val input = desc.storage.inputFormat
       val output = desc.storage.outputFormat
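`USING hive` keeps these statements on the Hive-serde path, so `hive.default.fileformat` still picks the storage format. For the parquet case, the table should resolve to Hive's Parquet serde rather than the native `parquet` source; a sketch of that expectation (the serde class name is Hive's, stated here as an assumption):

```scala
withSQLConf("hive.default.fileformat" -> "parquet") {
  val (desc, _) = extractTableDesc(
    "CREATE TABLE IF NOT EXISTS fileformat_test (id int) USING hive")
  // Hive's Parquet serde, not the native `parquet` data source (assumed class name):
  assert(desc.storage.serde ==
    Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"))
}
```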

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveTableScanSuite.scala

Lines changed: 4 additions & 1 deletion
@@ -113,6 +113,7 @@ class HiveTableScanSuite extends HiveComparisonTest with SQLTestUtils with TestH
       sql(
         s"""
           |CREATE TABLE $table(id string)
+          |USING hive
          |PARTITIONED BY (p1 string,p2 string,p3 string,p4 string,p5 string)
        """.stripMargin)
       sql(
@@ -157,6 +158,7 @@ class HiveTableScanSuite extends HiveComparisonTest with SQLTestUtils with TestH
       sql(
         s"""
           |CREATE TABLE $table(id string)
+          |USING hive
          |PARTITIONED BY (p1 string,p2 string,p3 string,p4 string,p5 string)
        """.stripMargin)
       sql(
@@ -182,6 +184,7 @@ class HiveTableScanSuite extends HiveComparisonTest with SQLTestUtils with TestH
       sql(
         s"""
           |CREATE TABLE $table (id int)
+          |USING hive
          |PARTITIONED BY (a int, b int)
        """.stripMargin)
       val scan1 = getHiveTableScanExec(s"SELECT * FROM $table WHERE a = 1 AND b = 2")
@@ -252,7 +255,7 @@ class HiveTableScanSuite extends HiveComparisonTest with SQLTestUtils with TestH
   test("SPARK-32069: Improve error message on reading unexpected directory") {
     withTable("t") {
       withTempDir { f =>
-        sql(s"CREATE TABLE t(i LONG) LOCATION '${f.getAbsolutePath}'")
+        sql(s"CREATE TABLE t(i LONG) USING hive LOCATION '${f.getAbsolutePath}'")
         sql("INSERT INTO t VALUES(1)")
         val dir = new File(f.getCanonicalPath + "/data")
         dir.mkdir()

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala

Lines changed: 1 addition & 0 deletions
@@ -2026,6 +2026,7 @@ abstract class SQLQuerySuiteBase extends QueryTest with SQLTestUtils with TestHi
       sql(
         """
           |CREATE TABLE part_table (c STRING)
+          |STORED AS textfile
          |PARTITIONED BY (d STRING)
        """.stripMargin)
       sql(s"LOAD DATA LOCAL INPATH '$path/part-r-000011' " +
