[SPARK-22279][SQL] Turn on spark.sql.hive.convertMetastoreOrc by default #19499

Status: Closed (wants to merge 1 commit)
@@ -106,7 +106,7 @@ private[spark] object HiveUtils extends Logging {
       .doc("When set to true, the built-in ORC reader and writer are used to process " +
         "ORC tables created by using the HiveQL syntax, instead of Hive serde.")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
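With the default flipped to true, users who hit a regression with the built-in ORC reader can still fall back per session. A minimal sketch (an illustration, not part of this PR; assumes a Hive-enabled SparkSession named `spark`):

```scala
// Sketch: toggling the conversion per session in spark-shell.
// "true" (the new default) uses Spark's built-in ORC reader/writer for
// Hive ORC tables; "false" restores the Hive serde path.
spark.conf.set("spark.sql.hive.convertMetastoreOrc", "false") // Hive serde
spark.conf.set("spark.sql.hive.convertMetastoreOrc", "true")  // built-in ORC
```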
Member:

This change was made in https://issues.apache.org/jira/browse/SPARK-15705.

Has that issue been resolved?

Member Author:

Yes, it's resolved, as you can see from my last comment on the JIRA.

Member Author:

The JIRA already has results for 2.1.1 and 2.2.0; the following is the result on 2.2.1.

scala> sql("set spark.sql.hive.convertMetastoreOrc=true")
scala> spark.table("default.test").printSchema
root
 |-- id: long (nullable = true)
 |-- name: string (nullable = true)
 |-- state: string (nullable = true)

scala> spark.version
res2: String = 2.2.1

Member:

Was it fixed by this PR: #19470?

Member Author:

I think it predates #19470, because it was already fixed in 2.1.1.


Member Author (@dongjoon-hyun, Dec 7, 2017):

Yep. That was resolved via https://issues.apache.org/jira/browse/SPARK-14387 (by me).

Member Author:

Please wait a moment; I'll double-check the case to make sure.

Member Author (@dongjoon-hyun, Dec 7, 2017):

Yep. It's resolved via SPARK-14387. The following is the result of the SPARK-15757 example on 2.2.1.

hive> CREATE TABLE source(inv_date_sk INT, inv_item_sk INT, inv_warehouse_sk INT, inv_quantity_on_hand INT);
hive> INSERT INTO source VALUES(1,1,1,1);
hive> CREATE TABLE inventory(inv_date_sk INT, inv_item_sk INT, inv_warehouse_sk INT, inv_quantity_on_hand INT)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS ORC;
hive> INSERT OVERWRITE TABLE inventory SELECT * FROM source;

scala> sql("set spark.sql.hive.convertMetastoreOrc=true")
scala> sql("SELECT * FROM inventory").show
+-----------+-----------+----------------+--------------------+
|inv_date_sk|inv_item_sk|inv_warehouse_sk|inv_quantity_on_hand|
+-----------+-----------+----------------+--------------------+
|          1|          1|               1|                   1|
+-----------+-----------+----------------+--------------------+
scala> spark.version
res2: String = 2.2.1
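One way to confirm which read path is in effect is to inspect the physical plan: with conversion enabled, Spark typically shows a native file scan over ORC, while with conversion disabled it goes through a Hive table scan. A sketch (an assumption about typical plan output, not shown in the original thread; reuses the `inventory` table created above in a Hive-enabled spark-shell):

```scala
// Sketch: comparing the two ORC read paths for the same table.
sql("set spark.sql.hive.convertMetastoreOrc=true")
sql("SELECT * FROM inventory").explain()  // built-in reader: FileScan over orc
sql("set spark.sql.hive.convertMetastoreOrc=false")
sql("SELECT * FROM inventory").explain()  // Hive serde path: HiveTableScan
```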


val HIVE_METASTORE_SHARED_PREFIXES = buildConf("spark.sql.hive.metastore.sharedPrefixes")
.doc("A comma separated list of class prefixes that should be loaded using the classloader " +