Skip to content

[SPARK-33696][BUILD][SQL] Upgrade built-in Hive to 2.3.8 #754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 15 additions & 1 deletion dev/deps/spark-deps-hadoop-palantir
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,21 @@ hadoop-yarn-client/2.9.2-palantir.15//hadoop-yarn-client-2.9.2-palantir.15.jar
hadoop-yarn-common/2.9.2-palantir.15//hadoop-yarn-common-2.9.2-palantir.15.jar
hadoop-yarn-registry/2.9.2-palantir.15//hadoop-yarn-registry-2.9.2-palantir.15.jar
hadoop-yarn-server-common/2.9.2-palantir.15//hadoop-yarn-server-common-2.9.2-palantir.15.jar
hive-storage-api/2.7.1//hive-storage-api-2.7.1.jar
hive-beeline/2.3.8//hive-beeline-2.3.8.jar
hive-cli/2.3.8//hive-cli-2.3.8.jar
hive-common/2.3.8//hive-common-2.3.8.jar
hive-exec/2.3.8/core/hive-exec-2.3.8-core.jar
hive-jdbc/2.3.8//hive-jdbc-2.3.8.jar
hive-llap-common/2.3.8//hive-llap-common-2.3.8.jar
hive-metastore/2.3.8//hive-metastore-2.3.8.jar
hive-serde/2.3.8//hive-serde-2.3.8.jar
hive-service-rpc/3.1.2//hive-service-rpc-3.1.2.jar
hive-shims-0.23/2.3.8//hive-shims-0.23-2.3.8.jar
hive-shims-common/2.3.8//hive-shims-common-2.3.8.jar
hive-shims-scheduler/2.3.8//hive-shims-scheduler-2.3.8.jar
hive-shims/2.3.8//hive-shims-2.3.8.jar
hive-storage-api/2.7.2//hive-storage-api-2.7.2.jar
hive-vector-code-gen/2.3.8//hive-vector-code-gen-2.3.8.jar
hk2-api/2.6.1//hk2-api-2.6.1.jar
hk2-locator/2.6.1//hk2-locator-2.6.1.jar
hk2-utils/2.6.1//hk2-utils-2.6.1.jar
Expand Down
4 changes: 2 additions & 2 deletions docs/building-spark.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,9 +83,9 @@ Example:

To enable Hive integration for Spark SQL along with its JDBC server and CLI,
add the `-Phive` and `-Phive-thriftserver` profiles to your existing build options.
By default Spark will build with Hive 2.3.7.
By default Spark will build with Hive 2.3.8.

# With Hive 2.3.7 support
# With Hive 2.3.8 support
./build/mvn -Pyarn -Phive -Phive-thriftserver -DskipTests clean package

## Packaging without Hadoop Dependencies for YARN
Expand Down
8 changes: 4 additions & 4 deletions docs/sql-data-sources-hive-tables.md
Original file line number Diff line number Diff line change
Expand Up @@ -127,10 +127,10 @@ The following options can be used to configure the version of Hive that is used
<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
<tr>
<td><code>spark.sql.hive.metastore.version</code></td>
<td><code>2.3.7</code></td>
<td><code>2.3.8</code></td>
<td>
Version of the Hive metastore. Available
options are <code>0.12.0</code> through <code>2.3.7</code> and <code>3.0.0</code> through <code>3.1.2</code>.
options are <code>0.12.0</code> through <code>2.3.8</code> and <code>3.0.0</code> through <code>3.1.2</code>.
</td>
<td>1.4.0</td>
</tr>
Expand All @@ -142,9 +142,9 @@ The following options can be used to configure the version of Hive that is used
property can be one of three options:
<ol>
<li><code>builtin</code></li>
Use Hive 2.3.7, which is bundled with the Spark assembly when <code>-Phive</code> is
Use Hive 2.3.8, which is bundled with the Spark assembly when <code>-Phive</code> is
enabled. When this option is chosen, <code>spark.sql.hive.metastore.version</code> must be
either <code>2.3.7</code> or not defined.
either <code>2.3.8</code> or not defined.
<li><code>maven</code></li>
Use Hive jars of specified version downloaded from Maven repositories. This configuration
is not generally recommended for production deployments.
Expand Down
2 changes: 1 addition & 1 deletion docs/sql-migration-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -810,7 +810,7 @@ Python UDF registration is unchanged.
Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs.
Currently, Hive SerDes and UDFs are based on built-in Hive,
and Spark SQL can be connected to different versions of Hive Metastore
(from 0.12.0 to 2.3.7 and 3.0.0 to 3.1.2. Also see [Interacting with Different Versions of Hive Metastore](sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)).
(from 0.12.0 to 2.3.8 and 3.0.0 to 3.1.2. Also see [Interacting with Different Versions of Hive Metastore](sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore)).

#### Deploying in Existing Hive Warehouses
{:.no_toc}
Expand Down
20 changes: 18 additions & 2 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -131,8 +131,8 @@
<hive.group>org.apache.hive</hive.group>
<hive.classifier>core</hive.classifier>
<!-- Version used in Maven Hive dependency -->
<hive.version>2.3.7</hive.version>
<hive23.version>2.3.7</hive23.version>
<hive.version>2.3.8</hive.version>
<hive23.version>2.3.8</hive23.version>
<!-- Version used for internal directory structure -->
<hive.version.short>2.3</hive.version.short>
<!-- note that this should be compatible with Kafka brokers version 0.10 and up -->
Expand Down Expand Up @@ -1846,6 +1846,22 @@
<groupId>org.apache.logging.log4j</groupId>
<artifactId>*</artifactId>
</exclusion>
<exclusion>
<groupId>net.hydromatic</groupId>
<artifactId>eigenbase-properties</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.janino</groupId>
<artifactId>commons-compiler</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.janino</groupId>
<artifactId>janino</artifactId>
</exclusion>
<exclusion>
<groupId>org.pentaho</groupId>
<artifactId>pentaho-aggdesigner-algorithm</artifactId>
</exclusion>
<!-- End of Hive 2.3 exclusion -->
</exclusions>
</dependency>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -545,7 +545,7 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
}

if (HiveUtils.isHive23) {
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("2.3.7"))
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("2.3.8"))
} else {
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("1.2.1"))
}
Expand All @@ -562,7 +562,7 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
}

if (HiveUtils.isHive23) {
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("2.3.7"))
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("2.3.8"))
} else {
assert(conf.get(HiveUtils.FAKE_HIVE_VERSION.key) === Some("1.2.1"))
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ private[spark] object HiveUtils extends Logging {

val HIVE_METASTORE_VERSION = buildStaticConf("spark.sql.hive.metastore.version")
.doc("Version of the Hive metastore. Available options are " +
"<code>0.12.0</code> through <code>2.3.7</code> and " +
"<code>0.12.0</code> through <code>2.3.8</code> and " +
"<code>3.0.0</code> through <code>3.1.2</code>.")
.version("1.4.0")
.stringConf
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -104,8 +104,8 @@ private[hive] object IsolatedClientLoader extends Logging {
case "2.0" | "2.0.0" | "2.0.1" => hive.v2_0
case "2.1" | "2.1.0" | "2.1.1" => hive.v2_1
case "2.2" | "2.2.0" => hive.v2_2
case "2.3" | "2.3.0" | "2.3.1" | "2.3.2" | "2.3.3" | "2.3.4" | "2.3.5" | "2.3.6" | "2.3.7" =>
hive.v2_3
case "2.3" | "2.3.0" | "2.3.1" | "2.3.2" | "2.3.3" | "2.3.4" | "2.3.5" | "2.3.6" | "2.3.7" |
"2.3.8" => hive.v2_3
case "3.0" | "3.0.0" => hive.v3_0
case "3.1" | "3.1.0" | "3.1.1" | "3.1.2" => hive.v3_1
case version =>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -87,11 +87,13 @@ package object client {
"org.apache.curator:*",
"org.pentaho:pentaho-aggdesigner-algorithm"))

// Since HIVE-14496, Hive materialized view need calcite-core.
// Since HIVE-23980, calcite-core included in Hive package jar.
// For spark, only VersionsSuite currently creates a hive materialized view for testing.
case object v2_3 extends HiveVersion("2.3.7",
exclusions = Seq("org.apache.calcite:calcite-druid",
case object v2_3 extends HiveVersion("2.3.8",
exclusions = Seq("org.apache.calcite:calcite-core",
"org.apache.calcite:calcite-druid",
"org.apache.calcite.avatica:avatica",
"com.fasterxml.jackson.core:*",
"org.apache.curator:*",
"org.pentaho:pentaho-aggdesigner-algorithm"))

Expand All @@ -101,7 +103,6 @@ package object client {
extraDeps = Seq("org.apache.logging.log4j:log4j-api:2.10.0",
"org.apache.derby:derby:10.14.1.0"),
exclusions = Seq("org.apache.calcite:calcite-druid",
"org.apache.calcite.avatica:avatica",
"org.apache.curator:*",
"org.pentaho:pentaho-aggdesigner-algorithm"))

Expand All @@ -111,7 +112,6 @@ package object client {
extraDeps = Seq("org.apache.logging.log4j:log4j-api:2.10.0",
"org.apache.derby:derby:10.14.1.0"),
exclusions = Seq("org.apache.calcite:calcite-druid",
"org.apache.calcite.avatica:avatica",
"org.apache.curator:*",
"org.pentaho:pentaho-aggdesigner-algorithm"))

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,11 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
// avoid downloading Spark of different versions in each run.
private val sparkTestingDir = new File("/tmp/test-spark")
private val unusedJar = TestUtils.createJarWithClasses(Seq.empty)
val hiveVersion = if (SystemUtils.isJavaVersionAtLeast(JavaVersion.JAVA_9)) {
"2.3.8"
} else {
"1.2.1"
}

override def afterAll(): Unit = {
try {
Expand Down