[SPARK-18538] [SQL] [Backport-2.1] Fix Concurrent Table Fetching Using DataFrameReader JDBC APIs by gatorsmile · Pull Request #16111 · apache/spark

gatorsmile · 2016-12-01T23:52:30Z

What changes were proposed in this pull request?

This PR backports #15975 to branch-2.1.

The following two DataFrameReader JDBC APIs ignore the user-specified degree of parallelism.

  def jdbc(
      url: String,
      table: String,
      columnName: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int,
      connectionProperties: Properties): DataFrame

  def jdbc(
      url: String,
      table: String,
      predicates: Array[String],
      connectionProperties: Properties): DataFrame
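For context, the requested parallelism comes from numPartitions in the first overload and from the number of entries in predicates in the second. The sketch below is a Spark-free illustration of how a column range could be split into one WHERE clause per partition. The object name PartitionSketch, the function rangePredicates, and the stride arithmetic are assumptions made for illustration; this is not Spark's JDBCRelation code.

```scala
// Illustrative only: a simplified model of range partitioning, not Spark's implementation.
object PartitionSketch {

  /** Build one WHERE clause per partition for `column`, covering all rows exactly once. */
  def rangePredicates(
      column: String,
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): Seq[String] = {
    require(numPartitions >= 1, "numPartitions must be at least 1")
    require(lowerBound < upperBound, "lowerBound must be below upperBound")

    if (numPartitions == 1) {
      // A single partition reads the whole table.
      Seq("1=1")
    } else {
      // Assumed stride rule: evenly divide the range, but never use a stride below 1.
      val stride = math.max(1L, (upperBound - lowerBound) / numPartitions)
      // Interior boundaries between consecutive partitions.
      val bounds = (1 until numPartitions).map(i => lowerBound + i * stride)

      // First partition is open below and also picks up NULL values.
      val first = s"$column < ${bounds.head} OR $column IS NULL"
      // Middle partitions are half-open intervals between neighbouring boundaries.
      val middle = bounds.sliding(2).collect {
        case Seq(lo, hi) => s"$column >= $lo AND $column < $hi"
      }.toVector
      // Last partition is open above.
      val last = s"$column >= ${bounds.last}"

      (first +: middle) :+ last
    }
  }
}
```

Under these assumptions, each returned string could also be supplied as one element of the predicates array in the second overload, so either call style would request the same number of partitions.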

This PR fixes both issues. To verify correct behavior, the plan output of the EXPLAIN command now includes numPartitions in the JDBCRelation node.

Before the fix,

== Physical Plan ==
*Scan JDBCRelation(TEST.PEOPLE) [NAME#1896,THEID#1897] ReadSchema: struct<NAME:string,THEID:int>

After the fix,

== Physical Plan ==
*Scan JDBCRelation(TEST.PEOPLE) [numPartitions=3] [NAME#1896,THEID#1897] ReadSchema: struct<NAME:string,THEID:int>
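One way to assert on this plan output is to extract the partition count from the plan text. The helper below is a hypothetical sketch (PlanCheck and numPartitionsIn are not part of Spark or of this PR) and assumes the plan has already been captured as a string; the PR's own tests may verify the count differently.

```scala
// Hypothetical helper: pulls the partition count out of a JDBCRelation plan line.
object PlanCheck {
  // Matches the "[numPartitions=N]" fragment shown in the plan output after the fix.
  private val NumPartitions = """\[numPartitions=(\d+)\]""".r

  /** Returns Some(N) if the plan text contains "[numPartitions=N]", otherwise None. */
  def numPartitionsIn(planText: String): Option[Int] =
    NumPartitions.findFirstMatchIn(planText).map(_.group(1).toInt)
}
```

Applied to the two plans above, this would return None for the pre-fix output, which carries no partition information, and the reported count for the post-fix output.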

How was this patch tested?

Added verification logic to all the test cases for JDBC concurrent fetching.

cloud-fan · 2016-12-02T01:04:04Z

LGTM, pending jenkins

SparkQA · 2016-12-02T02:49:16Z

Test build #69518 has finished for PR 16111 at commit 170568d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2016-12-02T03:15:52Z

thanks, merging to 2.1!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #16111 from gatorsmile/jdbcFix2.1.

gatorsmile · 2016-12-02T03:16:47Z

Thanks!

fix.

170568d

gatorsmile closed this Dec 2, 2016

gatorsmile wants to merge 1 commit into apache:branch-2.1 from gatorsmile:jdbcFix2.1
