
[SPARK-20557] [SQL] Only support TIMESTAMP WITH TIME ZONE for Oracle Dialect #19939


Closed
wants to merge 2 commits

Conversation

gatorsmile
Member

@gatorsmile gatorsmile commented Dec 10, 2017

What changes were proposed in this pull request?

In the previous PRs, #17832 and #17835, we converted TIMESTAMP WITH TIME ZONE and TIME WITH TIME ZONE to TIMESTAMP for all JDBC sources. However, this conversion could be risky since it does not respect our SQL configuration spark.sql.session.timeZone.

In addition, each vendor might have different semantics for these two types. For example, Postgres simply returns TIMESTAMP for TIMESTAMP WITH TIME ZONE. Such support should therefore be added case by case. This PR reverts the general support of TIMESTAMP WITH TIME ZONE and TIME WITH TIME ZONE for JDBC sources, except for the Oracle dialect.

For Oracle's TIMESTAMP WITH TIME ZONE, we only support it when the JVM default time zone is the same as the user-specified configuration spark.sql.session.timeZone (whose default is the JVM default time zone). We still treat TIMESTAMP WITH TIME ZONE as TIMESTAMP when fetching values via the Oracle JDBC connector, whose client converts timestamp-with-time-zone values to timestamp values in the local JVM default time zone (a test case is added to OracleIntegrationSuite.scala in this PR to show this behavior). Thus, to avoid any future behavior change, we do not support it if the JVM default time zone differs from spark.sql.session.timeZone.
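The guard described above can be sketched as follows. This is a minimal illustration, not the actual Spark code: the class, the `sessionTimeZone` field standing in for spark.sql.session.timeZone, and the string return values are all assumptions for the sketch.

```java
import java.sql.Types;
import java.util.TimeZone;

public class OracleTimestampTzSketch {
    // Stand-in for spark.sql.session.timeZone (defaults to the JVM zone).
    static String sessionTimeZone = TimeZone.getDefault().getID();

    // Mirrors the rule in this PR: map Oracle's TIMESTAMP WITH TIME ZONE
    // (java.sql.Types.TIMESTAMP_WITH_TIMEZONE) to TimestampType only when
    // the session time zone matches the JVM default; otherwise refuse.
    static String getCatalystType(int sqlType) {
        if (sqlType == Types.TIMESTAMP_WITH_TIMEZONE) {
            if (!sessionTimeZone.equals(TimeZone.getDefault().getID())) {
                throw new IllegalArgumentException(
                    "TIMESTAMP WITH TIME ZONE is not supported when "
                    + "spark.sql.session.timeZone differs from the JVM default time zone");
            }
            return "TimestampType";
        }
        return "unhandled";
    }

    public static void main(String[] args) {
        // Session zone equals the JVM default here, so the mapping succeeds.
        System.out.println(getCatalystType(Types.TIMESTAMP_WITH_TIMEZONE));
    }
}
```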

There is no regression because the previous two PRs were only merged to the unreleased master branch.

How was this patch tested?

Added test cases.

@gatorsmile
Member Author

gatorsmile commented Dec 10, 2017

@SparkQA

SparkQA commented Dec 10, 2017

Test build #84692 has finished for PR 19939 at commit 69c6e3e.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Dec 10, 2017

Jenkins, retest this please

@SparkQA

SparkQA commented Dec 10, 2017

Test build #84693 has finished for PR 19939 at commit 69c6e3e.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Dec 10, 2017

Test build #84697 has finished for PR 19939 at commit 69c6e3e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

val dfRead = sqlContext.read.jdbc(jdbcUrl, "ts_with_timezone", new Properties)
withTimeZone("PST") {
Contributor

How do you make sure the session time zone is the same as the JVM time zone?

Contributor

Oh I see, the session time zone is dynamically evaluated from the JVM time zone if not set, so we need to guarantee that the session time zone is not set here.

Contributor

I feel it's safer if we also set the session local time zone in withTimeZone.
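The withTimeZone helper discussed in this thread can be sketched roughly as below: temporarily override the JVM default time zone for the duration of a block, then restore it in a finally. The `Runnable` body is an assumption for illustration; the real Spark test helper is in Scala and, per the suggestion above, would also set spark.sql.session.timeZone.

```java
import java.util.TimeZone;

public class WithTimeZoneSketch {
    // Run the body with the given time zone as the JVM default, restoring
    // the original default afterwards even if the body throws.
    static void withTimeZone(String tzId, Runnable body) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(tzId));
            body.run();
        } finally {
            TimeZone.setDefault(saved); // always restore the original zone
        }
    }

    public static void main(String[] args) {
        withTimeZone("PST", () ->
            System.out.println("inside: " + TimeZone.getDefault().getID()));
        System.out.println("after: " + TimeZone.getDefault().getID());
    }
}
```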

@cloud-fan
Contributor

LGTM

Contributor

@jiangxb1987 jiangxb1987 left a comment

LGTM

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM, too.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Dec 11, 2017

Test build #84722 has finished for PR 19939 at commit 7cbd991.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -151,6 +151,8 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {

test("SPARK-20557: column type TIMESTAMP with TIME ZONE and TIME with TIME ZONE " +
"should be recognized") {
// When using JDBC to read the columns of TIMESTAMP with TIME ZONE and TIME with TIME ZONE

Member Author

That is for the support of ArrayType. We can revisit that one in the next PR.
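As the PR description notes, the Postgres JDBC driver reports TIMESTAMP WITH TIME ZONE columns as plain TIMESTAMP, so a dialect wanting to distinguish the two would have to inspect the vendor type name rather than the JDBC type code. A minimal sketch under that assumption (the mapping function and return strings are hypothetical illustrations, not Spark code; the typeName values are Postgres's):

```java
import java.sql.Types;

public class PostgresTypeNameSketch {
    // Since the driver reports both timestamp and timestamptz as
    // java.sql.Types.TIMESTAMP, only the vendor type name (as returned by
    // ResultSetMetaData.getColumnTypeName) can tell them apart.
    static String mapType(int sqlType, String typeName) {
        if (sqlType == Types.TIMESTAMP) {
            return "timestamptz".equals(typeName)
                ? "TimestampType (time-zone-aware source column)"
                : "TimestampType";
        }
        return "unhandled";
    }

    public static void main(String[] args) {
        System.out.println(mapType(Types.TIMESTAMP, "timestamptz"));
        System.out.println(mapType(Types.TIMESTAMP, "timestamp"));
    }
}
```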

@SparkQA

SparkQA commented Dec 12, 2017

Test build #84731 has finished for PR 19939 at commit 7cbd991.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

Since the test failure in SparkR is not related to this PR, I am merging this to master.

Thanks for the review!

@asfgit asfgit closed this in a400265 Dec 12, 2017
7 participants