Skip to content

Commit

Permalink
[SPARK-23549][SQL] Rename config spark.sql.legacy.compareDateTimestam…
Browse files Browse the repository at this point in the history
…pInTimestamp

## What changes were proposed in this pull request?
See title. Makes our legacy backward compatibility configs more consistent.

## How was this patch tested?
Make sure all references have been updated:
```
> git grep compareDateTimestampInTimestamp
docs/sql-programming-guide.md:  - Since Spark 2.4, Spark compares a DATE type with a TIMESTAMP type after promotes both sides to TIMESTAMP. To set `false` to `spark.sql.legacy.compareDateTimestampInTimestamp` restores the previous behavior. This option will be removed in Spark 3.0.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:    // if conf.compareDateTimestampInTimestamp is true
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:      => if (conf.compareDateTimestampInTimestamp) Some(TimestampType) else Some(StringType)
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:      => if (conf.compareDateTimestampInTimestamp) Some(TimestampType) else Some(StringType)
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:    buildConf("spark.sql.legacy.compareDateTimestampInTimestamp")
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:  def compareDateTimestampInTimestamp : Boolean = getConf(COMPARE_DATE_TIMESTAMP_IN_TIMESTAMP)
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala:        "spark.sql.legacy.compareDateTimestampInTimestamp" -> convertToTS.toString) {
```

Closes apache#22508 from rxin/SPARK-23549.

Authored-by: Reynold Xin <rxin@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
  • Loading branch information
rxin authored and cloud-fan committed Sep 21, 2018
1 parent fb3276a commit 411ecc3
Show file tree
Hide file tree
Showing 3 changed files with 19 additions and 20 deletions.
2 changes: 1 addition & 1 deletion docs/sql-programming-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -1950,7 +1950,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
- Since Spark 2.4, writing an empty dataframe to a directory launches at least one write task, even if physically the dataframe has no partition. This introduces a small behavior change that for self-describing file formats like Parquet and Orc, Spark creates a metadata-only file in the target directory when writing a 0-partition dataframe, so that schema inference can still work if users read that directory later. The new behavior is more reasonable and more consistent regarding writing empty dataframe.
- Since Spark 2.4, expression IDs in UDF arguments do not appear in column names. For example, an column name in Spark 2.4 is not `UDF:f(col0 AS colA#28)` but ``UDF:f(col0 AS `colA`)``.
- Since Spark 2.4, writing a dataframe with an empty or nested empty schema using any file formats (parquet, orc, json, text, csv etc.) is not allowed. An exception is thrown when attempting to write dataframes with empty schema.
- Since Spark 2.4, Spark compares a DATE type with a TIMESTAMP type after promotes both sides to TIMESTAMP. To set `false` to `spark.sql.hive.compareDateTimestampInTimestamp` restores the previous behavior. This option will be removed in Spark 3.0.
- Since Spark 2.4, Spark compares a DATE type with a TIMESTAMP type after promotes both sides to TIMESTAMP. To set `false` to `spark.sql.legacy.compareDateTimestampInTimestamp` restores the previous behavior. This option will be removed in Spark 3.0.
- Since Spark 2.4, creating a managed table with nonempty location is not allowed. An exception is thrown when attempting to create a managed table with nonempty location. To set `true` to `spark.sql.allowCreatingManagedTableUsingNonemptyLocation` restores the previous behavior. This option will be removed in Spark 3.0.
- Since Spark 2.4, renaming a managed table to existing location is not allowed. An exception is thrown when attempting to rename a managed table to existing location.
- Since Spark 2.4, the type coercion rules can automatically promote the argument types of the variadic SQL functions (e.g., IN/COALESCE) to the widest common type, no matter how the input arguments order. In prior Spark versions, the promotion could fail in some specific orders (e.g., TimestampType, IntegerType and StringType) and throw an exception.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -577,16 +577,6 @@ object SQLConf {
.checkValues(HiveCaseSensitiveInferenceMode.values.map(_.toString))
.createWithDefault(HiveCaseSensitiveInferenceMode.INFER_AND_SAVE.toString)

val TYPECOERCION_COMPARE_DATE_TIMESTAMP_IN_TIMESTAMP =
buildConf("spark.sql.typeCoercion.compareDateTimestampInTimestamp")
.internal()
.doc("When true (default), compare Date with Timestamp after converting both sides to " +
"Timestamp. This behavior is compatible with Hive 2.2 or later. See HIVE-15236. " +
"When false, restore the behavior prior to Spark 2.4. Compare Date with Timestamp after " +
"converting both sides to string. This config will be removed in spark 3.0")
.booleanConf
.createWithDefault(true)

val OPTIMIZER_METADATA_ONLY = buildConf("spark.sql.optimizer.metadataOnly")
.doc("When true, enable the metadata-only query optimization that use the table's metadata " +
"to produce the partition columns instead of table scans. It applies when all the columns " +
Expand Down Expand Up @@ -1476,12 +1466,6 @@ object SQLConf {
.booleanConf
.createWithDefault(true)

val LEGACY_SIZE_OF_NULL = buildConf("spark.sql.legacy.sizeOfNull")
.doc("If it is set to true, size of null returns -1. This behavior was inherited from Hive. " +
"The size function returns null for null input if the flag is disabled.")
.booleanConf
.createWithDefault(true)

val REPL_EAGER_EVAL_ENABLED = buildConf("spark.sql.repl.eagerEval.enabled")
.doc("Enables eager evaluation or not. When true, the top K rows of Dataset will be " +
"displayed if and only if the REPL supports the eager evaluation. Currently, the " +
Expand Down Expand Up @@ -1531,6 +1515,22 @@ object SQLConf {
.checkValues((1 to 9).toSet + Deflater.DEFAULT_COMPRESSION)
.createWithDefault(Deflater.DEFAULT_COMPRESSION)

val COMPARE_DATE_TIMESTAMP_IN_TIMESTAMP =
buildConf("spark.sql.legacy.compareDateTimestampInTimestamp")
.internal()
.doc("When true (default), compare Date with Timestamp after converting both sides to " +
"Timestamp. This behavior is compatible with Hive 2.2 or later. See HIVE-15236. " +
"When false, restore the behavior prior to Spark 2.4. Compare Date with Timestamp after " +
"converting both sides to string. This config will be removed in Spark 3.0.")
.booleanConf
.createWithDefault(true)

val LEGACY_SIZE_OF_NULL = buildConf("spark.sql.legacy.sizeOfNull")
.doc("If it is set to true, size of null returns -1. This behavior was inherited from Hive. " +
"The size function returns null for null input if the flag is disabled.")
.booleanConf
.createWithDefault(true)

val LEGACY_REPLACE_DATABRICKS_SPARK_AVRO_ENABLED =
buildConf("spark.sql.legacy.replaceDatabricksSparkAvro.enabled")
.doc("If it is set to true, the data source provider com.databricks.spark.avro is mapped " +
Expand Down Expand Up @@ -1691,8 +1691,7 @@ class SQLConf extends Serializable with Logging {
def caseSensitiveInferenceMode: HiveCaseSensitiveInferenceMode.Value =
HiveCaseSensitiveInferenceMode.withName(getConf(HIVE_CASE_SENSITIVE_INFERENCE))

def compareDateTimestampInTimestamp : Boolean =
getConf(TYPECOERCION_COMPARE_DATE_TIMESTAMP_IN_TIMESTAMP)
def compareDateTimestampInTimestamp : Boolean = getConf(COMPARE_DATE_TIMESTAMP_IN_TIMESTAMP)

def gatherFastStats: Boolean = getConf(GATHER_FASTSTAT)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1459,7 +1459,7 @@ class TypeCoercionSuite extends AnalysisTest {
DoubleType)))
Seq(true, false).foreach { convertToTS =>
withSQLConf(
"spark.sql.typeCoercion.compareDateTimestampInTimestamp" -> convertToTS.toString) {
"spark.sql.legacy.compareDateTimestampInTimestamp" -> convertToTS.toString) {
val date0301 = Literal(java.sql.Date.valueOf("2017-03-01"))
val timestamp0301000000 = Literal(Timestamp.valueOf("2017-03-01 00:00:00"))
val timestamp0301000001 = Literal(Timestamp.valueOf("2017-03-01 00:00:01"))
Expand Down

0 comments on commit 411ecc3

Please sign in to comment.