+ - In version 2.3 and earlier, `to_utc_timestamp` and `from_utc_timestamp` respect the timezone in the input timestamp string, which breaks the assumption that the input timestamp is in a specific timezone and produces unexpected results. In version 2.4 and later, this problem has been fixed: `to_utc_timestamp` and `from_utc_timestamp` return null if the input timestamp string contains a timezone. As an example, `from_utc_timestamp('2000-10-10 00:00:00', 'GMT+1')` should return `2000-10-10 01:00:00`. If the input timestamp string contains a timezone, e.g. `from_utc_timestamp('2000-10-10 00:00:00+00:00', 'GMT+1')`, it returns `2000-10-10 09:00:00` in Spark 2.3 (with a local timezone of GMT+8) and null in Spark 2.4. If you do not care about this problem and want to retain the previous behavior to keep your queries unchanged, you can set `spark.sql.function.rejectTimezoneInString` to false. This option will be removed in Spark 3.0 and should only be used as a temporary workaround.
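The `09:00:00` result in the note above can be reproduced with a small toy model. The sketch below is plain Python, not Spark code: the helper names `legacy_from_utc_timestamp` and `strict_from_utc_timestamp` are hypothetical, the local timezone of GMT+8 is taken from the example, and the two-step logic is only an assumption about why the old result came out the way it did.

```python
from datetime import datetime, timedelta, timezone

# Assumption from the example in the note: the local timezone is GMT+8.
LOCAL_TZ = timezone(timedelta(hours=8))


def legacy_from_utc_timestamp(ts_string, target_offset_hours):
    """Toy model of the 2.3-style result (hypothetical, not Spark's code)."""
    parsed = datetime.fromisoformat(ts_string)
    if parsed.tzinfo is None:
        # No offset in the string: use the wall-clock value as written.
        wall = parsed
    else:
        # Offset in the string is honored, so the value is first moved
        # to the local timezone before the function sees it.
        wall = parsed.astimezone(LOCAL_TZ).replace(tzinfo=None)
    # The function then treats that wall-clock value as UTC and shifts it
    # to the target zone, here given as a fixed offset in hours.
    return wall + timedelta(hours=target_offset_hours)


def strict_from_utc_timestamp(ts_string, target_offset_hours):
    """Toy model of the 2.4-style result: reject strings with an offset."""
    parsed = datetime.fromisoformat(ts_string)
    if parsed.tzinfo is not None:
        return None  # stands in for SQL null
    return parsed + timedelta(hours=target_offset_hours)


print(legacy_from_utc_timestamp("2000-10-10 00:00:00", 1))        # 2000-10-10 01:00:00
print(legacy_from_utc_timestamp("2000-10-10 00:00:00+00:00", 1))  # 2000-10-10 09:00:00
print(strict_from_utc_timestamp("2000-10-10 00:00:00+00:00", 1))  # None
```

In this model the extra eight hours come from the shift into the local timezone, which is applied before the one-hour shift to `GMT+1`. Rejecting strings that carry an offset removes that double adjustment.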