-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-51187][SQL][SS][3.5] Implement the graceful deprecation of incorrect config introduced in SPARK-49699 #49985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for making this PR, @HeartSaVioR . I sent an email for further discussion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM for Apache Spark 3.5.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, could you the old config to deprecatedSQLConfigs
, @HeartSaVioR , please?
val deprecatedSQLConfigs: Map[String, DeprecatedConfig] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The vote passed. Could you address the above review comment, @HeartSaVioR ?
@dongjoon-hyun I've pushed the commit 673f789 to reflect your feedback. Would you mind having a review again? Thanks! |
Thank you! Sure, @HeartSaVioR . |
@@ -4480,7 +4481,9 @@ object SQLConf { | |||
DeprecatedConfig(LEGACY_REPLACE_DATABRICKS_SPARK_AVRO_ENABLED.key, "3.2", | |||
"""Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead."""), | |||
DeprecatedConfig(COALESCE_PARTITIONS_MIN_PARTITION_NUM.key, "3.2", | |||
s"Use '${COALESCE_PARTITIONS_MIN_PARTITION_SIZE.key}' instead.") | |||
s"Use '${COALESCE_PARTITIONS_MIN_PARTITION_SIZE.key}' instead."), | |||
DeprecatedConfig(PRUNE_FILTERS_CAN_PRUNE_STREAMING_SUBPLAN.alternatives.head, "3.5.4", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the definition, this should be 3.5.5
.
Version of Spark where key was deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry was confused. Nice finding!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, it looks good to me. While waiting for CIs, let me verify it manually too.
I verified the deprecation warning messages. Thank you again.
|
The very previous commit passed the CI and the last commit is only to change the version number string to Let me merge this. |
…orrect config introduced in SPARK-49699 ### What changes were proposed in this pull request? This PR proposes to implement the graceful deprecation of incorrect config introduced in SPARK-49699. SPARK-49699 was included in Spark 3.5.4, hence we can't simply rename to fix the issue. Also, since the incorrect config is logged in offset log in streaming query, the fix isn't just easy like adding withAlternative and done. We need to manually handle the case where offset log contains the incorrect config, and set the value of incorrect config in the offset log into the new config. Once a single microbatch has planned after the restart (hence the above logic is applied), offset log will contain the "new" config and it will no longer refer to the incorrect config. That said, we can remove the incorrect config in the Spark version which we are confident that there will be no case users will upgrade from Spark 3.5.4 to that version. ### Why are the changes needed? We released an incorrect config and we want to rename it properly. While renaming, we don't also want to have any breakage on the existing streaming query. ### Does this PR introduce _any_ user-facing change? No. That is what this PR is aiming for. ### How was this patch tested? New UT. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49985 from HeartSaVioR/SPARK-51187-3.5. Authored-by: Jungtaek Lim <kabhwan.opensource@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Merged to branch-3.5. |
Thank you, @HeartSaVioR . |
What changes were proposed in this pull request?
This PR proposes to implement the graceful deprecation of incorrect config introduced in SPARK-49699.
SPARK-49699 was included in Spark 3.5.4, hence we can't simply rename to fix the issue.
Also, since the incorrect config is logged in offset log in streaming query, the fix isn't just easy like adding withAlternative and done. We need to manually handle the case where offset log contains the incorrect config, and set the value of incorrect config in the offset log into the new config. Once a single microbatch has planned after the restart (hence the above logic is applied), offset log will contain the "new" config and it will no longer refer to the incorrect config.
That said, we can remove the incorrect config in the Spark version which we are confident that there will be no case users will upgrade from Spark 3.5.4 to that version.
Why are the changes needed?
We released an incorrect config and we want to rename it properly. While renaming, we don't also want to have any breakage on the existing streaming query.
Does this PR introduce any user-facing change?
No. That is what this PR is aiming for.
How was this patch tested?
New UT.
Was this patch authored or co-authored using generative AI tooling?
No.