[SPARK-51172][SS] Rename to spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan #49897

HyukjinKwon · 2025-02-12T00:57:07Z

What changes were proposed in this pull request?

This PR is a followup of #48149 that proposes to rename spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan to spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan.

Why are the changes needed?

For consistent naming.

Does this PR introduce any user-facing change?

No, the main PR has not been released yet.

How was this patch tested?

CI should test it out.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun · 2025-02-12T01:13:56Z

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -4110,7 +4110,7 @@ object SQLConf {
      .createWithDefault(ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH)

  val PRUNE_FILTERS_CAN_PRUNE_STREAMING_SUBPLAN =
-    buildConf("spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan")
+    buildConf("spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan")


Nice catch!

dongjoon-hyun

+1, LGTM. Thank you, @HyukjinKwon .

dongjoon-hyun

Oh, wait, @HyukjinKwon .

It seems to be too late as a follow-up because it's released already.

dongjoon-hyun · 2025-02-12T01:53:19Z

Could you file a new JIRA issue for this because this should have a fix version 3.5.5?

HyukjinKwon · 2025-02-12T01:59:12Z

uhoh

HyukjinKwon · 2025-02-12T01:59:36Z

sure, will do

HyukjinKwon · 2025-02-12T02:05:52Z

Hmmmm .. @HeartSaVioR WDYT? Should we make this change in the master branch only?

dongjoon-hyun

+1, LGTM again.

BTW, I made a new scalastyle rule to prevent this mistake because we missed it completely in 3.5.4 and almost missed it in 4.0.0, @HyukjinKwon .

[SPARK-51173][TESTS] Add configName Scalastyle rule #49900

HeartSaVioR · 2025-02-12T02:12:02Z

Let's fix this in master/4.0 first to avoid shipping it in more releases. We should probably think through how to not break 3.5 itself, and upgrades from 3.5 to 4.0+, when fixing in 3.5. If we hadn't released this in 3.5, that'd be ideal, but...

HyukjinKwon · 2025-02-12T02:23:34Z

Merged to master and branch-4.0.

…treamingSubplan

### What changes were proposed in this pull request?

This PR is a followup of #48149 that proposes to rename `spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan` to `spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan`

### Why are the changes needed?

For consistent naming.

### Does this PR introduce _any_ user-facing change?

No, the main PR has not been released yet.

### How was this patch tested?

CI should test it out.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49897 from HyukjinKwon/SPARK-49699.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit decb677)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>


### What changes were proposed in this pull request?

This PR aims to add `configName` Scalastyle rule to prevent invalid config names.

### Why are the changes needed?

To prevent repetitive mistake pattern
- #45649
- #48149
- #49121

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Currently, this PR will fail at Scalastyle test because the `master` branch is broken. We can merge this after the following PR.
- #49897

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49900 from dongjoon-hyun/SPARK-51173.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 274dc5e)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
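To illustrate the kind of mistake such a style rule guards against, here is a minimal sketch of a source scan that flags config definitions using the vendor-specific prefix. It is not the actual rule added in #49900, whose definition is not shown in this thread; the object name `ConfigNameCheckSketch`, the method `violations`, and the choice of pattern are all assumptions made for illustration.

```scala
// Hypothetical sketch only: this is NOT the Scalastyle rule from #49900.
object ConfigNameCheckSketch {
  // Assumption: a config name is "invalid" when it is declared through
  // buildConf with the "spark.databricks." prefix.
  private val Forbidden = """buildConf\("spark\.databricks\.""".r

  /** Returns (1-based line number, trimmed line) for every offending line. */
  def violations(source: String): Seq[(Int, String)] =
    source.linesIterator.zipWithIndex.collect {
      case (line, idx) if Forbidden.findFirstIn(line).isDefined =>
        (idx + 1, line.trim)
    }.toSeq
}
```

Run against the two versions of the line changed in this PR, the sketch would report the old `spark.databricks.sql.optimizer...` definition and accept the renamed one.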

HeartSaVioR · 2025-02-12T03:20:33Z

There are two cases:

1. The streaming query was started on Spark 3.5.4.
2. The streaming query was started before Spark 3.5.4 and was later migrated to Spark 3.5.4.

1>
When a new query is started on Spark 3.5.4, there is no offset log to read the static config back from, so the config spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan takes its default value, true.

This value is written to the offset log so that it is kept for the lifetime of the streaming query.

When they upgrade to a Spark version in which the config has been renamed, there is an offset log to read the static config back from, but it has no entry for spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan, hence we enable backward compatibility mode and set spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan to false.

This could break the query if the rule affects it, because the fix flips from enabled to disabled.

2>
When they upgrade an existing streaming query from an older version to Spark 3.5.4, there is an offset log to read the static config back from, but it has no entry for spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan, hence we enable backward compatibility mode and set spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan to false. So the fix is disabled.

When they further upgrade this query to a Spark version in which the config has been renamed, the same thing applies and the fix stays disabled. So there is no change.

I feel like this is a sort of "dead end": if we don't want case 1 to break, the only way is to add an alias for the config, which forces us to keep this problematic config name much longer (probably forever, as technically users can jump from Spark 3.5.4 to any future version). The chance of hitting this case is not very high, but users who do hit it would need to drop the checkpoint and start from scratch, which is unpleasant.

Not sure how we could inform Spark 3.5.4 users about this. Maybe we should mention it in the release notes for 3.5.5 / 4.0.0?

Anyway, we should apply the same fix in the Spark 3.5 branch line. I can't imagine there is any other way around it.
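The two cases can be traced with a small model. This is a sketch of the behavior as described in the comment above, not Spark's actual offset-log code: `OffsetLogCompatSketch`, `resolve`, and the `aliases` parameter are hypothetical names, and the offset log is reduced to a plain `Map`.

```scala
// Hypothetical model of the behavior described above; not Spark's implementation.
object OffsetLogCompatSketch {
  val OldKey = "spark.databricks.sql.optimizer.pruneFiltersCanPruneStreamingSubplan"
  val NewKey = "spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan"

  /**
   * Effective value of `key` for a streaming query.
   *  - None: a brand-new query with no offset log, so the default (true) applies.
   *  - Some(conf): a restarted query; a missing entry means backward
   *    compatibility mode, i.e. false.
   * `aliases` models the "alias of config" idea: older names that are
   * consulted when `key` itself is absent from the offset log.
   */
  def resolve(
      key: String,
      offsetLogConf: Option[Map[String, String]],
      aliases: Seq[String] = Nil): Boolean =
    offsetLogConf match {
      case None => true
      case Some(conf) =>
        (key +: aliases)
          .collectFirst { case k if conf.contains(k) => conf(k).toBoolean }
          .getOrElse(false)
    }

  def main(args: Array[String]): Unit = {
    // Case 1: query started on 3.5.4; the default is persisted under the old key.
    val logFrom354 = Map(OldKey -> resolve(OldKey, None).toString)
    println(resolve(NewKey, Some(logFrom354)))               // false: the fix is flipped off
    println(resolve(NewKey, Some(logFrom354), Seq(OldKey)))  // true: an alias preserves it

    // Case 2: query started before 3.5.4; neither key is in the offset log.
    val olderLog = Map.empty[String, String]
    println(resolve(OldKey, Some(olderLog)))                 // false on 3.5.4
    println(resolve(NewKey, Some(olderLog)))                 // false after the rename
  }
}
```

In this model only case 1 changes value across the rename, and only the alias lookup keeps it stable, which is why the alias would have to stay for as long as queries started on 3.5.4 may be upgraded.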


WweiL · 2025-02-12T19:17:57Z

Also a tiny fix if folks can merge
@HeartSaVioR @HyukjinKwon @dongjoon-hyun
#49911


HeartSaVioR · 2025-03-17T03:54:10Z

According to https://lists.apache.org/thread/25xct2zv31kh1plpy766crs3bd6go5s6, I'm -0.99 on this PR, and I'm willing to cast -1 on it anytime soon, because my comment has remained unaddressed for more than a month since this PR was merged.

#49897 (comment)

Let's fix this in master/4.0 first to avoid shipping it in more releases. We should probably think through how to not break 3.5 itself, and upgrades from 3.5 to 4.0+, when fixing in 3.5. If we hadn't released this in 3.5, that'd be ideal, but...

I got thumbs up, so I am not the only one who thinks this is critical. There is an argument that "users can upgrade to Spark 3.5.5 before Spark 4.0.0", which I do not agree with. Since both sides are casting -1 on each other's proposal, I think my comment is NOT addressed.

I'm waiting for feedback in dev@, but if we fail to reach consensus on the migration approach, I really have to cast -1 and revert this PR in master/4.0.

HeartSaVioR · 2025-03-17T04:04:28Z

Here, "my comment is NOT addressed" means:

My proposal is stuck on a VETO.
Dongjoon's proposal has never been discussed in public.

So we have never had a way to address my question "We should probably think through how to not break 3.5 itself + upgrading from 3.5 to 4.0+ when fixing in 3.5.", which got the community's consensus.

This is a precondition of merging this PR, and I hoped we would address the issue quickly, hence the migration logic came up, but...

followup

2e544ff

github-actions bot added the SQL label Feb 12, 2025

dongjoon-hyun reviewed Feb 12, 2025


dongjoon-hyun approved these changes Feb 12, 2025


dongjoon-hyun requested changes Feb 12, 2025


HyukjinKwon changed the title ~~[SPARK-49699][SS][FOLLOW-UP] Rename to spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan~~ [SPARK-51172][SS] Rename to spark.sql.optimizer.pruneFiltersCanPruneStreamingSubplan Feb 12, 2025

HyukjinKwon force-pushed the SPARK-49699 branch from 71a09c9 to 2e544ff Compare February 12, 2025 02:04

dongjoon-hyun mentioned this pull request Feb 12, 2025

[SPARK-51173][TESTS] Add configName Scalastyle rule #49900

Closed

dongjoon-hyun approved these changes Feb 12, 2025


HyukjinKwon closed this in decb677 Feb 12, 2025

WweiL mentioned this pull request Feb 12, 2025

[MINOR][SQL][TESTS] Remove redundant space at PropagateEmptyRelationSuite class definition #49911

Closed
