@pull pull bot commented Dec 2, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

ganeshas-db and others added 5 commits December 1, 2025 11:46
…n filter failures

### What changes were proposed in this pull request?

This PR refactors the error handling for Hive metastore partition filter failures by migrating from the legacy error code _LEGACY_ERROR_TEMP_2193 to a properly defined error condition INTERNAL_ERROR_HIVE_METASTORE_PARTITION_FILTER with SQL state 58030. The error message is restructured to include the underlying exception details.

### Why are the changes needed?

The previous error message was verbose and lacked important diagnostic information. The legacy error code needed to be migrated to a proper error condition with an appropriate SQL state for better error categorization.

### Does this PR introduce _any_ user-facing change?

Yes. Users will see an improved error message that includes the actual exception details and clearer guidance.

### How was this patch tested?

Updated existing unit tests in HivePartitionFilteringSuite and ExternalCatalogSuite to verify the new error condition.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.5

Closes #53212 from ganeshashree/SPARK-54501.

Authored-by: Ganesha S <ganesha.s@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?

Enabled flake8 [F811](https://www.flake8rules.com/rules/F811.html) check on our repo and fixed reported issues.

### Why are the changes needed?

I know upgrading the lint system is a pain, but we should not just put it aside forever. Our pinned `flake8` version is not even usable on Python 3.12+.

During this "lint fix", I actually discovered a few real bugs. Most of them are silently disabled unit tests: a later test method has the same name as an earlier one (probably due to copy/paste), so the earlier one never runs. I think this result supports the idea that we should take lint more seriously.
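
To illustrate how a duplicated method name silently disables a test, here is a minimal sketch. The class and method names are hypothetical and are not taken from the Spark test suites. The second `def` rebinds the name in the class body, so the first method is never collected, which is exactly what flake8 F811 reports.

```python
import unittest


class ExampleTests(unittest.TestCase):  # hypothetical test class
    def test_addition(self):
        # First definition: discarded when the name is rebound below,
        # so this deliberately wrong assertion never runs.
        self.assertEqual(1 + 1, 3)

    def test_addition(self):  # noqa: F811 (flake8 flags this redefinition)
        # Second definition: the only one unittest can see.
        self.assertEqual(2 + 2, 4)


loader = unittest.TestLoader()
collected = list(loader.getTestCaseNames(ExampleTests))

result = unittest.TestResult()
loader.loadTestsFromTestCase(ExampleTests).run(result)

print(collected)        # a single name, although two methods were written
print(result.testsRun)  # one test ran; the broken one was never executed
```

The run reports success even though the first method would fail, which is why a check like F811 can surface real gaps in test coverage.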

About `functions.log`, we got it wrong. It's not because `overload` does not work properly - it's because we have two `log` functions in that gigantic file. The former one is [dead](https://app.codecov.io/gh/apache/spark/blob/master/python%2Fpyspark%2Fsql%2Ffunctions%2Fbuiltin.py#L3111). I just removed that one.

Again, I really think we should upgrade our lint system. I'm trying to do it slowly - piece by piece, so that people's daily workflow is not impacted too much.

I hope we can eventually move to a place where all linters are updated and people can be more confident about their changes.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

`flake8` test on major directories. CI should give a more comprehensive result.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #53253 from gaogaotiantian/flake8-f811.

Authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
…h no BEGIN/END are used

### What changes were proposed in this pull request?
When an exception handler without a BEGIN-END body was triggered, an internal `java.util.NoSuchElementException` was thrown. The handler should instead execute properly, or propagate the new error if one is raised inside the handler.

```
BEGIN
  DECLARE EXIT HANDLER FOR SQLEXCEPTION
    SELECT 1;

  SELECT 1/0;
END
```

### Why are the changes needed?
The code had a bug that threw an internal error for what should be valid user code.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
New unit tests in `SqlScriptingExecutionSuite`.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #53271 from miland-db/milan-dankovic_data/fix-no-body-handlers.

Authored-by: Milan Dankovic <milan.dankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…rces

### What changes were proposed in this pull request?

Introduces the OffsetMap format, which keys source progress by source name rather than by ordinal in the logical plan.

### Why are the changes needed?

These changes enable source evolution on a streaming query (adding, removing, or reordering sources) without requiring the user to set a new checkpoint directory.
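
The motivation can be sketched with a toy model. This is not Spark's implementation: the function names, source names, and offset values below are all hypothetical, and offsets are reduced to plain integers. It only shows why progress keyed by name survives a reordering of sources while progress keyed by ordinal does not.

```python
from typing import Dict, List, Optional


def restore_by_ordinal(saved: List[int], sources: List[str]) -> Dict[str, Optional[int]]:
    """Match saved offsets to sources by position in the plan (hypothetical)."""
    return {
        name: (saved[i] if i < len(saved) else None)
        for i, name in enumerate(sources)
    }


def restore_by_name(saved: Dict[str, int], sources: List[str]) -> Dict[str, Optional[int]]:
    """Match saved offsets to sources by source name (hypothetical)."""
    return {name: saved.get(name) for name in sources}


# Progress recorded while the query read ["kafka_orders", "kafka_clicks"].
ordinal_checkpoint = [100, 250]
named_checkpoint = {"kafka_orders": 100, "kafka_clicks": 250}

# The query is later edited: sources are reordered and a new one is added.
new_sources = ["kafka_clicks", "kafka_orders", "files_returns"]

by_ordinal = restore_by_ordinal(ordinal_checkpoint, new_sources)
by_name = restore_by_name(named_checkpoint, new_sources)

print(by_ordinal)  # kafka_clicks wrongly inherits the offset saved for kafka_orders
print(by_name)     # each existing source keeps its own offset; the new one has none
```

In this toy model, a source with no saved offset maps to `None`, standing in for "start from the beginning"; how the real format treats new or removed sources is not described in this commit message.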

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #53123 from ericm-db/offset-map.

Authored-by: ericm-db <eric.marnadi@databricks.com>
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
…th Arrow file format

### What changes were proposed in this pull request?
FPGrowth now supports the local filesystem.

### Why are the changes needed?
To make FPGrowth work with the local filesystem.

### Does this PR introduce _any_ user-facing change?
Yes, FPGrowth will work when local saving mode is on.

### How was this patch tested?
updated tests

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #53232 from zhengruifeng/local_fs_fpg_with_file.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@pull pull bot locked and limited conversation to collaborators Dec 2, 2025
@pull pull bot added the ⤵️ pull label Dec 2, 2025
@pull pull bot merged commit d4e34f5 into huangxiaopingRD:master Dec 2, 2025
2 checks passed