@pull pull bot commented Aug 11, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.3)

urosstan-db and others added 4 commits August 11, 2025 21:45
### What changes were proposed in this pull request?
Log the generated JDBC query in the JDBC data source to the executor log4j logs.
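
The idea can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the object `JdbcQueryLoggingSketch`, its methods, and the query shape are hypothetical, and `java.util.logging` stands in for the log4j-based logging used on executors so that the example stays self-contained.

```scala
import java.util.logging.Logger

// Hypothetical helper for illustration only; not Spark's JDBC data source code.
object JdbcQueryLoggingSketch {
  // Stand-in logger (assumption): the PR description mentions executor log4j logs.
  private val logger = Logger.getLogger("JdbcQueryLoggingSketch")

  /** Builds a simple SELECT statement from column names, a table name and an optional filter. */
  def buildQuery(columns: Seq[String], table: String, whereClause: Option[String]): String = {
    val cols = if (columns.isEmpty) "1" else columns.mkString(", ")
    val where = whereClause.map(w => s" WHERE $w").getOrElse("")
    s"SELECT $cols FROM $table$where"
  }

  /** Builds the query, logs it once, and returns it so the caller can execute it. */
  def buildAndLogQuery(columns: Seq[String], table: String, whereClause: Option[String]): String = {
    val sql = buildQuery(columns, table, whereClause)
    logger.info(s"Generated JDBC query: $sql")
    sql
  }
}
```

Logging the final query string right before execution makes it possible to see exactly what was sent to the remote database when investigating a slow or failing read.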

### Why are the changes needed?
Observability improvements

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A. This is a simple change; the code path only needs to be exercised, and other suites (`JDBCSuite`) already cover it.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #51877 from urosstan-db/urosstan-db/spark-x-add-log-of-generated-sql-query-to-jdbc-rdd.

Authored-by: Uros Stankovic <155642965+urosstan-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?

This PR aims to support `createArray` in `SparkCollectionUtils` to take advantage of the `java.util.Arrays.fill()` method, which is faster than Scala's `Array.fill`.

Apache Spark uses `Array.fill()` many times.
```
$ git grep Array.fill | wc -l
     530
```

As the following example shows, the new method is much faster.

https://github.com/apache/spark/blob/d2dc05575294a9515564f62fac6473d9c9e5321e/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala#L870-L873

```scala
scala> spark.time((1 until 1024).map(Array.fill[Byte](5 * 1024 * 1024)('X')).size)
Time taken: 15 ms
val res0: Int = 1023

scala> spark.time((1 until 1024).map(org.apache.spark.util.SparkCollectionUtils.createArray[Byte](5 * 1024 * 1024, 'X')).size)
Time taken: 0 ms
val res1: Int = 1023
```
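
For readers unfamiliar with the approach, here is a minimal sketch of how such a helper could delegate to `java.util.Arrays.fill()`. It is an assumption-laden illustration, not the implementation added by this PR: the object name `ArrayFillSketch` is hypothetical, and only a few primitive types are special-cased.

```scala
import java.util.Arrays
import scala.reflect.ClassTag

// Hypothetical sketch; not the actual SparkCollectionUtils.createArray implementation.
object ArrayFillSketch {
  /** Allocates an array of `size` elements and fills it with `defaultValue`. */
  def createArray[T: ClassTag](size: Int, defaultValue: T): Array[T] = {
    // The ClassTag lets the JVM allocate a primitive array (e.g. byte[]) when T is primitive.
    val arr = new Array[T](size)
    (arr: Any) match {
      case a: Array[Byte] => Arrays.fill(a, defaultValue.asInstanceOf[Byte])
      case a: Array[Int] => Arrays.fill(a, defaultValue.asInstanceOf[Int])
      case a: Array[Long] => Arrays.fill(a, defaultValue.asInstanceOf[Long])
      case a: Array[Double] => Arrays.fill(a, defaultValue.asInstanceOf[Double])
      case a: Array[Boolean] => Arrays.fill(a, defaultValue.asInstanceOf[Boolean])
      case a: Array[AnyRef] => Arrays.fill(a, defaultValue.asInstanceOf[AnyRef])
      case _ =>
        // Fallback for element types not special-cased above (e.g. Short, Char, Float).
        var i = 0
        while (i < size) {
          arr(i) = defaultValue
          i += 1
        }
    }
    arr
  }
}
```

Unlike `Array.fill`, which takes its element as a by-name argument and evaluates it for every slot, this sketch evaluates the value once and writes it with a single bulk fill call, so it only suits constant fill values.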

### Why are the changes needed?

To provide a faster way to create filled arrays, as the following measurement shows.

```scala
$ bin/spark-shell --driver-memory 12G
...
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 4.1.0-SNAPSHOT
      /_/

Using Scala version 2.13.16 (OpenJDK 64-Bit Server VM, Java 21.0.8)
...
scala> spark.time(Array.fill[Byte](2_000_000_000)(7))
Time taken: 387 ms

scala> spark.time(org.apache.spark.util.SparkCollectionUtils.createArray[Byte](2_000_000_000, 7))
Time taken: 190 ms
```

### Does this PR introduce _any_ user-facing change?

No, this is a new utility method.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51968 from dongjoon-hyun/SPARK-53241.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Remove two obsolete TODO items

### Why are the changes needed?
I believe they are completely obsolete and confusing

### Does this PR introduce _any_ user-facing change?
no, code clean up

### How was this patch tested?
ci

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #51971 from zhengruifeng/remove_two_todo.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…abled confs during view creation

### What changes were proposed in this pull request?
If the `ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER` conf is explicitly set during view creation, it is stored with the view, which we might not want. When the view is queried, the stored value is used; this PR forbids that in order to avoid accidental failures. We do the same for `ANALYZER_SINGLE_PASS_RESOLVER_ENABLED_TENTATIVELY`.
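
One way to picture the change is as a filter applied to the explicitly set confs before they are captured with the view. The sketch below is hypothetical: the object `ViewConfFilterSketch` and its members are illustrative names, and the keys are placeholders named after the `SQLConf` constants above, since the real key strings do not appear in this description.

```scala
// Hypothetical sketch; not the code from the patch.
object ViewConfFilterSketch {
  // Placeholder keys (assumption): named after the SQLConf constants in the description.
  val confsNotToStore: Set[String] = Set(
    "ANALYZER_DUAL_RUN_LEGACY_AND_SINGLE_PASS_RESOLVER",
    "ANALYZER_SINGLE_PASS_RESOLVER_ENABLED_TENTATIVELY")

  /** Returns the explicitly set confs that may be stored with the view. */
  def confsToStore(explicitlySet: Map[String, String]): Map[String, String] =
    explicitlySet.filter { case (key, _) => !confsNotToStore.contains(key) }
}
```

With such a filter in place, querying the view later would fall back to the session's current values for the excluded confs rather than the values captured at creation time.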

### Why are the changes needed?
To make single-pass resolver runs more stable.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
Added tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #51972 from mihailoale-db/dontstoreqanconfs.

Authored-by: mihailoale-db <mihailo.aleksic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@pull pull bot locked and limited conversation to collaborators Aug 11, 2025
@pull pull bot added the ⤵️ pull label Aug 11, 2025
@pull pull bot merged commit 1bc8ce0 into huangxiaopingRD:master Aug 11, 2025
2 checks passed
4 participants