
Commit f1ac35b

[SPARK-51049][CORE] Increase S3A Vector IO threshold for range merge
### What changes were proposed in this pull request?

This PR aims to increase the S3A Vectored IO thresholds for range merging.

### Why are the changes needed?

Apache Spark 4.0.0 supports Hadoop Vectored IO via ORC and Parquet. As a part of [HADOOP-18855 VectorIO API tuning/stabilization](https://issues.apache.org/jira/browse/HADOOP-18855), Apache Hadoop 3.4.2 will have new default threshold values. We had better adopt these updates in advance, until Apache Hadoop 3.4.2 is released.

- apache/hadoop#7281

### Does this PR introduce _any_ user-facing change?

No, Hadoop Vectored IO features are new in Apache Spark 4.0.0.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49748 from dongjoon-hyun/SPARK-51049.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit b62c3f4)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
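To make "threshold for range merge" concrete, here is a minimal sketch of how a minimum seek size and a maximum merged size could interact when nearby read ranges are coalesced. It is a simplified illustration, not Hadoop's actual implementation: `ByteRange`, `RangeMergeSketch`, and `merge` are hypothetical names, and the 4K/1M pair in `main` is just an arbitrary smaller setting for comparison, not a claim about the previous defaults.

```scala
// Simplified illustration of vectored-read range merging (not Hadoop's code).
final case class ByteRange(offset: Long, length: Long) {
  def end: Long = offset + length
}

object RangeMergeSketch {
  /** Coalesce sorted ranges when the gap between them is below `minSeek`
    * and the combined span does not exceed `maxMerged`. */
  def merge(ranges: Seq[ByteRange], minSeek: Long, maxMerged: Long): Seq[ByteRange] =
    ranges.sortBy(_.offset).foldLeft(Vector.empty[ByteRange]) { (acc, r) =>
      acc.lastOption match {
        case Some(prev)
            if r.offset - prev.end < minSeek &&
              math.max(prev.end, r.end) - prev.offset <= maxMerged =>
          // Extend the previous range to cover the new one.
          acc.init :+ ByteRange(prev.offset, math.max(prev.end, r.end) - prev.offset)
        case _ =>
          // Gap too large or merged span too big: start a new range.
          acc :+ r
      }
    }

  def main(args: Array[String]): Unit = {
    val K = 1024L
    val M = 1024L * K
    // Three 64K reads separated by 100K gaps.
    val reads = Seq(ByteRange(0, 64 * K), ByteRange(164 * K, 64 * K), ByteRange(328 * K, 64 * K))
    println(merge(reads, minSeek = 4 * K, maxMerged = 1 * M).size)   // prints 3
    println(merge(reads, minSeek = 128 * K, maxMerged = 2 * M).size) // prints 1
  }
}
```

With the larger thresholds, the three reads collapse into a single request, trading some extra bytes read for fewer round trips to the object store.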
1 parent 0df77e6 commit f1ac35b

1 file changed: +4 -0 lines changed

core/src/main/scala/org/apache/spark/SparkContext.scala

Lines changed: 4 additions & 0 deletions
@@ -423,6 +423,10 @@ class SparkContext(config: SparkConf) extends Logging {
     if (!_conf.contains("spark.app.name")) {
       throw new SparkException("An application name must be set in your configuration")
     }
+    // HADOOP-19229 Vector IO on cloud storage: increase threshold for range merging
+    // We can remove this after Apache Hadoop 3.4.2 releases
+    conf.setIfMissing("spark.hadoop.fs.s3a.vectored.read.min.seek.size", "128K")
+    conf.setIfMissing("spark.hadoop.fs.s3a.vectored.read.max.merged.size", "2M")
     // This should be set as early as possible.
     SparkContext.fillMissingMagicCommitterConfsIfNeeded(_conf)
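Because the patch uses `setIfMissing`, a value the user has already set should take precedence over the new defaults. The sketch below illustrates that behavior with a standalone `SparkConf`. It assumes `spark-core` is on the classpath; `VectoredReadOverride` and the `256K` value are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.SparkConf

object VectoredReadOverride {
  def main(args: Array[String]): Unit = {
    // loadDefaults = false keeps the example independent of JVM system properties.
    val conf = new SparkConf(false)
      .set("spark.hadoop.fs.s3a.vectored.read.min.seek.size", "256K") // hypothetical user choice

    // Same calls as the patch: only keys that are still unset get filled in.
    conf.setIfMissing("spark.hadoop.fs.s3a.vectored.read.min.seek.size", "128K")
    conf.setIfMissing("spark.hadoop.fs.s3a.vectored.read.max.merged.size", "2M")

    println(conf.get("spark.hadoop.fs.s3a.vectored.read.min.seek.size"))   // prints 256K
    println(conf.get("spark.hadoop.fs.s3a.vectored.read.max.merged.size")) // prints 2M
  }
}
```

The user-supplied minimum seek size survives, while the unset maximum merged size picks up the `2M` default.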
