
Commit 70dd9c0

turboFei authored and HyukjinKwon committed
[SPARK-29542][SQL][DOC] Make the descriptions of spark.sql.files.* clear
### What changes were proposed in this pull request?

As described in [SPARK-29542](https://issues.apache.org/jira/browse/SPARK-29542), the descriptions of `spark.sql.files.*` are confusing. This PR makes their descriptions clear.

### Why are the changes needed?

It clarifies the descriptions of `spark.sql.files.*`.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing UT.

Closes #26200 from turboFei/SPARK-29542-partition-maxSize.

Authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
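For context, here is a minimal sketch (not part of this commit) of how the four `spark.sql.files.*` options documented here are typically set by a user. The app name, master, and values are illustrative assumptions, not recommendations:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: setting the spark.sql.files.* options this commit documents.
// These options are effective only for file-based sources such as Parquet,
// JSON and ORC; all values below are illustrative.
val spark = SparkSession.builder()
  .appName("files-conf-example") // hypothetical app name
  .master("local[*]")
  .config("spark.sql.files.maxPartitionBytes", 128 * 1024 * 1024) // bytes per read partition
  .config("spark.sql.files.openCostInBytes", 4 * 1024 * 1024)     // estimated cost to open a file
  .config("spark.sql.files.ignoreCorruptFiles", "true")
  .config("spark.sql.files.ignoreMissingFiles", "true")
  .getOrCreate()
```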
1 parent cbe6ead · commit 70dd9c0

File tree

1 file changed: +11 -4 lines changed
  • sql/catalyst/src/main/scala/org/apache/spark/sql/internal


sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 11 additions & 4 deletions
```diff
@@ -980,7 +980,9 @@ object SQLConf {
     .createWithDefault(true)
 
   val FILES_MAX_PARTITION_BYTES = buildConf("spark.sql.files.maxPartitionBytes")
-    .doc("The maximum number of bytes to pack into a single partition when reading files.")
+    .doc("The maximum number of bytes to pack into a single partition when reading files. " +
+      "This configuration is effective only when using file-based sources such as Parquet, JSON " +
+      "and ORC.")
     .bytesConf(ByteUnit.BYTE)
     .createWithDefault(128 * 1024 * 1024) // parquet.block.size
 
```
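As a hedged illustration of the scope the new wording spells out: lowering `spark.sql.files.maxPartitionBytes` splits a file-based scan into more, smaller partitions, while non-file sources (e.g. JDBC) ignore it entirely. The path and sizes below are assumptions:

```scala
// Assumes an active SparkSession `spark` and a Parquet dataset of roughly
// 1 GB at /tmp/events -- both illustrative.
spark.conf.set("spark.sql.files.maxPartitionBytes", 32 * 1024 * 1024) // 32 MB splits

val df = spark.read.parquet("/tmp/events")
// Expect on the order of 1024 MB / 32 MB = 32 read partitions, modulo file
// boundaries and the per-file openCostInBytes padding.
println(df.rdd.getNumPartitions)
```

For reference, Spark combines these knobs when sizing file splits roughly as `maxSplitBytes = min(maxPartitionBytes, max(openCostInBytes, totalBytes / defaultParallelism))`, where each file also contributes `openCostInBytes` to `totalBytes`.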

```diff
@@ -989,19 +991,24 @@ object SQLConf {
     .doc("The estimated cost to open a file, measured by the number of bytes could be scanned in" +
       " the same time. This is used when putting multiple files into a partition. It's better to" +
       " over estimated, then the partitions with small files will be faster than partitions with" +
-      " bigger files (which is scheduled first).")
+      " bigger files (which is scheduled first). This configuration is effective only when using" +
+      " file-based sources such as Parquet, JSON and ORC.")
     .longConf
     .createWithDefault(4 * 1024 * 1024)
 
   val IGNORE_CORRUPT_FILES = buildConf("spark.sql.files.ignoreCorruptFiles")
     .doc("Whether to ignore corrupt files. If true, the Spark jobs will continue to run when " +
-      "encountering corrupted files and the contents that have been read will still be returned.")
+      "encountering corrupted files and the contents that have been read will still be returned. " +
+      "This configuration is effective only when using file-based sources such as Parquet, JSON " +
+      "and ORC.")
     .booleanConf
     .createWithDefault(false)
 
   val IGNORE_MISSING_FILES = buildConf("spark.sql.files.ignoreMissingFiles")
     .doc("Whether to ignore missing files. If true, the Spark jobs will continue to run when " +
-      "encountering missing files and the contents that have been read will still be returned.")
+      "encountering missing files and the contents that have been read will still be returned. " +
+      "This configuration is effective only when using file-based sources such as Parquet, JSON " +
+      "and ORC.")
     .booleanConf
     .createWithDefault(false)
 
```
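And a usage note for the two ignore flags (again not part of the diff; the directory name is an assumption): with both enabled, a scan over a partly corrupt or concurrently deleted dataset returns the readable rows instead of failing the job.

```scala
// Illustrative: a directory where some Parquet files are corrupt or were
// deleted after the scan was planned.
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
spark.conf.set("spark.sql.files.ignoreMissingFiles", "true")

// Unreadable or missing files are skipped; rows read so far are still returned.
val survivors = spark.read.parquet("/tmp/mixed-quality-data").count()
println(s"rows read: $survivors")
```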
