
Commit 7bb45da

Note that 'dir/*' can be more efficient in some Hadoop FS implementations than 'dir/' (scaladoc now fixed by using the HTML entity for *)
1 parent 5452457 commit 7bb45da


1 file changed: +5 -3 lines changed


core/src/main/scala/org/apache/spark/SparkContext.scala

Lines changed: 5 additions & 3 deletions
@@ -831,7 +831,8 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
    * }}}
    *
    * @note Small files are preferred, large file is also allowable, but may cause bad performance.
-   *
+   * @note On some filesystems, `.../path/*` can be a more efficient way to read all files
+   *       in a directory rather than `.../path/` or `.../path`
    * @param minPartitions A suggestion value of the minimal splitting number for input data.
    */
   def wholeTextFiles(
@@ -878,9 +879,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
    *   (a-hdfs-path/part-nnnnn, its content)
    * }}}
    *
-   * @param minPartitions A suggestion value of the minimal splitting number for input data.
-   *
    * @note Small files are preferred; very large files may cause bad performance.
+   * @note On some filesystems, `.../path/*` can be a more efficient way to read all files
+   *       in a directory rather than `.../path/` or `.../path`
+   * @param minPartitions A suggestion value of the minimal splitting number for input data.
    */
   @Experimental
   def binaryFiles(
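
For illustration only, here is a minimal sketch of how a caller might follow the new @note when reading a whole directory. The `asGlob` helper, the object name, and the default input path are hypothetical and not part of Spark; only `wholeTextFiles` and `binaryFiles` come from the changed file. The sketch assumes `spark-core` is on the classpath and that the directory contains only files, since a `dir/*` glob matches every immediate child of the directory.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GlobReadSketch {
  // Hypothetical helper (not a Spark API): rewrite a directory path into the
  // `dir/*` glob form that the @note suggests for some filesystems.
  def asGlob(dir: String): String =
    if (dir.endsWith("/*")) dir
    else dir.stripSuffix("/") + "/*"

  def main(args: Array[String]): Unit = {
    // Hypothetical input directory; pass a real one as the first argument.
    val dir = args.headOption.getOrElse("/tmp/input/")

    val conf = new SparkConf()
      .setAppName("glob-read-sketch")
      .setMaster("local[*]") // local mode, chosen only to keep the sketch self-contained
    val sc = new SparkContext(conf)
    try {
      // RDD of (file path, file content as a String)
      val texts = sc.wholeTextFiles(asGlob(dir))
      // RDD of (file path, PortableDataStream)
      val blobs = sc.binaryFiles(asGlob(dir))
      println(s"text files: ${texts.count()}, binary files: ${blobs.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Whether the glob form is actually faster depends on the Hadoop filesystem implementation in use, as the note says ("on some filesystems"), so it is worth measuring against the plain directory path before relying on it.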
