2 files changed (+8 -8 lines changed), under
sql/core/src/main/java/org/apache/spark/sql/vectorized

ColumnVector class javadoc:

```diff
  *
  * ColumnVector supports all the data types including nested types. To handle nested types,
  * ColumnVector can have children and is a tree structure. For struct type, it stores the actual
- * data of each field in the corresponding child ColumnVector, and only store null information in
+ * data of each field in the corresponding child ColumnVector, and only stores null information in
  * the parent ColumnVector. For array type, it stores the actual array elements in the child
- * ColumnVector, and store null information, array offsets and lengths in the parent ColumnVector.
+ * ColumnVector, and stores null information, array offsets and lengths in the parent ColumnVector.
  *
  * ColumnVector is expected to be reused during the entire data loading process, to avoid allocating
  * memory again and again.
  *
- * ColumnVector is meant to maximize CPU efficiency but not to minimize storage footprint,
- * implementations should prefer computing efficiency over storage efficiency when design the
+ * ColumnVector is meant to maximize CPU efficiency but not to minimize storage footprint.
+ * Implementations should prefer computing efficiency over storage efficiency when design the
  * format. Since it is expected to reuse the ColumnVector instance while loading data, the storage
  * footprint is negligible.
  */
```
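To make the struct/array layout in the javadoc above concrete, here is a minimal, hypothetical model of an array-type column (this is illustrative only, not Spark's actual ColumnVector API): the parent vector keeps per-row null flags, offsets, and lengths, while a child vector holds the flattened elements.

```java
import java.util.Arrays;

// Hypothetical sketch of the parent/child layout for an array column;
// names and fields are invented for illustration, not taken from Spark.
public class ArrayColumnSketch {
  // Child vector: the elements of all rows' arrays, stored back to back.
  static final int[] childElements = {1, 2, 3, 4, 5, 6};

  // Parent vector: one entry per row (null flag, offset, length).
  static final boolean[] isNull  = {false, true, false};
  static final int[]     offsets = {0, 0, 2};   // start index into childElements
  static final int[]     lengths = {2, 0, 4};   // element count for that row

  // Reads row i as an int[]; a null row comes back as null.
  static int[] getArray(int i) {
    if (isNull[i]) return null;
    return Arrays.copyOfRange(childElements, offsets[i], offsets[i] + lengths[i]);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(getArray(0))); // [1, 2]
    System.out.println(getArray(1));                  // null
    System.out.println(Arrays.toString(getArray(2))); // [3, 4, 5, 6]
  }
}
```

Note that the same parent and child arrays are reused for every row read, which mirrors the javadoc's point about reusing the vector instance to avoid repeated allocation.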
ColumnarBatch class javadoc:

```diff
 import org.apache.spark.sql.types.StructType;

 /**
- * This class is a wrapper of multiple ColumnVectors and represents a logical table-like data
- * structure. It provides a row-view of this batch so that Spark can access the data row by row.
- * Instance of it is meant to be reused during the entire data loading process.
+ * This class wraps multiple ColumnVectors as a row-wise table. It provides a row view of this
+ * batch so that Spark can access the data row by row. Instance of it is meant to be reused during
+ * the entire data loading process.
  */
 public final class ColumnarBatch {
   public static final int DEFAULT_BATCH_SIZE = 4 * 1024;
@@ -79,7 +79,7 @@ public void remove() {
   }

   /**
-   * Sets the number of rows that are valid in this batch.
+   * Sets the number of rows in this batch.
    */
   public void setNumRows(int numRows) {
     assert(numRows <= this.capacity);
```
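The batch semantics touched by this hunk (a fixed-capacity container reused across loads, with setNumRows only marking how many preallocated rows are populated, bounded by capacity) can be sketched as follows. This is a hypothetical, self-contained model, not the real ColumnarBatch API; the class and column names are invented for illustration.

```java
// Invented sketch of a columnar batch with a row view; not Spark's API.
public class BatchSketch {
  final int capacity;
  final int[] idCol;        // column 0, allocated once at capacity
  final double[] scoreCol;  // column 1, allocated once at capacity
  int numRows = 0;

  BatchSketch(int capacity) {
    this.capacity = capacity;
    this.idCol = new int[capacity];
    this.scoreCol = new double[capacity];
  }

  // Mirrors the patched javadoc: sets the number of rows in this batch,
  // which may never exceed the preallocated capacity.
  void setNumRows(int numRows) {
    if (numRows > capacity) throw new IllegalArgumentException("exceeds capacity");
    this.numRows = numRows;
  }

  // Row view: reads the columnar data row by row.
  String rowAt(int i) {
    return idCol[i] + ":" + scoreCol[i];
  }

  public static void main(String[] args) {
    BatchSketch batch = new BatchSketch(4);   // allocated once, reused per load
    batch.idCol[0] = 7;  batch.scoreCol[0] = 0.5;
    batch.idCol[1] = 9;  batch.scoreCol[1] = 1.5;
    batch.setNumRows(2);
    for (int i = 0; i < batch.numRows; i++) {
      System.out.println(batch.rowAt(i)); // 7:0.5 then 9:1.5
    }
  }
}
```

The capacity check corresponds to the `assert(numRows <= this.capacity)` visible in the diff context: refilling the same batch only resets the row count, never reallocates the column storage.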