column major segment build for columnar datasource by rohityadav1993 · Pull Request #16727 · apache/pinot

rohityadav1993 · 2025-09-01T08:28:38Z

feature performance release-notes

Feature

Feature to build segment in columnar fashion for columnar input data sources like Pinot segment, Parquet files, etc.

PR split from previous previous abandoned PR: #16438 based on feedback.

Background

The requirement arises from issue: #16461:
In short, when minion refreshes a servers segment for schema or index config change. It rebuilds segment row wise. There are two bottlenecks observed:

Row wise segment stats collection for each column in SegmentIndexCreationDriverImpl

pinot/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

Line 573 in 868e2c9

void collectStatsAndIndexCreationInfo()
Row wise segment generation in SegmentIndexCreationDriverImpl

pinot/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

Line 266 in 868e2c9

while (_recordReader.hasNext()) {

Record wise generation also reprocesses transformations. This is not needed unless there is a change in transformation logic both in gather stats step and segment regeneration step. The bottlenecks can be seen in the CPU profile in the attached issue #16461

Design and Implementation

Changes

Interface ColumnReader: used to read column data from various data sources for columnar segment building
1. Sequential iteration over all values in a column using hasNext() and next()
2. Null value detection for the current value
3. Impl: PinotSegmentColumnReaderImpl
Interface ColumnReaderFactory: create column readers for various data sources such as Pinot segments, Parquet files, etc
1. Creating column readers for existing columns
2. Creating default value readers for new columns
3. Data type conversions between source and target schemas
4. Impl: PinotSegmentColumnReaderFactory
ColumnarSegmentCreationDataSource: Columnar datasource and provides wrapper for columnReaders to be used in SegmentIndexCreationDriverImpl
ColumnarSegmentPreIndexStatsContainer: stats collector for columnar datasouce to gatherStats in column manner
Existing buildByColumn method is untouched as it is solely concerned with RealtimeSegmentConverter
Handles new column addition using a new DefaultValueColumnReader abstraction and data type change by reusing

pinot/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java

Line 70 in 3211115

public void transform(GenericRow record) {

Tests

Tests for comparing data correctness for columnar build with scnearios: no change, column addition, data type change, index update.

Performance benchmark

Script: c7e128a#diff-ee55ab7e1bad4347b27fd9721b5cf253c25aa80347147242238a88e7601abfcf

Benchmark for 50 columns(25 dim, 25 metric), 1000000 rows, 3 inverted index

=== BENCHMARK RESULTS ===
Scenario             | Row-Major(ms) | Column-Major(ms) | Difference(ms) | Improvement(%)
------------------------------------------------------------------------------
data_type_change     | 49555        | 19208        | 30347        | 61.24       %
index_update         | 49958        | 19834        | 30124        | 60.30       %
new_column           | 50155        | 18989        | 31166        | 62.14       %
no_change            | 50554        | 20226        | 30328        | 59.99       %

Usage

Implement ColumnReaderFactory based on datasource. e.g. PinotSegmentColumnReaderFactory is implemented in the current PR. This can be used in Minion tasks e.g. a follow up PR(3fcac1d) will add a config in RefreshSegmentMinion Task to build in column major order.
Use it to load data from external sources e.g. Parquet files. This is unimplemented and can be added in a follow up PR. The implemetation should handle columnar reads and provide handling new column present in table, data type different in table from source data.
There will be no handling for transformation as it is a row transform operation which is not applicable in this case.

codecov-commenter · 2025-09-01T09:16:33Z

Codecov Report

❌ Patch coverage is 75.00000% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.25%. Comparing base (f3425c9) to head (bcf6997).
⚠️ Report is 63 commits behind head on master.

Files with missing lines	Patch %	Lines
...t/creator/impl/SegmentIndexCreationDriverImpl.java	71.64%	15 Missing and 4 partials ⚠️
...l/stats/ColumnarSegmentPreIndexStatsContainer.java	71.42%	13 Missing and 3 partials ⚠️
...gment/readers/PinotSegmentColumnReaderFactory.java	75.00%	7 Missing and 5 partials ⚠️
.../segment/readers/PinotSegmentColumnReaderImpl.java	69.56%	4 Missing and 3 partials ⚠️
...ment/creator/impl/SegmentColumnarIndexCreator.java	71.42%	5 Missing and 1 partial ⚠️
...segment/creator/impl/stats/StatsCollectorUtil.java	80.00%	1 Missing and 1 partial ⚠️
...ocal/segment/readers/DefaultValueColumnReader.java	88.88%	1 Missing and 1 partial ⚠️
...ent/creator/ColumnarSegmentCreationDataSource.java	92.85%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #16727      +/-   ##
============================================
+ Coverage     63.23%   63.25%   +0.01%     
+ Complexity     1429     1428       -1     
============================================
  Files          3104     3110       +6     
  Lines        183349   183718     +369     
  Branches      28096    28147      +51     
============================================
+ Hits         115942   116206     +264     
- Misses        58447    58535      +88     
- Partials       8960     8977      +17

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-11	`63.18% <75.00%> (-0.01%)`	⬇️
java-21	`63.20% <75.00%> (-0.01%)`	⬇️
temurin	`63.25% <75.00%> (+0.01%)`	⬆️
unittests	`63.24% <75.00%> (+0.01%)`	⬆️
unittests1	`56.10% <6.92%> (-0.16%)`	⬇️
unittests2	`33.68% <74.61%> (+0.09%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

rohityadav1993 · 2025-09-05T08:56:21Z

@Jackie-Jiang, could you review or assign somebody to review this.

tarun11Mavani

Added few comments.

tarun11Mavani · 2025-09-07T15:54:36Z

...in/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java

+   * @param columnReader ColumnReader for the column data
+   * @throws IOException if indexing fails
+   */
+  public void indexColumn(String columnName, ColumnReader columnReader) throws IOException {


Can add this method signature to the interface SegmentCreator.

tarun11Mavani · 2025-09-17T06:07:09Z

...ain/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderFactory.java

+
+    ColumnReader columnReader;
+
+    if (hasColumn(columnName)) {


We are skipping VirtualColumns in getAllColumnReaders but this can still be called for virtual column from somewhere else. We should skip VirtualColumns here as well.

makes sense, adding an error scenario if invoked for virtual columns

...c/main/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderImpl.java

pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/ColumnReader.java

...ocal/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java

pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/ColumnReader.java

Jackie-Jiang · 2025-09-17T20:24:02Z

pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/ColumnReader.java

+ * <p>Implementations should handle data type conversions, default values for new columns,
+ * and efficient column-wise data access patterns.
+ */
+public interface ColumnReader extends Closeable, Serializable {


This is mostly useful when reading from a column major file format such as Parquet. Can you check if this interface can be used for Parquet reader?

The interface looks good, but one will need to create a ParquetSegmentColumnReaderFactory similar to PinotSegmentColumnReaderFactory which should be able to correctly initialise a ParquetSegmentColumnReaderImpl for a given parquet file to be read.

On a high level Parquet files are analogous to a Pinot segment as collection of columns and there is a similar low level columnar reader: https://github.com/apache/parquet-java/blob/f50dd6cb4b526cf4b585993c1b69a838cd8151f3/parquet-column/src/main/java/org/apache/parquet/column/ColumnReader.java#L40

I do plan to implement this in future, but not an immediate priority. I can create an issue for this as follow up.

Jackie-Jiang · 2025-09-17T20:24:53Z

pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/ColumnReader.java

+   *
+   * @return true if the current value is null, false otherwise
+   */
+  boolean isNull();


Why do we need this when next() can return null?

Can be handled in next(), let me update. Originally, idea was to add custom null check handling in isNull(), e.g. "null" string --> null

...c/main/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderImpl.java

tarun11Mavani

LGTM

rohityadav1993 · 2025-09-29T07:16:02Z

Offline discussion with @Jackie-Jiang :

Remove any transformation including data type which is not part of schema/index update rather ingestion transformation.
Follow up: Build columnar ingestion transformation and later improve columnar build to support transformation. Relates to requirements in Pinot and Arrow Integration #16643

rohityadav1993 · 2025-09-29T07:16:13Z

@Jackie-Jiang, this is ready to review.

Jackie-Jiang

The overall approach looks good!

@KKcorps @noob-se7en Can you help review this?

Jackie-Jiang · 2025-10-06T23:22:20Z

...n/java/org/apache/pinot/segment/local/segment/creator/ColumnarSegmentCreationDataSource.java

+   * @param columnReaders Map of column name to ColumnReader instances
+   */
+    public ColumnarSegmentCreationDataSource(ColumnReaderFactory columnReaderFactory,
+                                          Map<String, ColumnReader> columnReaders) {


(code format) Can you reformat all the changes with Pinot Style

noob-se7en

mostly looks good, Added some minor comments. Will take another look tomorrow.

pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/ColumnReaderFactory.java

...ain/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderFactory.java

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

krishan1390 · 2025-10-13T05:21:09Z

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

+    _columnStatsCollectorMap = new HashMap<>();
+    _totalDocCount = -1; // indicates unset
+
+    initializeStatsCollectors();


initializeStatsCollectors and collectColumnStats needn't be called in constructor right ?

initializeStatsCollectors should be called in init() method ?

and similarly collectColumnStats in build() method ?

There isn't a requirement for lazy loading(initializeStatsCollectors) by invoking init separately.
Column stats are collected in build method, buildColumnar()

Its cleaner and more easier to maintain to keep constructor light and have the methods do what they're supposed to.

updating, I don't see any benefit in delaying the initialization by calling it immediately after object creation. These are not service start up objects but more to be used in the segment generation flow where there is not much benefit by split contructor logic into init method and calling it immediately.

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

krishan1390 · 2025-10-13T05:24:55Z

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

+        continue;
+      }
+
+      String columnName = fieldSpec.getName();


we can probably abstract the common logic in SegmentPreIndexStatsCollectorImpl and this class to populate _columnStatsCollectorMap to avoid redundant code.

Also SegmentPreIndexStatsCollectorImpl has added support for NoDictColumnStatisticsCollector which needs to be added here too - https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/SegmentPreIndexStatsCollectorImpl.java#L61

SegmentPreIndexStatsCollectorImpl is heavily assumes row major iteration using method collectRow()

We are now using buildByColumn as default for mutable segment to immutable segment conversion and this is a new addition buildColumnar for immutable datasource to immutableSegment conversion in column major order so the above should get phased out after we are completely columnar with support for transformations.

yes. but avoiding redundant code will be helpful to maintain the code until it exists and keep them consistent for common logic. eg: NoDictColumnStatisticsCollector was missing in this class.

@rohityadav1993 we should do this change before merging the PR

cc @KKcorps @noob-se7en

Aligned on reducing redundancy, extracting the common logic in a class called StatsCollectorUtils.

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

krishan1390 · 2025-10-13T05:27:46Z

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java

+
+  @Override
+  public void logStats() {
+    LOGGER.info("Columnar segment stats collection completed for {} columns with {} documents",


we should log the full stats here. same comment as earlier to refactor common code of this class and SegmentPreIndexStatsCollectorImpl to get same benefits without duplicate code.

column stats get persisted as segment metadata in segment file.

segment metadata won't be available on minion hosts. and might be updated on server if the segment is refreshed multiple times. and will be unavailable if segment is deleted ,etc

for retrospective debugging in such cases, logging the stats will be helpful. its currently done in SegmentPreIndexStatsCollectorImpl

@rohityadav1993 this needs to be done

krishan1390 · 2025-10-13T06:07:32Z

...ain/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderFactory.java

+ *   <li>Resource management for all created readers</li>
+ * </ul>
+ */
+public class PinotSegmentColumnReaderFactory implements ColumnReaderFactory {


I think this class / interface needs to be refactored a bit

init() shouldn't take targetSchema as that can be passed in constructor itself.

getAllColumnReaders() calls createColumnReader() and both are public which isn't ideal. Rather init() should call createColumnReader() and createColumnReader() should be private. and getAllColumnReaders() should just get existing objects without creating new objects

init is taking schema as it is trying to standardise how factory should be initialised. One is free to use any constructor signature needed. e.g. if data source is segment, the constructor is PinotSegmentColumnReaderFactory(IndexSegment indexSegment) and in case it was parquet, it can be ParquetColumnReaderFactory(File parquetFile)

renaming getAllColumnReaders() to createAllColumnReaders()

I don't think createAllColumnReaders() should be public.

Rather init() should call createColumnReader() and createColumnReader() should be private. and getAllColumnReaders() should just get existing objects without creating new objects

Revisited the interface for ColumnReaderFactory. There will be no create* methods exposed. The getMethods will return cached Readers and in future if needed one can add create methods if every time a new object needs to be created from the Factory. Same documentation in the interface.

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

krishan1390 · 2025-10-13T06:11:14Z

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

@@ -209,7 +249,7 @@ public void init(SegmentGeneratorConfig config, SegmentCreationDataSource dataSo
    }

    _ingestionSchemaValidator =
-        SchemaValidatorFactory.getSchemaValidator(_dataSchema, _recordReader.getClass().getName(),
+        SchemaValidatorFactory.getSchemaValidator(_dataSchema, readerClassName,


we're relying on getSchemaValidator to support null readerClassName which isn't ideal.

can we think of a more appropriate way to handle this ?

I found no use of getIngestionSchemaValidator() outside test classes. There is no change being made in getSchemaValidator() which is creating a SchemaValidator

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

...st/java/org/apache/pinot/segment/local/segment/creator/impl/ColumnarSegmentBuildingTest.java

...c/main/java/org/apache/pinot/segment/local/segment/readers/PinotSegmentColumnReaderImpl.java

...ava/org/apache/pinot/segment/local/segment/creator/impl/ColumnarSegmentBuildingTestBase.java

...st/java/org/apache/pinot/segment/local/segment/creator/impl/ColumnarSegmentBuildingTest.java

...ava/org/apache/pinot/segment/local/segment/creator/impl/ColumnarSegmentBuildingTestBase.java

...st/java/org/apache/pinot/segment/local/segment/creator/impl/ColumnarSegmentBuildingTest.java

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

rohityadav1993 · 2025-10-16T17:58:12Z

@noob-se7en, @KKcorps can you take a look again, all the review comments are addressed.

noob-se7en

LGTM

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java

krishan1390 · 2025-10-28T07:07:13Z

@rohityadav1993 whats the plan for the open comments and failing tests and merge conflicts ?

rohityadav1993 · 2025-10-31T16:25:51Z

Merge conflicts resolved and comments addressed.

rohityadav1993 · 2025-10-31T16:29:54Z

@Jackie-Jiang , @KKcorps please review and help merge this soon.

noob-se7en · 2025-11-04T06:48:59Z

@krishan1390 pls approve it.
And @swaminathanmanish pls help merging this PR after that.

rohityadav1993 changed the title ~~Columnar segment build clean~~ build segment columnarly from a column major datasource Sep 2, 2025

rohityadav1993 force-pushed the columnar-segment-build-clean branch from 2d4eb54 to b9dd171 Compare September 3, 2025 06:34

rohityadav1993 commented Sep 3, 2025

View reviewed changes

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java Outdated Show resolved Hide resolved

rohityadav1993 commented Sep 3, 2025

View reviewed changes

...he/pinot/segment/local/segment/creator/impl/stats/ColumnarSegmentPreIndexStatsContainer.java Outdated Show resolved Hide resolved

rohityadav1993 commented Sep 3, 2025

View reviewed changes

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java Outdated Show resolved Hide resolved

rohityadav1993 changed the title ~~build segment columnarly from a column major datasource~~ column major segment build for columnar datasource Sep 3, 2025

rohityadav1993 marked this pull request as ready for review September 3, 2025 09:25

rohityadav1993 mentioned this pull request Sep 3, 2025

add option for column major build for refresh segment task #16438

Closed

Jackie-Jiang added feature ingestion labels Sep 3, 2025

Jackie-Jiang added release-notes Referenced by PRs that need attention when compiling the next release notes performance labels Sep 16, 2025

tarun11Mavani reviewed Sep 17, 2025

View reviewed changes

Jackie-Jiang reviewed Sep 17, 2025

View reviewed changes

tarun11Mavani approved these changes Sep 22, 2025

View reviewed changes

rohityadav1993 force-pushed the columnar-segment-build-clean branch from bb0337e to e68a987 Compare September 30, 2025 16:31

Jackie-Jiang reviewed Oct 6, 2025

View reviewed changes

noob-se7en reviewed Oct 8, 2025

View reviewed changes

krishan1390 suggested changes Oct 13, 2025

View reviewed changes

rohityadav1993 force-pushed the columnar-segment-build-clean branch 4 times, most recently from 7d88e80 to 03d189c Compare October 16, 2025 17:50

noob-se7en approved these changes Oct 17, 2025

View reviewed changes

...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java Show resolved Hide resolved

rohityadav1993 added 6 commits October 31, 2025 13:58

build segment columnarly from a column major datasource

80457ce

resolve review comments 1/2

9e3aa21

resolve review comments 2/2

be3910b

always close column readers

7365bdd

refactor ColumnReaderFactory

3962308

resolved all review comments

bcf6997

rohityadav1993 force-pushed the columnar-segment-build-clean branch from 59b2136 to bcf6997 Compare October 31, 2025 14:22

chenboat requested a review from Jackie-Jiang November 3, 2025 18:48

xiangfu0 approved these changes Nov 4, 2025

View reviewed changes

xiangfu0 merged commit 13da6e3 into apache:master Nov 4, 2025
18 checks passed

krishan1390 mentioned this pull request Nov 10, 2025

Add interfaces to get partition columns and partition values from column value #17171

Merged

Conversation

rohityadav1993 commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Feature

Background

Design and Implementation

Tests

Performance benchmark

Usage

Uh oh!

codecov-commenter commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rohityadav1993 commented Sep 5, 2025

Uh oh!

tarun11Mavani left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tarun11Mavani left a comment

Choose a reason for hiding this comment

Uh oh!

rohityadav1993 commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rohityadav1993 commented Sep 29, 2025

Uh oh!

Jackie-Jiang left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

noob-se7en left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rohityadav1993 commented Sep 1, 2025 •

edited

Loading

codecov-commenter commented Sep 1, 2025 •

edited

Loading

rohityadav1993 commented Sep 29, 2025 •

edited

Loading