Skip to content

column major segment build for columnar datasource#16727

Merged
xiangfu0 merged 6 commits intoapache:masterfrom
rohityadav1993:columnar-segment-build-clean
Nov 4, 2025
Merged

column major segment build for columnar datasource#16727
xiangfu0 merged 6 commits intoapache:masterfrom
rohityadav1993:columnar-segment-build-clean

Conversation

@rohityadav1993
Copy link
Contributor

@rohityadav1993 rohityadav1993 commented Sep 1, 2025

feature performance release-notes

Feature

Feature to build segment in columnar fashion for columnar input data sources like Pinot segment, Parquet files, etc.

PR split from previous previous abandoned PR: #16438 based on feedback.

Background

The requirement arises from issue: #16461:
In short, when minion refreshes a servers segment for schema or index config change. It rebuilds segment row wise. There are two bottlenecks observed:

  1. Row wise segment stats collection for each column in SegmentIndexCreationDriverImpl
  2. Row wise segment generation in SegmentIndexCreationDriverImpl

Record wise generation also reprocesses transformations. This is not needed unless there is a change in transformation logic both in gather stats step and segment regeneration step. The bottlenecks can be seen in the CPU profile in the attached issue #16461

Design and Implementation

Changes

  1. Interface ColumnReader: used to read column data from various data sources for columnar segment building
    1. Sequential iteration over all values in a column using hasNext() and next()
    2. Null value detection for the current value
    3. Impl: PinotSegmentColumnReaderImpl
  2. Interface ColumnReaderFactory: create column readers for various data sources such as Pinot segments, Parquet files, etc
    1. Creating column readers for existing columns
    2. Creating default value readers for new columns
    3. Data type conversions between source and target schemas
    4. Impl: PinotSegmentColumnReaderFactory
  3. ColumnarSegmentCreationDataSource: Columnar datasource and provides wrapper for columnReaders to be used in SegmentIndexCreationDriverImpl
  4. ColumnarSegmentPreIndexStatsContainer: stats collector for columnar datasouce to gatherStats in column manner
  5. Existing buildByColumn method is untouched as it is solely concerned with RealtimeSegmentConverter
  6. Handles new column addition using a new DefaultValueColumnReader abstraction and data type change by reusing

Tests

Tests for comparing data correctness for columnar build with scnearios: no change, column addition, data type change, index update.

Performance benchmark

Script: c7e128a#diff-ee55ab7e1bad4347b27fd9721b5cf253c25aa80347147242238a88e7601abfcf

Benchmark for 50 columns(25 dim, 25 metric), 1000000 rows, 3 inverted index

=== BENCHMARK RESULTS ===
Scenario             | Row-Major(ms) | Column-Major(ms) | Difference(ms) | Improvement(%)
------------------------------------------------------------------------------
data_type_change     | 49555        | 19208        | 30347        | 61.24       %
index_update         | 49958        | 19834        | 30124        | 60.30       %
new_column           | 50155        | 18989        | 31166        | 62.14       %
no_change            | 50554        | 20226        | 30328        | 59.99       %

Usage

  1. Implement ColumnReaderFactory based on datasource. e.g. PinotSegmentColumnReaderFactory is implemented in the current PR. This can be used in Minion tasks e.g. a follow up PR(3fcac1d) will add a config in RefreshSegmentMinion Task to build in column major order.
  2. Use it to load data from external sources e.g. Parquet files. This is unimplemented and can be added in a follow up PR. The implemetation should handle columnar reads and provide handling new column present in table, data type different in table from source data.
  3. There will be no handling for transformation as it is a row transform operation which is not applicable in this case.

@codecov-commenter
Copy link

codecov-commenter commented Sep 1, 2025

Codecov Report

❌ Patch coverage is 75.00000% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.25%. Comparing base (f3425c9) to head (bcf6997).
⚠️ Report is 63 commits behind head on master.

Files with missing lines Patch % Lines
...t/creator/impl/SegmentIndexCreationDriverImpl.java 71.64% 15 Missing and 4 partials ⚠️
...l/stats/ColumnarSegmentPreIndexStatsContainer.java 71.42% 13 Missing and 3 partials ⚠️
...gment/readers/PinotSegmentColumnReaderFactory.java 75.00% 7 Missing and 5 partials ⚠️
.../segment/readers/PinotSegmentColumnReaderImpl.java 69.56% 4 Missing and 3 partials ⚠️
...ment/creator/impl/SegmentColumnarIndexCreator.java 71.42% 5 Missing and 1 partial ⚠️
...segment/creator/impl/stats/StatsCollectorUtil.java 80.00% 1 Missing and 1 partial ⚠️
...ocal/segment/readers/DefaultValueColumnReader.java 88.88% 1 Missing and 1 partial ⚠️
...ent/creator/ColumnarSegmentCreationDataSource.java 92.85% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16727      +/-   ##
============================================
+ Coverage     63.23%   63.25%   +0.01%     
+ Complexity     1429     1428       -1     
============================================
  Files          3104     3110       +6     
  Lines        183349   183718     +369     
  Branches      28096    28147      +51     
============================================
+ Hits         115942   116206     +264     
- Misses        58447    58535      +88     
- Partials       8960     8977      +17     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.18% <75.00%> (-0.01%) ⬇️
java-21 63.20% <75.00%> (-0.01%) ⬇️
temurin 63.25% <75.00%> (+0.01%) ⬆️
unittests 63.24% <75.00%> (+0.01%) ⬆️
unittests1 56.10% <6.92%> (-0.16%) ⬇️
unittests2 33.68% <74.61%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@rohityadav1993 rohityadav1993 changed the title Columnar segment build clean build segment columnarly from a column major datasource Sep 2, 2025
@rohityadav1993 rohityadav1993 force-pushed the columnar-segment-build-clean branch from 2d4eb54 to b9dd171 Compare September 3, 2025 06:34
@rohityadav1993 rohityadav1993 changed the title build segment columnarly from a column major datasource column major segment build for columnar datasource Sep 3, 2025
@rohityadav1993 rohityadav1993 marked this pull request as ready for review September 3, 2025 09:25
@rohityadav1993
Copy link
Contributor Author

@Jackie-Jiang, could you review or assign somebody to review this.

@Jackie-Jiang Jackie-Jiang added release-notes Referenced by PRs that need attention when compiling the next release notes performance labels Sep 16, 2025
Copy link
Contributor

@tarun11Mavani tarun11Mavani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added few comments.

* @param columnReader ColumnReader for the column data
* @throws IOException if indexing fails
*/
public void indexColumn(String columnName, ColumnReader columnReader) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can add this method signature to the interface SegmentCreator.


ColumnReader columnReader;

if (hasColumn(columnName)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are skipping VirtualColumns in getAllColumnReaders but this can still be called for virtual column from somewhere else. We should skip VirtualColumns here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, adding an error scenario if invoked for virtual columns

* <p>Implementations should handle data type conversions, default values for new columns,
* and efficient column-wise data access patterns.
*/
public interface ColumnReader extends Closeable, Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly useful when reading from a column major file format such as Parquet. Can you check if this interface can be used for Parquet reader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface looks good, but one will need to create a ParquetSegmentColumnReaderFactory similar to PinotSegmentColumnReaderFactory which should be able to correctly initialise a ParquetSegmentColumnReaderImpl for a given parquet file to be read.

On a high level Parquet files are analogous to a Pinot segment as collection of columns and there is a similar low level columnar reader: https://github.com/apache/parquet-java/blob/f50dd6cb4b526cf4b585993c1b69a838cd8151f3/parquet-column/src/main/java/org/apache/parquet/column/ColumnReader.java#L40

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do plan to implement this in future, but not an immediate priority. I can create an issue for this as follow up.

*
* @return true if the current value is null, false otherwise
*/
boolean isNull();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this when next() can return null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be handled in next(), let me update. Originally, idea was to add custom null check handling in isNull(), e.g. "null" string --> null

Copy link
Contributor

@tarun11Mavani tarun11Mavani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rohityadav1993
Copy link
Contributor Author

rohityadav1993 commented Sep 29, 2025

Offline discussion with @Jackie-Jiang :

  • Remove any transformation including data type which is not part of schema/index update rather ingestion transformation.
  • Follow up: Build columnar ingestion transformation and later improve columnar build to support transformation. Relates to requirements in Pinot and Arrow Integration #16643

@rohityadav1993
Copy link
Contributor Author

@Jackie-Jiang, this is ready to review.

@rohityadav1993 rohityadav1993 force-pushed the columnar-segment-build-clean branch from bb0337e to e68a987 Compare September 30, 2025 16:31
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall approach looks good!

@KKcorps @noob-se7en Can you help review this?

* @param columnReaders Map of column name to ColumnReader instances
*/
public ColumnarSegmentCreationDataSource(ColumnReaderFactory columnReaderFactory,
Map<String, ColumnReader> columnReaders) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(code format) Can you reformat all the changes with Pinot Style

Copy link
Contributor

@noob-se7en noob-se7en left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good, Added some minor comments. Will take another look tomorrow.

_columnStatsCollectorMap = new HashMap<>();
_totalDocCount = -1; // indicates unset

initializeStatsCollectors();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initializeStatsCollectors and collectColumnStats needn't be called in constructor right ?

initializeStatsCollectors should be called in init() method ?

and similarly collectColumnStats in build() method ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't a requirement for lazy loading(initializeStatsCollectors) by invoking init separately.
Column stats are collected in build method, buildColumnar()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its cleaner and more easier to maintain to keep constructor light and have the methods do what they're supposed to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updating, I don't see any benefit in delaying the initialization by calling it immediately after object creation. These are not service start up objects but more to be used in the segment generation flow where there is not much benefit by split contructor logic into init method and calling it immediately.

continue;
}

String columnName = fieldSpec.getName();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably abstract the common logic in SegmentPreIndexStatsCollectorImpl and this class to populate _columnStatsCollectorMap to avoid redundant code.

Also SegmentPreIndexStatsCollectorImpl has added support for NoDictColumnStatisticsCollector which needs to be added here too - https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/stats/SegmentPreIndexStatsCollectorImpl.java#L61

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SegmentPreIndexStatsCollectorImpl is heavily assumes row major iteration using method collectRow()

We are now using buildByColumn as default for mutable segment to immutable segment conversion and this is a new addition buildColumnar for immutable datasource to immutableSegment conversion in column major order so the above should get phased out after we are completely columnar with support for transformations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. but avoiding redundant code will be helpful to maintain the code until it exists and keep them consistent for common logic. eg: NoDictColumnStatisticsCollector was missing in this class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rohityadav1993 we should do this change before merging the PR

cc @KKcorps @noob-se7en

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligned on reducing redundancy, extracting the common logic in a class called StatsCollectorUtils.


@Override
public void logStats() {
LOGGER.info("Columnar segment stats collection completed for {} columns with {} documents",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should log the full stats here. same comment as earlier to refactor common code of this class and SegmentPreIndexStatsCollectorImpl to get same benefits without duplicate code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column stats get persisted as segment metadata in segment file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

segment metadata won't be available on minion hosts. and might be updated on server if the segment is refreshed multiple times. and will be unavailable if segment is deleted ,etc

for retrospective debugging in such cases, logging the stats will be helpful. its currently done in SegmentPreIndexStatsCollectorImpl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rohityadav1993 this needs to be done

* <li>Resource management for all created readers</li>
* </ul>
*/
public class PinotSegmentColumnReaderFactory implements ColumnReaderFactory {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class / interface needs to be refactored a bit

  1. init() shouldn't take targetSchema as that can be passed in constructor itself.
  2. getAllColumnReaders() calls createColumnReader() and both are public which isn't ideal. Rather init() should call createColumnReader() and createColumnReader() should be private. and getAllColumnReaders() should just get existing objects without creating new objects

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init is taking schema as it is trying to standardise how factory should be initialised. One is free to use any constructor signature needed. e.g. if data source is segment, the constructor is PinotSegmentColumnReaderFactory(IndexSegment indexSegment) and in case it was parquet, it can be ParquetColumnReaderFactory(File parquetFile)

renaming getAllColumnReaders() to createAllColumnReaders()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think createAllColumnReaders() should be public.

Rather init() should call createColumnReader() and createColumnReader() should be private. and getAllColumnReaders() should just get existing objects without creating new objects

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revisited the interface for ColumnReaderFactory. There will be no create* methods exposed. The getMethods will return cached Readers and in future if needed one can add create methods if every time a new object needs to be created from the Factory. Same documentation in the interface.

@@ -209,7 +249,7 @@ public void init(SegmentGeneratorConfig config, SegmentCreationDataSource dataSo
}

_ingestionSchemaValidator =
SchemaValidatorFactory.getSchemaValidator(_dataSchema, _recordReader.getClass().getName(),
SchemaValidatorFactory.getSchemaValidator(_dataSchema, readerClassName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're relying on getSchemaValidator to support null readerClassName which isn't ideal.

can we think of a more appropriate way to handle this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found no use of getIngestionSchemaValidator() outside test classes. There is no change being made in getSchemaValidator() which is creating a SchemaValidator

@rohityadav1993 rohityadav1993 force-pushed the columnar-segment-build-clean branch 4 times, most recently from 7d88e80 to 03d189c Compare October 16, 2025 17:50
@rohityadav1993
Copy link
Contributor Author

@noob-se7en, @KKcorps can you take a look again, all the review comments are addressed.

Copy link
Contributor

@noob-se7en noob-se7en left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krishan1390
Copy link
Contributor

@rohityadav1993 whats the plan for the open comments and failing tests and merge conflicts ?

@rohityadav1993 rohityadav1993 force-pushed the columnar-segment-build-clean branch from 59b2136 to bcf6997 Compare October 31, 2025 14:22
@rohityadav1993
Copy link
Contributor Author

Merge conflicts resolved and comments addressed.

@rohityadav1993
Copy link
Contributor Author

@Jackie-Jiang , @KKcorps please review and help merge this soon.

@chenboat chenboat requested a review from Jackie-Jiang November 3, 2025 18:48
@noob-se7en
Copy link
Contributor

@krishan1390 pls approve it.
And @swaminathanmanish pls help merging this PR after that.

@xiangfu0 xiangfu0 merged commit 13da6e3 into apache:master Nov 4, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature ingestion performance release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants