Skip to content

Conversation

jaystarshot
Copy link
Contributor

@jaystarshot jaystarshot commented Sep 19, 2025

Fixes #15071

This PR implements dictionary based rowgroup skipping for parquet

Presto has row group skipping for dictionary encoded pages (link). This is very efficient as this skips entire rowgroups if the dictionary values don't match the domain and metadata filters.

The first page of a dictionary encoded column is always a dictionary page in every rowgroup

Without this feature, some of our java traffic migration was blocked due to cluster slowless and load.
With this In production we see > 8x CPU improvements and > 3x rows read decrease for relevant queries

Copy link

netlify bot commented Sep 19, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 2706c36
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68e573cff512770008bca01c

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 19, 2025
@jaystarshot jaystarshot changed the title feat: support parquet dictionary filter based rowgroup skipping feat: support parquet dictionary filter rowgroup skipping Sep 19, 2025
@jaystarshot jaystarshot changed the title feat: support parquet dictionary filter rowgroup skipping feat: support parquet dictionary filter rowgroup skipping (Plain_dictionary) encoding Sep 19, 2025
@jaystarshot jaystarshot changed the title feat: support parquet dictionary filter rowgroup skipping (Plain_dictionary) encoding feat: support parquet dictionary filter rowgroup skipping Sep 22, 2025
@jaystarshot jaystarshot force-pushed the jay-oss-dict-filter branch 2 times, most recently from 81228f9 to bf0a6e0 Compare September 24, 2025 17:42
@jaystarshot jaystarshot changed the title feat: support parquet dictionary filter rowgroup skipping feat[parquet]: implement dictionary based rowgroup skipping for dictionary encoded data Sep 24, 2025
@jaystarshot jaystarshot changed the title feat[parquet]: implement dictionary based rowgroup skipping for dictionary encoded data feat[parquet]: Implement dictionary based rowgroup skipping for dictionary encoded data Sep 24, 2025
@jaystarshot jaystarshot force-pushed the jay-oss-dict-filter branch 2 times, most recently from 762ec1f to 4ca36e8 Compare September 24, 2025 17:45
@jaystarshot jaystarshot marked this pull request as ready for review September 24, 2025 17:45
@jaystarshot
Copy link
Contributor Author

jaystarshot commented Sep 24, 2025

@majetideepak , @Yuhta Can you please have a first pass at this, this PR is ready, I am just figuring out a way to setup testing for opensource since our test files are internal

return false;
}

bool ParquetData::testFilterAgainstDictionary(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main change which reads the first page of column chunk and applies filters

@jaystarshot jaystarshot changed the title feat[parquet]: Implement dictionary based rowgroup skipping for dictionary encoded data feat: Implement dictionary based rowgroup skipping for dictionary encoded data Oct 7, 2025
@jaystarshot
Copy link
Contributor Author

jaystarshot commented Oct 7, 2025

@majetideepak @Yuhta @pedroerp I have added a test case which fails without the changes (number of rows are > 0).
We have also extensively tested this change using shadow rowgroup skipping (signal when skipped and fail query if num rows processed are > 0) in production

@jaystarshot jaystarshot force-pushed the jay-oss-dict-filter branch 2 times, most recently from 6168313 to 3cfd907 Compare October 7, 2025 05:02
Copy link
Contributor

@Yuhta Yuhta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably mainly because we are missing SelectiveIntegerDictionaryColumnReader and SelectiveStringDictionaryColumnReader in Parquet like we have in DWRF. Basically we apply the actual filter on the dictionary values and cache the results. It would be nice if we implement the actual filter instead of just statistic filters.

subfieldSpecs.clear();
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just revert these white space change?

@jaystarshot
Copy link
Contributor Author

Thanks @Yuhta for the review yeah i think that should be done for non skipped rowgroups

@jaystarshot jaystarshot force-pushed the jay-oss-dict-filter branch 3 times, most recently from 47181f9 to 01cd836 Compare October 7, 2025 20:09
Copy link
Collaborator

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are mostly code writing comments. Adding Ying for a review of the logic.


std::optional<int64_t> ColumnChunkMetaDataPtr::bloom_filter_offset() const {
if (hasBloomFilter()) {
return thriftColumnChunkPtr(ptr_)->meta_data.bloom_filter_offset;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the offset be 0 ? If offset is 0 would it be preferred to return nullopt instead ?

thriftColumnChunkPtr(ptr_)->meta_data.__isset.bloom_filter_offset;
}

std::optional<int64_t> ColumnChunkMetaDataPtr::bloom_filter_offset() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : function name should be camel case bloomFilterOffset()

return thriftColumnChunkPtr(ptr_)->meta_data.encoding_stats;
}

const std::vector<thrift::Encoding::type>& ColumnChunkMetaDataPtr::getEncoding()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : Function name getEncodings()

auto parquetData =
std::make_unique<ParquetData>(type, metaData_, pool(), sessionTimezone_);
// Set the BufferedInput if available
if (bufferedInput_) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this related to your change ? Can this be an independent change ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related, we need bufferInput_ to read the first page

#include "velox/dwio/common/BufferedInput.h"
#include "velox/dwio/parquet/reader/ParquetStatsContext.h"

#include <thrift/protocol/TCompactProtocol.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these headers before the velox ones.

const ColumnChunkMetaDataPtr& columnChunk) {
// Use existing stream if available
if (rowGroupId < streams_.size() && streams_[rowGroupId]) {
return std::move(streams_[rowGroupId]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need std::move() here ?

Though if streams_ is a variable owned by this class then its not a good idea to return this unique_ptr as it will move the ownership from the class to a local variable in the caller of this function.


bool ParquetData::testFilterAgainstDictionary(
uint32_t rowGroupId,
const common::Filter* filter,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a pointer and not a reference ?


auto dictionaryPtr = readDictionaryPageForFiltering(rowGroupId, columnChunk);
if (!dictionaryPtr->values || dictionaryPtr->numValues == 0) {
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this return true and not false ?

}

// Helper methods for EncodingStats analysis (like Java Presto)
bool ParquetData::hasDictionaryPages(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods needn't be class methods. They can be added in an anonymous namespace.


bool ParquetData::hasNonDictionaryEncodedPages(
const std::vector<thrift::PageEncodingStats>& stats) {
for (const auto& pageStats : stats) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. This can be a method in an anonymous namespace.

@aditi-pandit
Copy link
Collaborator

@yingsu00 : Please review. I'm not able to add you as a reviewer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Parquet] Implement Dictionary based rowgroup skipping for dictionary encoded data
3 participants