-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat: Implement dictionary based rowgroup skipping for dictionary encoded data #14907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Implement dictionary based rowgroup skipping for dictionary encoded data #14907
Conversation
✅ Deploy Preview for meta-velox canceled.
|
81228f9
to
bf0a6e0
Compare
762ec1f
to
4ca36e8
Compare
@majetideepak , @Yuhta Can you please have a first pass at this, this PR is ready, I am just figuring out a way to setup testing for opensource since our test files are internal |
4ca36e8
to
1aa7915
Compare
return false; | ||
} | ||
|
||
bool ParquetData::testFilterAgainstDictionary( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main change which reads the first page of column chunk and applies filters
@majetideepak @Yuhta @pedroerp I have added a test case which fails without the changes (number of rows are > 0). |
6168313
to
3cfd907
Compare
3cfd907
to
b5b46c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably mainly because we are missing SelectiveIntegerDictionaryColumnReader and SelectiveStringDictionaryColumnReader in Parquet like we have in DWRF. Basically we apply the actual filter on the dictionary values and cache the results. It would be nice if we implement the actual filter instead of just statistic filters.
subfieldSpecs.clear(); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just revert these white space change?
Thanks @Yuhta for the review yeah i think that should be done for non skipped rowgroups |
47181f9
to
01cd836
Compare
01cd836
to
319426a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are mostly code writing comments. Adding Ying for a review of the logic.
|
||
std::optional<int64_t> ColumnChunkMetaDataPtr::bloom_filter_offset() const { | ||
if (hasBloomFilter()) { | ||
return thriftColumnChunkPtr(ptr_)->meta_data.bloom_filter_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the offset be 0 ? If offset is 0 would it be preferred to return nullopt instead ?
thriftColumnChunkPtr(ptr_)->meta_data.__isset.bloom_filter_offset; | ||
} | ||
|
||
std::optional<int64_t> ColumnChunkMetaDataPtr::bloom_filter_offset() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit : function name should be camel case bloomFilterOffset()
return thriftColumnChunkPtr(ptr_)->meta_data.encoding_stats; | ||
} | ||
|
||
const std::vector<thrift::Encoding::type>& ColumnChunkMetaDataPtr::getEncoding() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit : Function name getEncodings()
auto parquetData = | ||
std::make_unique<ParquetData>(type, metaData_, pool(), sessionTimezone_); | ||
// Set the BufferedInput if available | ||
if (bufferedInput_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this related to your change ? Can this be an independent change ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related, we need bufferInput_ to read the first page
#include "velox/dwio/common/BufferedInput.h" | ||
#include "velox/dwio/parquet/reader/ParquetStatsContext.h" | ||
|
||
#include <thrift/protocol/TCompactProtocol.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move these headers before the velox ones.
const ColumnChunkMetaDataPtr& columnChunk) { | ||
// Use existing stream if available | ||
if (rowGroupId < streams_.size() && streams_[rowGroupId]) { | ||
return std::move(streams_[rowGroupId]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need std::move() here ?
Though if streams_ is a variable owned by this class then its not a good idea to return this unique_ptr as it will move the ownership from the class to a local variable in the caller of this function.
|
||
bool ParquetData::testFilterAgainstDictionary( | ||
uint32_t rowGroupId, | ||
const common::Filter* filter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a pointer and not a reference ?
|
||
auto dictionaryPtr = readDictionaryPageForFiltering(rowGroupId, columnChunk); | ||
if (!dictionaryPtr->values || dictionaryPtr->numValues == 0) { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this return true and not false ?
} | ||
|
||
// Helper methods for EncodingStats analysis (like Java Presto) | ||
bool ParquetData::hasDictionaryPages( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods needn't be class methods. They can be added in an anonymous namespace.
|
||
bool ParquetData::hasNonDictionaryEncodedPages( | ||
const std::vector<thrift::PageEncodingStats>& stats) { | ||
for (const auto& pageStats : stats) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. This can be a method in an anonymous namespace.
@yingsu00 : Please review. I'm not able to add you as a reviewer. |
Fixes #15071
This PR implements dictionary based rowgroup skipping for parquet
Presto has row group skipping for dictionary encoded pages (link). This is very efficient as this skips entire rowgroups if the dictionary values don't match the domain and metadata filters.
The first page of a dictionary encoded column is always a dictionary page in every rowgroup
Without this feature, some of our java traffic migration was blocked due to cluster slowless and load.
With this In production we see > 8x CPU improvements and > 3x rows read decrease for relevant queries