Add segment end criteria check for SVForwardIndex and Dictionary#15120
Add segment end criteria check for SVForwardIndex and Dictionary#15120Jackie-Jiang merged 7 commits intoapache:masterfrom
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15120 +/- ##
============================================
+ Coverage 62.78% 62.86% +0.07%
- Complexity 1382 1384 +2
============================================
Files 2863 2866 +3
Lines 162780 163287 +507
Branches 24915 24936 +21
============================================
+ Hits 102203 102648 +445
- Misses 52871 52922 +51
- Partials 7706 7717 +11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
e4f83f7 to
d00eaef
Compare
|
@noob-se7en @KKcorps Can you help take a look? |
...-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableDictionary.java
Outdated
Show resolved
Hide resolved
...ache/pinot/segment/local/segment/index/forward/mutable/VarByteSVMutableForwardIndexTest.java
Outdated
Show resolved
Hide resolved
Co-authored-by: NOOB <43700604+noob-se7en@users.noreply.github.com>
|
@noob-se7en resolved you comments, could you pls review again? Thanks |
...-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableDictionary.java
Outdated
Show resolved
Hide resolved
| // the valid doc ids won't be updated. | ||
| private final ThreadSafeMutableRoaringBitmap _validDocIds; | ||
| private final ThreadSafeMutableRoaringBitmap _queryableDocIds; | ||
| private boolean _indexCapacityThresholdBreached; |
There was a problem hiding this comment.
Is this threshold check for all indexes/dictionaries in one segment?
Suggest to rename to _indexThresholdReached and also add javadoc.
There was a problem hiding this comment.
Yes, it is one for all check. The PR is only extracting the variable to an instance field. The name was unchanged. But I am open to rename.
|
Can you comment or add documentation on this PR what is the current behavior if the size threshold is reached? |
noob-se7en
left a comment
There was a problem hiding this comment.
Looks good. Thank you!
@Jackie-Jiang @KKcorps please help with merging the PR.
enhancementingestionInspired by #14479
Apart from the MVForwardIndex check added in that PR, this PR adds similar check for SVForwardIndex and Dictionary. The PR only adds the interface and UTs.
The actual check policy is not within this PR and could be an open question to discuss. The end criteria needs to consider
While it is tricky to make it correct as:
optimizeDictionaryenabled, Dictionary encoding could also end in immutable forward index. It would be even harder to guess- If Dictionary would be converted to ForwardIndex
- If converted, what would be the final compressed size
Some proposed policies: