[enhancement] [release-note] Update get bytes to return raw bytes of string and support getBytesMV#9441
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9441 +/- ##
============================================
+ Coverage 63.53% 69.84% +6.30%
- Complexity 5072 5136 +64
============================================
Files 1836 1910 +74
Lines 98545 101832 +3287
Branches 15062 15451 +389
============================================
+ Hits 62614 71122 +8508
+ Misses 31313 25699 -5614
- Partials 4618 5011 +393
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Mostly minor comments
pinot-core/src/main/java/org/apache/pinot/core/common/DataBlockCache.java
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
| } | ||
| break; | ||
| case BYTES: | ||
| case STRING: |
There was a problem hiding this comment.
We can remove the switch and directly read the values
There was a problem hiding this comment.
I thought we don't want to support this for other types?
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
...a/org/apache/pinot/segment/local/realtime/impl/dictionary/StringOnHeapMutableDictionary.java
Outdated
Show resolved
Hide resolved
| String unpaddedString = getUnpaddedString(i, buffer); | ||
| _unpaddedStrings[i] = unpaddedString; | ||
| _unPaddedStringToIdMap.put(unpaddedString, i); | ||
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); |
There was a problem hiding this comment.
Only read the unpadded bytes, and directly create unpaddedString from the unpadded bytes for better performance
There was a problem hiding this comment.
The performance will not be good because we have do encode every time if we want to read unpadded string? Or you are talking about space efficiency?
There was a problem hiding this comment.
I'm talking about the bytes/string reading during the construction
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); | |
| byte[] unpaddedBytes = getUnpaddedBytes(i, buffer); | |
| String unpaddedString = new String(unpaddedBytes, UTF_8); | |
| ... |
...cal/src/main/java/org/apache/pinot/segment/local/segment/index/readers/StringDictionary.java
Outdated
Show resolved
Hide resolved
...egment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableForwardIndex.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/utils/BytesUtils.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataBlockCache.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
pinot-core/src/test/java/org/apache/pinot/core/common/DataFetcherTest.java
Outdated
Show resolved
Hide resolved
pinot-core/src/test/java/org/apache/pinot/core/common/DataFetcherTest.java
Show resolved
Hide resolved
| String unpaddedString = getUnpaddedString(i, buffer); | ||
| _unpaddedStrings[i] = unpaddedString; | ||
| _unPaddedStringToIdMap.put(unpaddedString, i); | ||
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); |
There was a problem hiding this comment.
I'm talking about the bytes/string reading during the construction
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); | |
| byte[] unpaddedBytes = getUnpaddedBytes(i, buffer); | |
| String unpaddedString = new String(unpaddedBytes, UTF_8); | |
| ... |
.../apache/pinot/segment/local/segment/index/readers/ImmutableDictionaryTypeConversionTest.java
Outdated
Show resolved
Hide resolved
.../apache/pinot/segment/local/segment/index/readers/ImmutableDictionaryTypeConversionTest.java
Outdated
Show resolved
Hide resolved
...-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/ForwardIndexReader.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/pinot/core/operator/transform/function/BaseTransformFunction.java
Outdated
Show resolved
Hide resolved
…string and support getBytesMV (apache#9441) * get raw bytes for string value column * get raw bytes for string value column * support getBytesMV * fix fail test * fix style * Address sytle commets * address comments
This change is backward incompatible: