Always use Unsafe as the default PinotBufferFactory#13639
Always use Unsafe as the default PinotBufferFactory#13639gortiz merged 1 commit intoapache:masterfrom
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #13639 +/- ##
============================================
+ Coverage 61.75% 61.99% +0.24%
+ Complexity 207 198 -9
============================================
Files 2436 2553 +117
Lines 133233 140468 +7235
Branches 20636 21849 +1213
============================================
+ Hits 82274 87085 +4811
- Misses 44911 46749 +1838
- Partials 6048 6634 +586
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
e8159fe to
93a2108
Compare
|
@gortiz - how will this work for existing segments in production that currently LBuffer writer and therefore depend on the corresponding reader ? I see that in #13545 you mentioned about incrementally working towards deleting LArray starting by making the new implementation as default one. I understand that new segments will pick up the new writer (Unsafe based). But how do we take care of rebuilding existing segments ? We need to have an approach that will work for production users. Otherwise we will always have to keep the Larray reader code around. May be we should do something like we did for segment v2 -> v3 format migration? IIRC, during pre-processor we detected V2 and moved to V3. Thoughts @vvivekiyer @Jackie-Jiang ? |
|
Good question. The answer is... it doesn't matter 😄 . The library used to implement buffers does not require or modify the buffer on disk. I mean, a segment might have been created using LArray and can be read using Unsafe and the other way around. These libraries do not modify the content on disk in any way. They just change the jvm/system calls used to load the content on memory. Basically LArray uses some JNI+sun.misc.Unsafe calls to allocate and load memory while Unsafe just use sun.misc.Unsafe. But in the end both libraries end up mmaping the content of a file indicated with an offset and length. |
vvivekiyer
left a comment
There was a problem hiding this comment.
@siddharthteotia Verified that this will not cause any issues for us. We already have many hosts running java 17 and using Unsafe.
We have been using UnsafePinotBufferFactory for a while with no regression detected. I think it is a good time to start working on the plan to remove LArray dependency (see #12810)
This PR changes the default PinotBufferFactory used in Pinot. Before we were using LArrayPinotBufferFactory on Java < 16 but give LArray doesn't work in Java >= 16, we used UnsafePinotBufferFactory in newer JVMs. With this PR, we always use UnsafePinotBufferFactory.
Users that want to still use LArrayPinotBufferFactory can do that by providing the following config to Pinot:
Remember that this is not recommended given LArray implementation has been proved to be incorrect several times (see for example https://github.com/apache/pinot/pull/10774).\
A side effect of this PR is that Pinot should be able to be executed (and probably compiled) in Mac computers using aarch64 (like M1, M2, etc) without having to enable Rosetta. We need to open a parallel PR to change this doc page and probably this one. There are still some issues in ARMs (AFAIR CLP lib does not include a binary for Mac + arm64, @suddendust may know more about that) and maybe other build pages that may not be updated.