-
Notifications
You must be signed in to change notification settings - Fork 3.9k
ARROW-4147: [Java] reduce heap usage for varwidth vectors #3298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pravindra
commented
Jan 3, 2019
- some code reorg to avoid duplication
- changed the default initial alloc from 4096 to 3970
- some code reorg to avoid duplication - changed the default initial alloc from 4096 to 3970
Codecov Report
@@ Coverage Diff @@
## master #3298 +/- ##
==========================================
- Coverage 88.58% 88.58% -0.01%
==========================================
Files 546 546
Lines 73059 73057 -2
==========================================
- Hits 64721 64719 -2
Misses 8233 8233
Partials 105 105
Continue to review full report at Codecov.
|
|
addressed review comments in the parent PR #3246 |
| /* | ||
| * For all fixed width vectors, the value and validity buffers are sliced from a single buffer. | ||
| * Similarly, for variable width vectors, the offsets and validity buffers are sliced from a | ||
| * single buffer. To ensure the single buffer is power-of-2 size, the initial value allocation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what exactly is the value capacity we are allocating for. If the joint buffer that will be sliced has first 512 bytes for bitmap then we are having a value count of 4096 for the validity buffer and something less for data buffer. I remember that validity and data buffer move together (alloc, realloc and then load). Is that not true now and the value capacity of each buffer will vary slightly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should change the vector.getValueCapacity() method to now take min (validity buffer value capacity, data buffer value capacity) since these buffers no longer move together.
Initially when default starting value capacity was 4096, we allocated 512 bytes of validity buffer and let's sat 4096*4 = 16KB of data buffer for an IntVector. Now if we stick to the same numbers for the join buffer allocation, it is evident that we will over-allocate due to power of 2 semantics since 512 + 16KB will ask for a chunk of 32KB.
It seems like in the revised scheme we are still keeping 512 bytes for validity buffer (what I see from comments), so for the above example, we would be asking 512 + 3970*4 => something more than 16KB so still doesn't help.
However, if we validity size is 3970/8 = 496 and data buffer size is 3970*4, then we will allocate 16KB chunk since the compound size is just under 16KB. From your comments, it implies that we are still keeping 512 bytes for validity buffer so I am a bit confused here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the confusion - I meant to say approx. 512. I've fixed the comment now (it's 504 bytes for the bitmap). Also, added a test to verify that the additional allocation is < 5%.
Regarding the other questions
- the validity and value buffers still move together. in both alloc/realloc, we allocate a single buffer and slice into two.
- the allocated value capacity can be slightly > INITIAL_VALUE_ALLOCATION, depending on the vector type.
- the vector.getValueCapacity() already takes a minimum of validity buffer capacity and value buffer capacity (even before my change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks
|
I'm fine in general on this but it would be great if @siddharthteotia could go through in detail. |
| // for boolean type, value-buffer and validity-buffer are of same size. | ||
| bufferSize *= 2; | ||
| } else { | ||
| bufferSize += roundUp8(valueCount * typeWidth); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data buffer size should be evenly divisible by the type size right? Why are we using the round up to multiple of 8 for all types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the bitmap slice starts after the data slice. we need the bitmap to be on a 8-byte boundary - allows us to treat the bitmap as an array of uint64 (in dremio pivot/unpivot, in gandiva).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
siddharthteotia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
thanks @siddharthteotia |
* ARROW-4147: reduce heap usage for varwidth vectors - some code reorg to avoid duplication - changed the default initial alloc from 4096 to 3970 * ARROW-4147: [Java] Address review comments * ARROW-4147: remove check on width to be <= 16: * ARROW-4147: allow initial valueCount to be 0. * ARROW-4147: Fix incorrect comment on initial alloc