-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Replace GroupIdBlock with int[] #17688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
796cf6c
to
a730466
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This should bring some perf improvement, I have seen 20% perf drop in some queries because of group id code was not compiled optimally by jit. using int[] will likely help with that.
RunLengthEncodedBlock.create( | ||
BIGINT.createFixedSizeBlockBuilder(1).writeLong(groupId).build(), | ||
block.getPositionCount())); | ||
int[] result = new int[block.getPositionCount()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be beneficial to handle this case better?
We could still have GroupIdBLock
that has two states, int[] and single value and count similar to SelectedPositions
and then do
if (groupIdsBlock.isArray()) {
// handle array
} else {
// handle rle
}
``` in the `Accumulator`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. I'm not sure the complexity will result in better performance in the 99% case where we don't have RLEs. My thought is we switch to int[]
and establish a new base line. Then in follow up work we can investigate if adding wrapper with RLE support helps.
Variable groupIdsBlock = scope.getVariable("groupIdsBlock"); | ||
loopBody.append(thisVariable.getField(stateField).invoke("setGroupId", void.class, groupIdsBlock.invoke("getGroupId", long.class, position))); | ||
Variable groupIds = scope.getVariable("groupIds"); | ||
loopBody.append(thisVariable.getField(stateField).invoke("setGroupId", void.class, groupIds.getElement(position).cast(long.class))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to also make io.trino.spi.function.GroupedAccumulatorState#setGroupId
accept int
instead of long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! That will be in some follow up work.
a730466
to
1fc7432
Compare
Group id block was used to carry the max group id to the grouped accumulator, but the max group id is now set directly with the setGroupCount method. An int[] is used because the GroupByHash implementations only create int group ids.
1fc7432
to
ca61901
Compare
Group id block was used to carry the max group id to the grouped accumulator, but the max group id is now set directly with the setGroupCount method.
An int[] is used because the GroupByHash implementations only create int group ids.
Release notes
(x) This is not user-visible or docs only and no release notes are required.