Skip to content

Replace GroupIdBlock with int[] #17688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 6, 2023
Merged

Conversation

dain
Copy link
Member

@dain dain commented May 29, 2023

Group id block was used to carry the max group id to the grouped accumulator, but the max group id is now set directly with the setGroupCount method.
An int[] is used because the GroupByHash implementations only create int group ids.

Release notes

(x) This is not user-visible or docs only and no release notes are required.

@dain dain requested a review from martint May 29, 2023 22:04
@cla-bot cla-bot bot added the cla-signed label May 29, 2023
@dain dain mentioned this pull request May 29, 2023
19 tasks
@dain dain force-pushed the remove-group-id-block branch from 796cf6c to a730466 Compare May 30, 2023 00:55
Copy link
Member

@lukasz-stec lukasz-stec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This should bring some perf improvement, I have seen 20% perf drop in some queries because of group id code was not compiled optimally by jit. using int[] will likely help with that.

RunLengthEncodedBlock.create(
BIGINT.createFixedSizeBlockBuilder(1).writeLong(groupId).build(),
block.getPositionCount()));
int[] result = new int[block.getPositionCount()];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be beneficial to handle this case better?
We could still have GroupIdBLock that has two states, int[] and single value and count similar to SelectedPositions and then do

if (groupIdsBlock.isArray()) {
  // handle array
} else {
  // handle rle
}
``` in the `Accumulator`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I'm not sure the complexity will result in better performance in the 99% case where we don't have RLEs. My thought is we switch to int[] and establish a new base line. Then in follow up work we can investigate if adding wrapper with RLE support helps.

Variable groupIdsBlock = scope.getVariable("groupIdsBlock");
loopBody.append(thisVariable.getField(stateField).invoke("setGroupId", void.class, groupIdsBlock.invoke("getGroupId", long.class, position)));
Variable groupIds = scope.getVariable("groupIds");
loopBody.append(thisVariable.getField(stateField).invoke("setGroupId", void.class, groupIds.getElement(position).cast(long.class)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to also make io.trino.spi.function.GroupedAccumulatorState#setGroupId accept int instead of long?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! That will be in some follow up work.

@dain dain force-pushed the remove-group-id-block branch from a730466 to 1fc7432 Compare June 1, 2023 23:00
Group id block was used to carry the max group id to the grouped
accumulator, but the max group id is now set directly with the
setGroupCount method.
An int[] is used because the GroupByHash implementations only create int
group ids.
@dain dain force-pushed the remove-group-id-block branch from 1fc7432 to ca61901 Compare June 5, 2023 02:54
@dain dain merged commit 1657e52 into trinodb:master Jun 6, 2023
@dain dain deleted the remove-group-id-block branch June 6, 2023 03:47
@github-actions github-actions bot added this to the 420 milestone Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants