Make ORDER BY tuple almost as fast as ORDER BY columns#34060

Merged
kitaisreal merged 4 commits into ClickHouse:master from amosbird:optimizetupleorderby on Jan 29, 2022

Collaborator

@amosbird amosbird commented Jan 27, 2022

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Make ORDER BY tuple almost as fast as ORDER BY columns. We already have special optimizations for multi-column ORDER BY (#10831); it is beneficial to apply them to tuple columns as well.

Before:

select * from numbers(300000000) order by (1 - number , number + 1 , number) limit 10;
2.613 sec.

After:

select * from numbers(300000000) order by (1 - number , number + 1 , number) limit 10;
0.755 sec

No tuple:

select * from numbers(300000000) order by 1 - number , number + 1 , number limit 10;
0.755 sec

The two forms are still not 100% equivalent, because other optimizations such as optimize_monotonous_functions_in_order_by are not applied to the inner columns of a tuple. But it's good enough.
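
The reason a tuple key can reuse the multi-column path is that tuples compare lexicographically, element by element, which gives the same order as comparing the element columns one after another. A minimal standalone sketch of that equivalence follows. It is not ClickHouse code: `Keys`, `makeKeys`, `lessAsTuple`, `lessAsColumns`, and `topN` are hypothetical names, and all keys are simplified to signed 64-bit integers, which is an assumption of the sketch rather than the types ClickHouse would infer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical illustration, not ClickHouse code.
// The three sort keys from the benchmark query, computed for one `number`.
// Assumption: every key is modelled as a signed 64-bit integer.
struct Keys
{
    std::int64_t a; // 1 - number
    std::int64_t b; // number + 1
    std::int64_t c; // number
};

inline Keys makeKeys(std::int64_t number)
{
    return Keys{1 - number, number + 1, number};
}

// ORDER BY (a, b, c): compare the keys as one tuple value.
inline bool lessAsTuple(std::int64_t lhs, std::int64_t rhs)
{
    const Keys l = makeKeys(lhs);
    const Keys r = makeKeys(rhs);
    return std::make_tuple(l.a, l.b, l.c) < std::make_tuple(r.a, r.b, r.c);
}

// ORDER BY a, b, c: compare column by column, stopping at the first difference.
inline bool lessAsColumns(std::int64_t lhs, std::int64_t rhs)
{
    const Keys l = makeKeys(lhs);
    const Keys r = makeKeys(rhs);
    if (l.a != r.a)
        return l.a < r.a;
    if (l.b != r.b)
        return l.b < r.b;
    return l.c < r.c;
}

// Equivalent of `SELECT number FROM numbers(count) ORDER BY ... LIMIT limit`
// for the given comparator, using a partial sort.
template <typename Less>
std::vector<std::int64_t> topN(std::int64_t count, std::size_t limit, Less less)
{
    std::vector<std::int64_t> numbers;
    for (std::int64_t i = 0; i < count; ++i)
        numbers.push_back(i);

    limit = std::min(limit, numbers.size());
    const auto middle = numbers.begin() + static_cast<std::ptrdiff_t>(limit);
    std::partial_sort(numbers.begin(), middle, numbers.end(), less);
    numbers.resize(limit);
    return numbers;
}
```

Both comparators define the same ordering, so `topN(n, 10, lessAsTuple)` and `topN(n, 10, lessAsColumns)` return the same rows; any difference between the two forms is in how fast the comparison runs, not in the result.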

Information about CI checks: https://clickhouse.tech/docs/en/development/continuous-integration/

@robot-clickhouse robot-clickhouse added the pr-performance Pull request with some performance improvements label Jan 27, 2022
@kitaisreal kitaisreal self-assigned this Jan 27, 2022
Comment thread src/Interpreters/sortBlock.cpp Outdated
ErrorCodes::BAD_COLLATION);
}
if (const auto * tuple = typeid_cast<const ColumnTuple *>(column))
flattenTupleColumnRecursively(res, tuple, &description[i], isColumnConst(*column));
Contributor


I am not sure we need this many changes.
For each column in the ColumnTuple we can create its own ColumnSortDescription by copying the input ColumnSortDescription and removing the Collation where it does not apply (for non-String types); the rest of the code will then work as expected.
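
A rough sketch of that suggestion, as a standalone toy model rather than the actual ClickHouse implementation. `Column`, `SortDesc`, `ColumnWithDesc`, and `flatten` are hypothetical stand-ins for the real column classes, the per-column sort description, and the flattening step; a collation is modelled as a plain string. The self-referential `std::vector<Column>` member assumes C++17.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for the real classes; every name here is hypothetical.
enum class Kind { Number, String, Tuple };

struct Column
{
    Kind kind;
    std::string name;
    std::vector<Column> elements; // used only when kind == Kind::Tuple
};

struct SortDesc
{
    int direction = 1;       // 1 = ascending, -1 = descending
    int nulls_direction = 1;
    std::string collation;   // empty means "no collation"
};

using ColumnWithDesc = std::pair<const Column *, SortDesc>;

// Replace a (possibly nested) tuple column by its leaf columns.
// Each leaf gets a copy of the tuple's sort description; the collation
// is dropped for leaves that are not strings.
void flatten(const Column & column, const SortDesc & desc, std::vector<ColumnWithDesc> & out)
{
    if (column.kind == Kind::Tuple)
    {
        for (const Column & element : column.elements)
            flatten(element, desc, out);
        return;
    }

    SortDesc copy = desc;
    if (column.kind != Kind::String)
        copy.collation.clear();
    out.emplace_back(&column, copy);
}
```

After flattening, the existing multi-column sorting code sees only ordinary columns, each with its own description, so it needs no tuple-specific handling.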

@kitaisreal kitaisreal merged commit f345302 into ClickHouse:master Jan 29, 2022