minor: Add benchmark query and corresponding documentation for Average Duration #16105

Merged
merged 3 commits into from
May 19, 2025

logan-keede
Contributor

Which issue does this PR close?

Rationale for this change

Adds a benchmark query to the extended ClickBench queries.
The query measures the performance of AVG over Duration values.

What changes are included in this PR?

Are these changes tested?

Yes, by running them on my machine.

Are there any user-facing changes?

Nope.

@logan-keede
Contributor Author

cc @alamb

@logan-keede changed the title from "ADD query and documentation" to "minor: Add benchmark query and corresponding documentation for Average Duration" on May 19, 2025
Contributor

@alamb alamb left a comment

Thank you @logan-keede -- this looks really nice to me. Can't wait to see how much faster this query is on #15748 from @shruti2522

cc @shruti2522


**Question**: Which combinations of operating system, region, and user agent exhibit the highest average latency? For each of these combinations, also report the average response time.

**Important Query Properties**: Multiple average of Duration, high cardinality grouping
Contributor

this is great -- thank you @logan-keede

| 13631 | 82 | 45 | 0 days 7 hours 23 mins 1.000000000 secs | 0 days 7 hours 23 mins 1.000000000 secs |
+----------+-----------+-----+------------------------------------------+------------------------------------------+
10 row(s) fetched.
Elapsed 30.195 seconds.
Contributor

oof -- love that it takes 30 seconds. The new accumulator will be so much faster

@alamb alamb marked this pull request as ready for review May 19, 2025 17:45
@alamb
Contributor

alamb commented May 19, 2025

I am just running this quickly locally and then I'll merge it in

@alamb
Contributor

alamb commented May 19, 2025

./bench.sh run clickbench_extended
...
Q8: SELECT "RegionID", "UserAgent", "OS", AVG(to_timestamp("ResponseEndTiming")-to_timestamp("ResponseStartTiming")) as avg_response_time, AVG(to_timestamp("ResponseEndTiming")-to_timestamp("ConnectTiming")) as avg_latency FROM hits GROUP BY "RegionID", "UserAgent", "OS" ORDER BY avg_latency DESC limit 10;
Query 8 iteration 0 took 712.4 ms and returned 10 rows
Query 8 iteration 1 took 674.1 ms and returned 10 rows
Query 8 iteration 2 took 705.6 ms and returned 10 rows
Query 8 iteration 3 took 686.4 ms and returned 10 rows
Query 8 iteration 4 took 693.0 ms and returned 10 rows
Query 8 avg time: 694.29 ms
Done
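
For readers who want to see what Q8 computes, here is a minimal Python sketch of the same logic. It is an illustration only, not DataFusion's implementation: the helper `top_avg_latency` and the sample rows are hypothetical, the integer timing columns are assumed to be seconds since the Unix epoch, and NULL handling is omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def to_timestamp(seconds: int) -> datetime:
    """Assumption: integers are seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def top_avg_latency(rows, limit=10):
    """Hypothetical helper mirroring Q8: group, average two durations, sort, limit."""
    # Per-group running totals: [sum of response time, sum of latency, row count]
    totals = defaultdict(lambda: [timedelta(0), timedelta(0), 0])
    for row in rows:
        key = (row["RegionID"], row["UserAgent"], row["OS"])
        end = to_timestamp(row["ResponseEndTiming"])
        acc = totals[key]
        acc[0] += end - to_timestamp(row["ResponseStartTiming"])
        acc[1] += end - to_timestamp(row["ConnectTiming"])
        acc[2] += 1

    # AVG = sum / count; dividing a timedelta by an int yields a timedelta
    result = [(key, resp / n, lat / n) for key, (resp, lat, n) in totals.items()]
    # ORDER BY avg_latency DESC LIMIT <limit>
    result.sort(key=lambda item: item[2], reverse=True)
    return result[:limit]


# Made-up sample rows, not taken from the hits dataset
rows = [
    {"RegionID": 1, "UserAgent": 7, "OS": 2,
     "ConnectTiming": 10, "ResponseStartTiming": 20, "ResponseEndTiming": 50},
    {"RegionID": 1, "UserAgent": 7, "OS": 2,
     "ConnectTiming": 0, "ResponseStartTiming": 40, "ResponseEndTiming": 60},
    {"RegionID": 2, "UserAgent": 3, "OS": 1,
     "ConnectTiming": 5, "ResponseStartTiming": 6, "ResponseEndTiming": 10},
]

for key, avg_response_time, avg_latency in top_avg_latency(rows):
    print(key, avg_response_time, avg_latency)
```

Each group keeps only a running sum and a count per average, so the cost is dominated by the number of distinct (RegionID, UserAgent, OS) groups, which is the high-cardinality property the query is meant to stress.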

@alamb alamb merged commit ca46932 into apache:main May 19, 2025
27 checks passed
Development

Successfully merging this pull request may close these issues.

Add Extended Clickbench benchmark for avg(duration)
2 participants