
Shell script to collect benchmarks for multiple versions #15144


Open · wants to merge 17 commits into main

Conversation

logan-keede
Contributor

Which issue does this PR close?

Rationale for this change

Here is a suggestion on how to proceed with this project:

  1. Create the converter from bench json --> line protocol (e.g. Export benchmark information as line protocol #6107)
  2. Write a script that runs the bench.sh script to gather the clickbench performance numbers over the last 5 releases:
git checkout 40.0.0
./bench.sh run clickbench_partitioned
git checkout 41.0.0
./bench.sh run clickbench_partitioned
...
git checkout 45.0.0
./bench.sh run clickbench_partitioned

And then load/plot that data using Grafana.

Originally posted by @alamb in #5504
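
For reference, here is a minimal sketch of what step 2 of that suggestion can look like as a loop. The version list and the results path below are assumptions for illustration, not necessarily what this PR's script does:

#!/usr/bin/env bash
# Sketch only: run the ClickBench benchmark for a list of release tags and
# keep each run's JSON output in a per-version directory for later
# conversion to line protocol and plotting.
set -euo pipefail

VERSIONS=("41.0.0" "42.0.0" "43.0.0" "44.0.0" "45.0.0")
RESULTS_DIR="collected_results"

for v in "${VERSIONS[@]}"; do
    git checkout "$v"
    ./bench.sh run clickbench_partitioned
    mkdir -p "$RESULTS_DIR/$v"
    # Assumption: bench.sh leaves JSON results under benchmarks/results/
    cp benchmarks/results/*/*.json "$RESULTS_DIR/$v/" || true
done
git checkout main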

What changes are included in this PR?

Just a simple script to collect benchmarks for the last 5 releases.

Are these changes tested?

Yes, by running the script on my laptop using sh collect_bench.sh tpch.

Are there any user-facing changes?

Nope.

@logan-keede
Contributor Author

One problem I sometimes encounter is that cargo resolves arrow-arith to v53.4.0 for particular releases, which ends up causing a compilation error.
I'm not sure why this happens: sometimes, for the same script and the same release, cargo picks a different version of arrow-arith and everything works fine.
If anyone knows a workaround, please let me know.
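
Two things I may try (unverified, and the version number below is only an example):

# Build with the checked-in lockfile so cargo does not re-resolve
# arrow-arith to a newer, incompatible release:
cargo build --release --locked

# Or pin the crate explicitly for a given checkout:
cargo update -p arrow-arith --precise 53.3.0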

@alamb
Contributor

alamb commented Mar 12, 2025

Sorry @logan-keede -- I am very excited to try this one out, but I ran out of time today

@logan-keede
Contributor Author

Sorry @logan-keede -- I am very excited to try this one out, but I ran out of time today

No problem at all! I look forward to your feedback whenever you have the time.

@comphead (Contributor) left a comment


Thanks @logan-keede, that is pretty neat. I would appreciate it if you documented the script and its usage in benchmarks/README.md.

@logan-keede
Contributor Author

Just noticed that benchmarks/README.md is not included in the prettier CI check, is that intended?

@alamb
Contributor

alamb commented Mar 18, 2025

Just noticed that benchmarks/README.md is not included in the prettier CI check, is that intended?

I don't think it is intended

And I am sorry I haven't had time to test out this PR yet

github-actions bot added the development-process (Related to development process of DataFusion) label Mar 19, 2025
@logan-keede
Contributor Author

logan-keede commented Mar 19, 2025

Just noticed that benchmarks/README.md is not included in the prettier CI check, is that intended?

I don't think it is intended

And I am sorry I haven't had time to test out this PR yet

I pushed a commit to add benchmarks to the prettier check list.
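
Roughly, the idea is to include the benchmarks directory in the set of Markdown paths that prettier checks; the exact workflow command and globs below are assumptions, not the actual CI config:

# Assumed shape of the prettier CI step after including benchmarks:
npx prettier --check '{docs,benchmarks}/**/*.md'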

No problem, I understand it might be a bit too lengthy and resource intensive to test.
If so, you could try limiting it to 4 threads and editing the cargo command in bench.sh to temporarily remove the release flag while handling other tasks (this may not work with other builds though), as sketched below. Otherwise, feel free to ignore this.
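
For example, one quick-test tweak along those lines (a sketch only; the real build command in bench.sh may differ, and the resulting numbers would not be representative benchmarks):

# Cap the build at 4 parallel jobs and drop --release while only
# smoke-testing the script (NOT for collecting real benchmark numbers):
cargo build --jobs 4    # instead of: cargo build --release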

alamb mentioned this pull request Apr 28, 2025
alamb changed the title from "shell script to collect Benchmarks" to "Shell script to collect benchmarks for multiple versions" Apr 30, 2025
@alamb
Contributor

alamb commented Apr 30, 2025

Thank you for this work @logan-keede -- I am sorry for the very long delay. It seems @saraghds is also working on this script so I am going to try and help it along.

I merged this branch up from main and added some more comments. I also would like to make the versions it checks a bit more configurable -- I will test and work on that over the day

@alamb
Contributor

alamb commented Apr 30, 2025

I want this script to be usable by more people, so I think it is important to document it a bit more.

I am also going to investigate potentially overriding the list of git commits to run, rather than assuming which versions to check; one possible interface is sketched below.
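
Names and defaults here are assumptions, not the script's current interface:

#!/usr/bin/env bash
# Hypothetical interface: the first argument is the benchmark name, any
# further arguments are git refs (tags, branches, or SHAs) to benchmark;
# fall back to recent release tags when none are given.
BENCH="$1"
REFS=("${@:2}")
if [ ${#REFS[@]} -eq 0 ]; then
    REFS=("41.0.0" "42.0.0" "43.0.0" "44.0.0" "45.0.0")
fi
for ref in "${REFS[@]}"; do
    git checkout "$ref"
    ./bench.sh run "$BENCH"
done
git checkout main

For example: bash collect_bench.sh tpch 43.0.0 44.0.0 main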

@alamb
Contributor

alamb commented May 27, 2025

Sorry for the long delay in feedback -- I have been working on other things.

What I really want is a way to track DataFusion performance over time during development (so per commit, not just released version numbers) so that as new changes are added to the code we can see how the overall trend is doing.

Given the initial state of this PR, the best it can do is run once a month after a release is done.

So since these changes don't really satisfy that need yet (being able to get a handle on ongoing performance), I haven't spent a lot of time reviewing / working on them.

BTW here is the kind of thing I want to see / create:
https://benchmarks.mikemccandless.com/
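
The same loop idea could walk recent commits on main instead of release tags; purely illustrative:

# Illustrative only: benchmark the last 10 commits on main so the trend
# tracks ongoing development rather than monthly releases.
for sha in $(git rev-list -n 10 origin/main); do
    git checkout "$sha"
    ./bench.sh run clickbench_partitioned
done
git checkout main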


alamb mentioned this pull request May 27, 2025
Labels
development-process (Related to development process of DataFusion)
3 participants