Commit 11088b9
authored
Feature/benchmark config from env (#15782)
* Read benchmark SessionConfig from env
* Set target partitions from env by default
fix
* Set batch size from env by default
* Fix batch size option for tpch ci
* Log environment variable configuration
* Document benchmarking env variable config
* Add DATAFUSION_* env config to Error: unknown command: help
Orchestrates running benchmarks against DataFusion checkouts
Usage:
./bench.sh data [benchmark] [query]
./bench.sh run [benchmark]
./bench.sh compare <branch1> <branch2>
./bench.sh venv
**********
Examples:
**********
# Create the datasets for all benchmarks in /Users/christian/MA/datafusion/benchmarks/data
./bench.sh data
# Run the 'tpch' benchmark on the datafusion checkout in /source/datafusion
DATAFUSION_DIR=/source/datafusion ./bench.sh run tpch
**********
* Commands
**********
data: Generates or downloads data needed for benchmarking
run: Runs the named benchmark
compare: Compares results from benchmark runs
venv: Creates new venv (unless already exists) and installs compare's requirements into it
**********
* Benchmarks
**********
all(default): Data/Run/Compare for all benchmarks
tpch: TPCH inspired benchmark on Scale Factor (SF) 1 (~1GB), single parquet file per table, hash join
tpch_mem: TPCH inspired benchmark on Scale Factor (SF) 1 (~1GB), query from memory
tpch10: TPCH inspired benchmark on Scale Factor (SF) 10 (~10GB), single parquet file per table, hash join
tpch_mem10: TPCH inspired benchmark on Scale Factor (SF) 10 (~10GB), query from memory
cancellation: How long cancelling a query takes
parquet: Benchmark of parquet reader's filtering speed
sort: Benchmark of sorting speed
sort_tpch: Benchmark of sorting speed for end-to-end sort queries on TPCH dataset
clickbench_1: ClickBench queries against a single parquet file
clickbench_partitioned: ClickBench queries against a partitioned (100 files) parquet
clickbench_extended: ClickBench "inspired" queries against a single parquet (DataFusion specific)
external_aggr: External aggregation benchmark
h2o_small: h2oai benchmark with small dataset (1e7 rows) for groupby, default file format is csv
h2o_medium: h2oai benchmark with medium dataset (1e8 rows) for groupby, default file format is csv
h2o_big: h2oai benchmark with large dataset (1e9 rows) for groupby, default file format is csv
h2o_small_join: h2oai benchmark with small dataset (1e7 rows) for join, default file format is csv
h2o_medium_join: h2oai benchmark with medium dataset (1e8 rows) for join, default file format is csv
h2o_big_join: h2oai benchmark with large dataset (1e9 rows) for join, default file format is csv
imdb: Join Order Benchmark (JOB) using the IMDB dataset converted to parquet
**********
* Supported Configuration (Environment Variables)
**********
DATA_DIR directory to store datasets
CARGO_COMMAND command that runs the benchmark binary
DATAFUSION_DIR directory to use (default /Users/christian/MA/datafusion/benchmarks/..)
RESULTS_NAME folder where the benchmark files are stored
PREFER_HASH_JOIN Prefer hash join algorithm (default true)
VENV_PATH Python venv to use for compare and venv commands (default ./venv, override by <your-venv>/bin/activate)
DATAFUSION_* Set the given datafusion configuration
* fmt1 parent 8f5158a commit 11088b9
File tree
10 files changed
+42
-23
lines changed- benchmarks
- src
- bin
- imdb
- tpch
- util
- datafusion/common/src
10 files changed
+42
-23
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
83 | 83 | | |
84 | 84 | | |
85 | 85 | | |
86 | | - | |
| 86 | + | |
| 87 | + | |
87 | 88 | | |
88 | 89 | | |
89 | 90 | | |
90 | 91 | | |
91 | 92 | | |
92 | 93 | | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
93 | 105 | | |
94 | 106 | | |
95 | 107 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
| 101 | + | |
101 | 102 | | |
102 | 103 | | |
103 | 104 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
189 | 189 | | |
190 | 190 | | |
191 | 191 | | |
192 | | - | |
| 192 | + | |
193 | 193 | | |
194 | 194 | | |
195 | 195 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
116 | 116 | | |
117 | 117 | | |
118 | 118 | | |
119 | | - | |
| 119 | + | |
120 | 120 | | |
121 | 121 | | |
122 | 122 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
77 | 77 | | |
78 | 78 | | |
79 | 79 | | |
80 | | - | |
| 80 | + | |
81 | 81 | | |
82 | 82 | | |
83 | 83 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
303 | 303 | | |
304 | 304 | | |
305 | 305 | | |
306 | | - | |
| 306 | + | |
307 | 307 | | |
308 | 308 | | |
309 | 309 | | |
| |||
514 | 514 | | |
515 | 515 | | |
516 | 516 | | |
517 | | - | |
| 517 | + | |
518 | 518 | | |
519 | 519 | | |
520 | 520 | | |
| |||
550 | 550 | | |
551 | 551 | | |
552 | 552 | | |
553 | | - | |
| 553 | + | |
554 | 554 | | |
555 | 555 | | |
556 | 556 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
202 | 202 | | |
203 | 203 | | |
204 | 204 | | |
205 | | - | |
| 205 | + | |
206 | 206 | | |
207 | 207 | | |
208 | 208 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
123 | 123 | | |
124 | 124 | | |
125 | 125 | | |
126 | | - | |
| 126 | + | |
127 | 127 | | |
128 | 128 | | |
129 | 129 | | |
| |||
355 | 355 | | |
356 | 356 | | |
357 | 357 | | |
358 | | - | |
| 358 | + | |
359 | 359 | | |
360 | 360 | | |
361 | 361 | | |
| |||
392 | 392 | | |
393 | 393 | | |
394 | 394 | | |
395 | | - | |
| 395 | + | |
396 | 396 | | |
397 | 397 | | |
398 | 398 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
28 | | - | |
| 28 | + | |
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
| |||
41 | 41 | | |
42 | 42 | | |
43 | 43 | | |
44 | | - | |
45 | | - | |
| 44 | + | |
| 45 | + | |
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
| |||
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
68 | | - | |
69 | | - | |
| 68 | + | |
| 69 | + | |
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | | - | |
74 | | - | |
75 | | - | |
76 | | - | |
77 | | - | |
78 | | - | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
79 | 82 | | |
80 | 83 | | |
81 | 84 | | |
82 | 85 | | |
| 86 | + | |
83 | 87 | | |
84 | 88 | | |
85 | 89 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
864 | 864 | | |
865 | 865 | | |
866 | 866 | | |
867 | | - | |
| 867 | + | |
| 868 | + | |
| 869 | + | |
868 | 870 | | |
869 | 871 | | |
870 | 872 | | |
| |||
0 commit comments