Closed
Labels
enhancement (New feature or request)
Description
Is your feature request related to a problem or challenge?
DataFusion currently has four different benchmark runners for comparing different versions of DataFusion against each other. They overlap substantially in both functionality and code:
cargo run --release --bin nyctaxi
cargo run --release --bin h2o
cargo run --release --bin parquet
cargo run --release --bin tpch
I would like to add a fifth one (ClickBench, #6994), but I would like to avoid copying and pasting mostly the same code yet again.
Describe the solution you'd like
Thus I propose a new consolidated binary, `dfbench`, that takes the benchmark name as a subcommand and is run like this:
cargo run --release --bin dfbench -- nyctaxi
cargo run --release --bin dfbench -- h2o
cargo run --release --bin dfbench -- parquet
cargo run --release --bin dfbench -- tpch
cargo run --release --bin dfbench -- clickbench (I want to add this)
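To make the proposal concrete, here is a minimal sketch of one binary that dispatches on its first argument. It uses only the standard library, and the `Benchmark` enum, `parse`, and `run` names are hypothetical placeholders, not the actual DataFusion code. A real implementation would likely use an argument-parsing crate and call into each benchmark's module.

```rust
use std::env;
use std::process::ExitCode;

/// Benchmarks the consolidated binary would know about.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Benchmark {
    NycTaxi,
    H2o,
    Parquet,
    Tpch,
    ClickBench,
}

impl Benchmark {
    /// Map the subcommand name to a benchmark, if it is known.
    fn parse(name: &str) -> Option<Self> {
        match name {
            "nyctaxi" => Some(Self::NycTaxi),
            "h2o" => Some(Self::H2o),
            "parquet" => Some(Self::Parquet),
            "tpch" => Some(Self::Tpch),
            "clickbench" => Some(Self::ClickBench),
            _ => None,
        }
    }

    /// Placeholder for the real runner: only describes what would run.
    fn run(self, rest: &[String]) -> String {
        format!("would run {:?} with args {:?}", self, rest)
    }
}

fn main() -> ExitCode {
    // Skip the program name; the first remaining argument is the subcommand.
    let args: Vec<String> = env::args().skip(1).collect();
    match args.split_first() {
        Some((name, rest)) => match Benchmark::parse(name) {
            Some(bench) => {
                println!("{}", bench.run(rest));
                ExitCode::SUCCESS
            }
            None => {
                eprintln!("unknown benchmark: {name}");
                ExitCode::FAILURE
            }
        },
        None => {
            eprintln!("usage: <binary> <nyctaxi|h2o|parquet|tpch|clickbench> [options]");
            ExitCode::FAILURE
        }
    }
}
```

With this shape, adding ClickBench means adding one enum variant and one match arm rather than a whole new binary.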
Describe alternatives you've considered
Follow the existing pattern and add a separate binary for each new benchmark.
Additional context
Task List
- Make consolidated runner `dfbench`: Create `dfbench`, split up `tpch` benchmark runner into modules #7054
- consolidate tpch: Create `dfbench`, split up `tpch` benchmark runner into modules #7054
- Update the benchmark readme
- consolidate nyctaxi
- consolidate h2o -- tracked by Add H2O.ai Database-like Ops benchmark to `dfbench` #7209
- consolidate parquet
- Update bench.sh for new commands
- Move tpch queries into `queries/tpch` rather than `queries`
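
The last task, giving each benchmark its own query directory, could look like the sketch below. The `query_path` helper and the `q<N>.sql` file naming are assumptions made for illustration. The issue only states that tpch queries should live under `queries/tpch`.

```rust
use std::path::PathBuf;

/// Build the path to a query file under a per-benchmark directory.
/// The `q<N>.sql` naming is an assumption, not taken from the issue.
fn query_path(benchmark: &str, query: usize) -> PathBuf {
    PathBuf::from("queries")
        .join(benchmark)
        .join(format!("q{query}.sql"))
}

fn main() {
    // With per-benchmark directories, tpch and clickbench queries cannot collide.
    println!("{}", query_path("tpch", 1).display());
    println!("{}", query_path("clickbench", 1).display());
}
```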