refactor(bench): use composable argument groups to prevent unused args #5425

joseph-isaacs · 2025-11-20T17:08:42Z

Previously, all benchmarks accepted a single CommonArgs struct via flatten, which meant benchmarks would silently accept arguments they didn't use (e.g., fineweb accepting --scale-factor, statpopgen accepting --use-remote-data-dir).

This refactor splits CommonArgs into focused, semantically-related groups:

- CoreArgs: execution basics (iterations, threads, verbose, tracing)
- QueryFilterArgs: query selection (queries, exclude_queries)
- OutputArgs: display configuration (display_format, output_path, hide_progress_bar)
- EngineArgs: engine settings (disable_datafusion_cache, delete_duckdb_database)
- DebugArgs: debugging/analysis (export_spans, show_metrics, emit_plan, track_memory, explain, explain_analyze)
- DataArgs: data generation (skip_generate)
- RemoteDataArgs: remote data configuration (use_remote_data_dir)

Benchmarks now explicitly flatten only the groups they support:

- TPC-H/TPC-DS/ClickBench/Fineweb/GhArchive: all groups (including RemoteDataArgs)
- StatPopGen: all except RemoteDataArgs (doesn't support remote data)

Benefits:

- Compile-time safety: benchmarks can only access args they explicitly include
- Self-documenting: struct definition shows what each benchmark supports
- clap rejects unknown args: users get immediate feedback if they pass unsupported args
- No runtime validation needed: architectural guarantee prevents unused args

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

codspeed-hq · 2025-11-20T17:17:05Z

CodSpeed Performance Report

Merging #5425 will improve performance by 26.8%

Comparing ji/query_bench_val_input (617c9bd) with develop (aaf5245)¹

Summary

⚡ 7 improvements
✅ 1438 untouched
🆕 7 new
⏩ 729 skipped²
🗄️ 28 archived benchmarks run³

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Change |
|---|---|---|---|---|
| ⚡ | `take_indices[(1000, 16, 0.005)]` | 24.1 µs | 21.7 µs | +10.91% |
| ⚡ | `take_indices[(1000, 16, 0.01)]` | 24.2 µs | 21.9 µs | +10.83% |
| ⚡ | `take_indices[(1000, 256, 0.005)]` | 23.9 µs | 21.5 µs | +11.02% |
| ⚡ | `take_indices[(1000, 256, 0.01)]` | 24.1 µs | 21.7 µs | +10.91% |
| ⚡ | `take_indices[(1000, 256, 0.03)]` | 24.5 µs | 22.1 µs | +10.58% |
| ⚡ | `take_indices[(10000, 256, 0.005)]` | 26 µs | 23.7 µs | +10% |
| ⚡ | `rebuild_naive` | 1.3 ms | 1 ms | +26.8% |
| 🆕 | `decompress[("alp_for_bp_f64", 0x464da70)]` | N/A | 24.2 ms | N/A |
| 🆕 | `decompress[("datetime_for_bp", 0x4650930)]` | N/A | 34.9 ms | N/A |
| 🆕 | `decompress[("dict_fsst_varbin_bp_string", 0x464fc70)]` | N/A | 14.5 ms | N/A |
| 🆕 | `decompress[("dict_fsst_varbin_string", 0x464f7d0)]` | N/A | 14.5 ms | N/A |
| 🆕 | `decompress[("dict_varbinview_string", 0x464e490)]` | N/A | 14.7 ms | N/A |
| 🆕 | `decompress[("for_bp_u64", 0x464d320)]` | N/A | 2.5 ms | N/A |
| 🆕 | `decompress[("runend_for_bp_u32", 0x464e920)]` | N/A | 2 ms | N/A |

¹ No successful run was found on develop (71f0f00) during the generation of this report, so aaf5245 was used instead as the comparison base. This report may therefore include changes unrelated to this pull request.
² 729 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
³ 28 benchmarks were run, but are now archived. If they were deleted in another branch, consider rebasing to remove them from the report. If instead they were added back, restore them.
