@joseph-isaacs
Contributor

Previously, all benchmarks accepted a single CommonArgs struct via flatten, which meant benchmarks would silently accept arguments they didn't use (e.g., fineweb accepting --scale-factor, statpopgen accepting --use-remote-data-dir).

This refactor splits CommonArgs into focused, semantically-related groups:

  • CoreArgs: execution basics (iterations, threads, verbose, tracing)
  • QueryFilterArgs: query selection (queries, exclude_queries)
  • OutputArgs: display configuration (display_format, output_path, hide_progress_bar)
  • EngineArgs: engine settings (disable_datafusion_cache, delete_duckdb_database)
  • DebugArgs: debugging/analysis (export_spans, show_metrics, emit_plan, track_memory, explain, explain_analyze)
  • DataArgs: data generation (skip_generate)
  • RemoteDataArgs: remote data configuration (use_remote_data_dir)

Benchmarks now explicitly flatten only the groups they support:

  • TPC-H/TPC-DS/ClickBench/Fineweb/GhArchive: all groups (including RemoteDataArgs)
  • StatPopGen: all except RemoteDataArgs (doesn't support remote data)

Benefits:

  • Compile-time safety: benchmarks can only access args they explicitly include
  • Self-documenting: struct definition shows what each benchmark supports
  • clap rejects unknown args: users get immediate feedback if they pass unsupported args
  • No runtime validation needed: architectural guarantee prevents unused args

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
@codspeed-hq

codspeed-hq bot commented Nov 20, 2025

CodSpeed Performance Report

Merging #5425 will improve performance by 26.8%

Comparing ji/query_bench_val_input (617c9bd) with develop (aaf5245) [1]

Summary

⚡ 7 improvements
✅ 1438 untouched
🆕 7 new
⏩ 729 skipped [2]
🗄️ 28 archived benchmarks run [3]

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| take_indices[(1000, 16, 0.005)] | 24.1 µs | 21.7 µs | +10.91% |
| take_indices[(1000, 16, 0.01)] | 24.2 µs | 21.9 µs | +10.83% |
| take_indices[(1000, 256, 0.005)] | 23.9 µs | 21.5 µs | +11.02% |
| take_indices[(1000, 256, 0.01)] | 24.1 µs | 21.7 µs | +10.91% |
| take_indices[(1000, 256, 0.03)] | 24.5 µs | 22.1 µs | +10.58% |
| take_indices[(10000, 256, 0.005)] | 26 µs | 23.7 µs | +10% |
| rebuild_naive | 1.3 ms | 1 ms | +26.8% |
| 🆕 decompress[("alp_for_bp_f64", 0x464da70)] | N/A | 24.2 ms | N/A |
| 🆕 decompress[("datetime_for_bp", 0x4650930)] | N/A | 34.9 ms | N/A |
| 🆕 decompress[("dict_fsst_varbin_bp_string", 0x464fc70)] | N/A | 14.5 ms | N/A |
| 🆕 decompress[("dict_fsst_varbin_string", 0x464f7d0)] | N/A | 14.5 ms | N/A |
| 🆕 decompress[("dict_varbinview_string", 0x464e490)] | N/A | 14.7 ms | N/A |
| 🆕 decompress[("for_bp_u64", 0x464d320)] | N/A | 2.5 ms | N/A |
| 🆕 decompress[("runend_for_bp_u32", 0x464e920)] | N/A | 2 ms | N/A |

Footnotes

  1. No successful run was found on develop (71f0f00) during the generation of this report, so aaf5245 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 729 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  3. 28 benchmarks were run, but are now archived. If they were deleted in another branch, consider rebasing to remove them from the report. If they were added back instead, click here to restore them.
