feat: add benchmark #101
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@ Coverage Diff @@
## main #101 +/- ##
=======================================
Coverage 94.48% 94.48%
=======================================
Files 3 3
Lines 145 145
=======================================
Hits 137 137
Misses 8 8
```

☔ View full report in Codecov by Sentry.
Caution: Review failed. The pull request is closed.

Walkthrough

Adds a benchmarking suite: a Python benchmark script, two pre-commit hook configs, docs, a GitHub Actions workflow, and .gitignore entries.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
actor User
participant Script as testing/benchmark_hooks.py
participant Remote as RemoteRepo
participant Git as git
participant PreCommit as pre-commit
participant FS as Filesystem
User->>Script: run script
Script->>Remote: git_clone() -> populate `testing/test-examples` (errors ignored)
Script->>Script: discover TARGET_FILES, set HOOKS & REPEATS
loop per HOOK
Note right of Script #D3E4CD: select hook config
loop REPEATS times
Script->>PreCommit: run --config <hook.config> --files <targets> (timed)
PreCommit->>FS: lint/format files (may run `ruff --fix`)
PreCommit-->>Script: return status (errors swallowed)
Script->>Git: safe_git_restore(files) -> restore tracked state
Script-->>Script: record elapsed time
end
end
Script->>FS: write `testing/benchmark_results.txt`
Script-->>User: print per-run timings and aggregated stats
```
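For orientation, the sketch below condenses the flow in the diagram into plain Python. It is a reconstruction from the fragments quoted later in this review, not the merged script: the config filenames (`testing/benchmark_hook_1.yaml`, `testing/benchmark_hook_2.yaml`) and the five-repeat default are taken from the review comments, and error handling/file restoration are omitted for brevity.

```python
import glob
import statistics
import subprocess
import time


def run_hook(config, files):
    # Time one pre-commit invocation; a failing hook still counts toward wall time.
    start = time.perf_counter()
    subprocess.run(["pre-commit", "run", "--config", config, "--files", *files], check=False)
    return time.perf_counter() - start


def main():
    # Populate testing/test-examples, discover targets, then time each hook config.
    subprocess.run(
        ["git", "clone", "--depth", "1",
         "https://github.com/gouravthakur39/beginners-C-program-examples.git",
         "testing/test-examples"],
        check=False,
    )
    targets = sorted(glob.glob("testing/test-examples/**/*.c", recursive=True))
    hooks = {
        "cpp-linter-hooks": "testing/benchmark_hook_1.yaml",
        "mirrors-clang-format": "testing/benchmark_hook_2.yaml",
    }
    with open("testing/benchmark_results.txt", "w") as f:
        for name, config in hooks.items():
            times = [run_hook(config, targets) for _ in range(5)]
            f.write(f"{name}: mean={statistics.mean(times):.3f}s "
                    f"min={min(times):.3f}s max={max(times):.3f}s\n")


if __name__ == "__main__":
    main()
```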
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 4
🧹 Nitpick comments (7)
docs/benchmark.md (2)
Lines 5-9: Document prerequisites before the command. Add a short prerequisites list so users don't hit missing-tool errors.

```diff
 ## Running the Benchmark
+
+Prerequisites:
+
+- Python 3.8+
+- git (2.23+ recommended for `git restore`)
+- pre-commit in PATH (`pipx install pre-commit` or `pip install pre-commit`)
```

Lines 17-18: Grammar: plural "benchmarks". Minor wording polish.

```diff
-- Run benchmark with GitHub Actions for continuous integration.
+- Run benchmarks with GitHub Actions for continuous integration.
```
testing/benchmark_hooks.py (5)
Line 1: Shebang vs. invocation mismatch. Either make the file executable and document `./testing/benchmark_hooks.py`, or drop the shebang since the docs use `python3 ...`. Follow-up: if you keep the shebang, add `chmod +x testing/benchmark_hooks.py` to the docs; otherwise remove the shebang.
Lines 5-14: Align usage text with the actual path and clarify targets. The docstring says `python benchmark_hooks.py`; the repo path is `testing/benchmark_hooks.py`. Also reflect auto-discovery of targets.

```diff
-Usage:
-    python benchmark_hooks.py
+Usage:
+    python3 testing/benchmark_hooks.py
@@
-- Target files: testing/main.c (or adjust as needed)
+- Target files: auto-discovered under testing/ (C/C++ sources/headers)
```
Lines 40-42: Allow a REPEATS override via an environment variable. Keeps the default but enables quick tuning without editing code.

```diff
-REPEATS = 5
+REPEATS = int(os.getenv("BENCH_REPEATS", "5"))
```

Note: add `import os` next to the other imports at line 16.
Lines 72-87: Optionally emit a machine-readable artifact and timestamp. Keeps the human summary and adds durability for future comparisons.

```diff
 def report(results):
     lines = []
     for name, times in results.items():
 @@
     print("\nBenchmark Results:")
     print("\n".join(lines))
     with open(RESULTS_FILE, "w") as f:
         f.write("\n".join(lines) + "\n")
+        # Optional: also write JSON for automation
+        # import json, datetime as dt
+        # with open(RESULTS_FILE.replace(".txt", ".json"), "w") as jf:
+        #     jf.write(json.dumps({"timestamp": dt.datetime.utcnow().isoformat() + "Z", "results": results}, indent=2))
     print(f"Results saved to {RESULTS_FILE}")
```

Happy to wire up a small JSON emitter if you want this in CI.
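If the optional JSON artifact is wanted, a standalone sketch of the emitter could look like the following; the function name and the `.txt` → `.json` naming are illustrative, mirroring the commented-out suggestion above rather than code in the PR.

```python
import datetime as dt
import json


def write_json_results(results, results_file="testing/benchmark_results.txt"):
    # results maps hook name -> list of run times in seconds.
    payload = {
        "timestamp": dt.datetime.utcnow().isoformat() + "Z",
        "results": results,
    }
    json_path = results_file.replace(".txt", ".json")
    with open(json_path, "w") as jf:
        json.dump(payload, jf, indent=2)
    return json_path
```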
Lines 89-96: Pre-flight checks for required tools. Fast-fail if git or pre-commit are missing.

```diff
 def main():
+    import shutil
+    for tool in ("git", "pre-commit"):
+        if shutil.which(tool) is None:
+            raise SystemExit(f"Required tool not found in PATH: {tool}")
     results = benchmark()
     report(results)
```
📜 Review details
📒 Files selected for processing (3)

- .gitignore (1 hunks)
- docs/benchmark.md (1 hunks)
- testing/benchmark_hooks.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/benchmark.md
[grammar] ~17-~17: There might be a mistake here.
Context: ...//github.com/TheAlgorithms/C-Plus-Plus). - Run benchmark with GitHub Actions for co...
(QB_NEW_EN)
🪛 Ruff (0.12.2)
testing/benchmark_hooks.py
1-1: Shebang is present but file is not executable (EXE001)
45-45: Consider iterable unpacking instead of concatenation (RUF005)
48-48: `subprocess` call: check for execution of untrusted input (S603)
63-63: `subprocess` call: check for execution of untrusted input (S603)
63-63: Consider `["git", "restore", *TARGET_FILES]` instead of concatenation (RUF005)
64-64: Starting a process with a partial executable path (S607)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.14)
🔇 Additional comments (1)
.gitignore (1)
Line 15: LGTM: ignore the benchmark results file. Good addition to keep generated artifacts out of VCS.
```markdown
## Results

The results of the benchmarking process will be saved to `testing/benchmark_results.txt`.
```
🛠️ Refactor suggestion
Call out cache-warming to ensure fair, stable timings.
Make it explicit that cold-cache runs will dominate timings unless caches are warmed.
```diff
 ## Results
-The results of the benchmarking process will be saved to `testing/benchmark_results.txt`.
+The results of the benchmarking process will be saved to `testing/benchmark_results.txt`.
+
+Note: For fair comparisons, warm pre-commit caches once per hook (do not clean between repeats). If you need cold-cache numbers, run a separate pass that cleans caches before the first run only.
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```markdown
## Results

The results of the benchmarking process will be saved to `testing/benchmark_results.txt`.

Note: For fair comparisons, warm pre-commit caches once per hook (do not clean between repeats). If you need cold-cache numbers, run a separate pass that cleans caches before the first run only.
```
🤖 Prompt for AI Agents
In docs/benchmark.md around lines 11 to 13, the Results section does not mention
cache-warming and thus can mislead readers because cold-cache runs will dominate
timings; update the text to call out cache-warming explicitly and prescribe a
procedure: run a number of warm-up iterations before timed runs (or document how
to clear caches for cold runs), run each benchmark multiple times, report both
cold and warm timings or the averaged warm timings, and note any
cache/state-reset steps used so results are fair and reproducible.
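Sketched in Python, the prescribed procedure might look like this. It assumes the `run_hook(config, files)` helper from the script under review is in scope; the helper name `timed_runs` and the single-cold-run policy are illustrative, following the reviewer's description rather than the current code.

```python
import statistics
import subprocess


def timed_runs(config, files, repeats=5):
    # Cold run: clean pre-commit caches once so the first timing includes setup cost.
    subprocess.run(["pre-commit", "clean"], check=False)
    cold = run_hook(config, files)  # run_hook comes from testing/benchmark_hooks.py
    # Warm runs: keep caches so repeated timings measure hook execution only.
    warm = [run_hook(config, files) for _ in range(repeats)]
    return {"cold": cold, "warm": warm, "warm_mean": statistics.mean(warm)}
```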
```python
def run_hook(config, files):
    cmd = ["pre-commit", "run", "--config", config, "--files"] + files
    start = time.perf_counter()
    try:
        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except subprocess.CalledProcessError:
        # Still record time even if hook fails
        pass
    end = time.perf_counter()
    return end - start
```
🛠️ Refactor suggestion
Harden subprocess call: use iterable unpacking, capture_output, and a timeout.
Prevents hangs from blocking hooks and addresses style nits.
```diff
 def run_hook(config, files):
-    cmd = ["pre-commit", "run", "--config", config, "--files"] + files
+    cmd = ["pre-commit", "run", "--config", config, "--files", *files]
     start = time.perf_counter()
     try:
-        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        subprocess.run(cmd, check=True, capture_output=True, timeout=300)
     except subprocess.CalledProcessError:
         # Still record time even if hook fails
         pass
+    except subprocess.TimeoutExpired:
+        # Record as a timeout-run; caller still gets elapsed wall time
+        pass
     end = time.perf_counter()
     return end - start
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def run_hook(config, files):
    cmd = ["pre-commit", "run", "--config", config, "--files", *files]
    start = time.perf_counter()
    try:
        subprocess.run(cmd, check=True, capture_output=True, timeout=300)
    except subprocess.CalledProcessError:
        # Still record time even if hook fails
        pass
    except subprocess.TimeoutExpired:
        # Record as a timeout-run; caller still gets elapsed wall time
        pass
    end = time.perf_counter()
    return end - start
```
🧰 Tools
🪛 Ruff (0.12.2)
45-45: Consider iterable unpacking instead of concatenation (RUF005)
48-48: `subprocess` call: check for execution of untrusted input (S603)
🤖 Prompt for AI Agents
In testing/benchmark_hooks.py around lines 44-54, the subprocess.run call should
be hardened: construct the command using iterable unpacking for files (e.g.,
["pre-commit","run","--config", config,"--files", *files]), call subprocess.run
with capture_output=True instead of stdout/stderr, and pass a sensible timeout
(e.g., timeout=300) to avoid hangs; catch both subprocess.CalledProcessError and
subprocess.TimeoutExpired (still recording elapsed time and returning it) so
failing or timed-out hooks don't block the benchmark.
testing/benchmark_hooks.py (Outdated)
```python
def benchmark():
    results = {}
    for hook in HOOKS:
        times = []
        print(f"Benchmarking {hook['name']}...")
        for i in range(REPEATS):
            # Clean up any changes before each run
            subprocess.run(["git", "restore"] + TARGET_FILES)
            subprocess.run(["pre-commit", "clean"])
            t = run_hook(hook["config"], TARGET_FILES)
            print(f"  Run {i + 1}: {t:.3f} seconds")
            times.append(t)
        results[hook["name"]] = times
    return results
```
🛠️ Refactor suggestion
Don’t clean caches on every repeat; warm once per hook and restore files correctly.
Cleaning inside the loop measures install time/network, not hook performance. Also restore both worktree and index to HEAD for deterministic state.
```diff
 def benchmark():
     results = {}
     for hook in HOOKS:
         times = []
         print(f"Benchmarking {hook['name']}...")
-        for i in range(REPEATS):
-            # Clean up any changes before each run
-            subprocess.run(["git", "restore"] + TARGET_FILES)
-            subprocess.run(["pre-commit", "clean"])
-            t = run_hook(hook["config"], TARGET_FILES)
+        # Ensure no cross-hook cache effects, but keep caches across repeats
+        subprocess.run(["pre-commit", "clean"], check=False)
+        # Warm caches once (untimed)
+        print("  Warming caches (not timed)...")
+        _ = run_hook(hook["config"], TARGET_FILES)
+        for i in range(REPEATS):
+            # Reset files to a known clean state before each timed run
+            subprocess.run(
+                ["git", "restore", "--staged", "--worktree", "--source", "HEAD", "--", *TARGET_FILES],
+                check=False,
+            )
+            t = run_hook(hook["config"], TARGET_FILES)
             print(f"  Run {i + 1}: {t:.3f} seconds")
             times.append(t)
         results[hook["name"]] = times
     return results
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def benchmark():
    results = {}
    for hook in HOOKS:
        times = []
        print(f"Benchmarking {hook['name']}...")
        # Ensure no cross-hook cache effects, but keep caches across repeats
        subprocess.run(["pre-commit", "clean"], check=False)
        # Warm caches once (untimed)
        print("  Warming caches (not timed)...")
        _ = run_hook(hook["config"], TARGET_FILES)
        for i in range(REPEATS):
            # Reset files to a known clean state before each timed run
            subprocess.run(
                ["git", "restore", "--staged", "--worktree", "--source", "HEAD", "--", *TARGET_FILES],
                check=False,
            )
            t = run_hook(hook["config"], TARGET_FILES)
            print(f"  Run {i + 1}: {t:.3f} seconds")
            times.append(t)
        results[hook["name"]] = times
    return results
```
🧰 Tools
🪛 Ruff (0.12.2)
63-63: `subprocess` call: check for execution of untrusted input (S603)
63-63: Consider `["git", "restore", *TARGET_FILES]` instead of concatenation (RUF005)
64-64: Starting a process with a partial executable path (S607)
🤖 Prompt for AI Agents
In testing/benchmark_hooks.py around lines 56–70: move cache cleaning/warmup out
of the per-run loop and ensure you restore both worktree and index to HEAD for
deterministic state; specifically, for each hook run pre-commit clean once and
perform a single “warm” invocation (to avoid measuring install/network time),
then inside the REPEATS loop restore the repository to HEAD (resetting both
index and worktree, e.g. git reset --hard HEAD) before timing each run; remove
the per-iteration pre-commit clean so only the warming step cleans caches once
per hook.
Added CodSpeed badge to benchmarking documentation.
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/benchmark.md (1)
Lines 13-16: Document cache-warming to avoid misleading timings. Cold-cache runs dominate wall time; prescribe how to warm caches and how to collect cold vs warm numbers.

```diff
 ## Results

 The results of the benchmarking process will be saved to `testing/benchmark_results.txt`.
+
+Note: For fair comparisons, warm pre-commit caches before timed runs and do not clean caches between repeats. If you need cold-cache numbers, do a separate pass that cleans caches only before the first run, then report both cold and warm (or averaged warm) timings. Record environment details (CPU, OS, pre-commit and clang-format versions) alongside results for reproducibility.
```
🧹 Nitpick comments (3)
docs/benchmark.md (3)
Lines 7-11: Add a brief "Prerequisites" section to make the benchmark reproducible. Call out required tools and versions so users don't hit missing deps or version drift.

```diff
+## Prerequisites
+
+- Python 3.8+.
+- pre-commit installed and on PATH (e.g., `pip install pre-commit`).
+- clang-format installed and on PATH (pin the version you want to compare).
+- Run from the repo root.
+
 ## Running the Benchmark

 python3 testing/benchmark_hooks.py
```
Lines 17-20: Grammar/style nit in To Do item. Use plural "benchmarks" and "in" for smoother phrasing.

```diff
-- Run benchmark with GitHub Actions for continuous integration.
+- Run benchmarks in GitHub Actions for continuous integration.
```
Line 3: Optional: Clarify the badge's scope vs this local benchmark. One sentence noting that CodSpeed tracks CI micro/benchmarks while this script measures local hook runtime will reduce confusion.

```diff
 [](https://codspeed.io/cpp-linter/cpp-linter-hooks)
+
+The CodSpeed badge reflects CI-side performance metrics; the script below measures local pre-commit hook runtime on your machine.
```
📜 Review details
📒 Files selected for processing (1)
- docs/benchmark.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/benchmark.md
[grammar] ~19-~19: There might be a mistake here.
Context: ...//github.com/TheAlgorithms/C-Plus-Plus). - Run benchmark with GitHub Actions for co...
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.14)
dd0f9d0 to c21a796
Actionable comments posted: 2
♻️ Duplicate comments (3)
testing/benchmark_hooks.py (3)
Lines 56-66: Harden subprocess call; avoid list concatenation. Unpack files, capture output to prevent blocking, and add a timeout so blocked hooks don't stall the benchmark.

```diff
 def run_hook(config, files):
-    cmd = ["pre-commit", "run", "--config", config, "--files"] + files
+    cmd = ["pre-commit", "run", "--config", config, "--files", *files]
     start = time.perf_counter()
     try:
-        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-    except subprocess.CalledProcessError:
+        subprocess.run(cmd, check=True, capture_output=True, timeout=300)
+    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
         # Still record time even if hook fails
         pass
     end = time.perf_counter()
     return end - start
```
Lines 83-95: Warm caches once per hook; don't clean on every iteration; restore files before each timed run. Cleaning per-iteration measures install/network, not hook performance, and the current restore is ineffective (fixed above). Warm once, then time repeated runs from a clean worktree.

```diff
 def benchmark():
     results = {}
     for hook in HOOKS:
         times = []
         print(f"Benchmarking {hook['name']}...")
-        for i in range(REPEATS):
-            safe_git_restore(TARGET_FILES)
-            subprocess.run(["pre-commit", "clean"])
-            t = run_hook(hook["config"], TARGET_FILES)
+        # Ensure no cross-hook cache effects, but keep caches across repeats
+        subprocess.run(["pre-commit", "clean"], check=False)
+        # Warm caches once (not timed)
+        _ = run_hook(hook["config"], TARGET_FILES)
+        for i in range(REPEATS):
+            safe_git_restore(TARGET_FILES)
+            t = run_hook(hook["config"], TARGET_FILES)
             print(f"  Run {i + 1}: {t:.3f} seconds")
             times.append(t)
         results[hook["name"]] = times
     return results
```
Lines 32-36: Fix file discovery (non-recursive pattern), sort deterministically, and fail fast when empty. The current pattern `testing/test-examples/*.c` only matches top-level files; most examples live in nested dirs, so they're skipped. Also add deterministic ordering and a clear error if cloning fails or no files are found.

```diff
-# Automatically find all C/C++ files in testing/ (and optionally src/, include/)
-TARGET_FILES = glob.glob("testing/test-examples/*.c", recursive=True)
+# Automatically find all C files under testing/test-examples/
+TARGET_FILES = sorted(glob.glob("testing/test-examples/**/*.c", recursive=True))
+if not TARGET_FILES:
+    raise SystemExit(
+        "No C target files found under testing/test-examples/. Did git_clone() succeed?"
+    )
```
🧹 Nitpick comments (5)
.pre-commit-config.yaml (1)
Lines 19-21: Consider failing when Ruff applies fixes to avoid committing auto-mutated files. Add `--exit-non-zero-on-fix` so a fixing run stops the commit and prompts a rerun. Keeps CI/local behavior explicit.

```diff
-  - id: ruff
-    args: [--fix]
+  - id: ruff
+    args: [--fix, --exit-non-zero-on-fix]
```

testing/benchmark_hooks.py (5)
Line 1: Shebang vs. invocation and docstring accuracy. Either make the file executable or drop the shebang; also update the "Target files" note to match the code (the script discovers testing/test-examples/**/*.c).

```diff
-#!/usr/bin/env python3
 """
 Benchmark script to compare performance of cpp-linter-hooks vs mirrors-clang-format.

 Usage:
-    python benchmark_hooks.py
+    python3 testing/benchmark_hooks.py

 Requirements:
 - pre-commit must be installed and available in PATH
 - Two config files:
   - testing/pre-commit-config-cpp-linter-hooks.yaml
   - testing/pre-commit-config-mirrors-clang-format.yaml
-- Target files: testing/main.c (or adjust as needed)
+- Target files: auto-discovered under testing/test-examples/**/*.c
 """
```
Lines 16-20: Import Path for later filesystem checks. Prepare for path-aware operations (used below).

```diff
 import subprocess
 import time
 import statistics
 import glob
+from pathlib import Path
```
Lines 39-54: Make cloning idempotent and more robust. Skip cloning when the destination repo already exists; create parent dirs; optionally capture output to avoid noisy logs.

```diff
-def git_clone():
-    try:
-        subprocess.run(
-            [
-                "git",
-                "clone",
-                "--depth",
-                "1",
-                "https://github.com/gouravthakur39/beginners-C-program-examples.git",
-                "testing/test-examples",
-            ],
-            check=True,
-        )
-    except subprocess.CalledProcessError:
-        pass
+def git_clone():
+    dest = Path("testing/test-examples")
+    if (dest / ".git").exists():
+        return
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    try:
+        subprocess.run(
+            [
+                "git",
+                "clone",
+                "--depth",
+                "1",
+                "https://github.com/gouravthakur39/beginners-C-program-examples.git",
+                str(dest),
+            ],
+            check=True,
+            capture_output=True,
+            timeout=300,
+        )
+    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
+        # Let later TARGET_FILES check fail fast with a clear message
+        pass
```
Lines 98-106: Quiet the B007 warning and simplify. Loop only over names to size the first column; the times binding was unused.

```diff
-    for name, times in results.items():
+    for name in results:
         col_widths[0] = max(col_widths[0], len(name))
```
📜 Review details
📒 Files selected for processing (3)
- .gitignore (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- testing/benchmark_hooks.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
🧰 Additional context used
🪛 Ruff (0.12.2)
testing/benchmark_hooks.py
1-1: Shebang is present but file is not executable (EXE001)
42-49: Starting a process with a partial executable path (S607)
57-57: Consider iterable unpacking instead of concatenation (RUF005)
60-60: `subprocess` call: check for execution of untrusted input (S603)
72-72: `subprocess` call: check for execution of untrusted input (S603)
73-73: Starting a process with a partial executable path (S607)
80-80: `subprocess` call: check for execution of untrusted input (S603)
80-80: Consider `["git", "restore", *tracked]` instead of concatenation (RUF005)
90-90: Starting a process with a partial executable path (S607)
102-102: Loop control variable `times` not used within loop body (B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.14)
testing/benchmark_hooks.py (Outdated)
```python
HOOKS = [
    {
        "name": "cpp-linter-hooks",
        "config": "testing/pre-commit-config-cpp-linter-hooks.yaml",
    },
    {
        "name": "mirrors-clang-format",
        "config": "testing/pre-commit-config-mirrors-clang-format.yaml",
    },
]
```
💡 Verification agent
🧩 Analysis chain
Verify configs exist before running.
A missing testing/pre-commit-config-*.yaml will fail late. Add a preflight check and exit with a clear message.
Proposed check (add near HOOKS or in main):
```python
for cfg in (h["config"] for h in HOOKS):
    if not Path(cfg).is_file():
        raise SystemExit(f"Missing pre-commit config: {cfg}")
```
To verify in your branch:
🏁 Script executed:
```bash
#!/bin/bash
set -e
test -f testing/pre-commit-config-cpp-linter-hooks.yaml && echo "OK: cpp-linter-hooks config present"
test -f testing/pre-commit-config-mirrors-clang-format.yaml && echo "OK: mirrors-clang-format config present"
```
Length of output: 128
🏁 Script executed:
```bash
#!/bin/bash
ls -la testing
```
Length of output: 910
Add preflight check and fix config paths
`HOOKS` lists `testing/pre-commit-config-cpp-linter-hooks.yaml` and `testing/pre-commit-config-mirrors-clang-format.yaml`, but neither file exists under `testing/`. Add the missing files or update `HOOKS` to reference the correct `pre-commit-config*.yaml` filenames.

Before loading, validate each `config` path and exit with a clear error if it's missing, e.g.:

```python
for cfg in (h["config"] for h in HOOKS):
    if not Path(cfg).is_file():
        raise SystemExit(f"Missing pre-commit config: {cfg}")
```
🤖 Prompt for AI Agents
testing/benchmark_hooks.py lines 21-30: HOOKS contains config paths that don't
exist under testing/ and there's no preflight validation; either correct the
filenames in HOOKS to the actual pre-commit-config*.yaml locations (or add the
missing files into testing/), and add a preflight check before using HOOKS that
iterates over each h["config"], uses pathlib.Path(cfg).is_file() to verify
existence, and raises SystemExit(f"Missing pre-commit config: {cfg}") (ensure
pathlib.Path is imported if not already).
```python
def safe_git_restore(files):
    # Only restore files tracked by git
    tracked = []
    for f in files:
        result = subprocess.run(
            ["git", "ls-files", "--error-unmatch", f],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        if result.returncode == 0:
            tracked.append(f)
    if tracked:
        subprocess.run(["git", "restore"] + tracked)
```
Restoring via top-level Git misses cloned examples (nested repo).
Files in testing/test-examples are not tracked by the top-level repo, so no restore happens and runs are not independent. Reset the nested repo to HEAD between runs.
```diff
-def safe_git_restore(files):
-    # Only restore files tracked by git
-    tracked = []
-    for f in files:
-        result = subprocess.run(
-            ["git", "ls-files", "--error-unmatch", f],
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
-        )
-        if result.returncode == 0:
-            tracked.append(f)
-    if tracked:
-        subprocess.run(["git", "restore"] + tracked)
+def safe_git_restore(_files):
+    # Reset the cloned examples repository (nested Git repo) to a clean state.
+    examples_repo = Path("testing/test-examples")
+    if (examples_repo / ".git").exists():
+        subprocess.run(
+            ["git", "-C", str(examples_repo), "reset", "--hard", "HEAD"],
+            check=False,
+            capture_output=True,
+        )
```
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
72-72: `subprocess` call: check for execution of untrusted input (S603)
73-73: Starting a process with a partial executable path (S607)
80-80: `subprocess` call: check for execution of untrusted input (S603)
80-80: Consider `["git", "restore", *tracked]` instead of concatenation (RUF005)
🤖 Prompt for AI Agents
testing/benchmark_hooks.py around lines 68-81: the current safe_git_restore only
checks top-level git and misses files inside nested/cloned repos
(testing/test-examples), so those examples aren't reset between runs; update the
function to detect files that are not tracked by the top-level repo, locate
their nearest enclosing git repository root (walk up parents for a .git
directory), and for each distinct nested repo run a git reset --hard HEAD (and
optionally git clean -fd) via subprocess (using git -C <repo_root> ...) to
guarantee the nested repo is reset between runs before restoring top-level
tracked files.
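A hedged sketch of that "nearest enclosing repository" logic; the helper name and the `git clean -fd` step are illustrative, not part of the PR.

```python
import subprocess
from pathlib import Path


def reset_enclosing_repos(files):
    # Find the nearest ancestor containing .git for each file, then reset each
    # distinct repository once so nested clones return to a clean state.
    roots = set()
    for f in files:
        for parent in Path(f).resolve().parents:
            if (parent / ".git").exists():
                roots.add(parent)
                break
    for root in roots:
        subprocess.run(["git", "-C", str(root), "reset", "--hard", "HEAD"], check=False)
        subprocess.run(["git", "-C", str(root), "clean", "-fd"], check=False)
```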
Actionable comments posted: 0
♻️ Duplicate comments (5)
testing/benchmark_hooks.py (5)
Lines 32-36: Critical: TARGET_FILES is computed before cloning, so it is always empty. Since git_clone() runs later, TARGET_FILES is empty and the benchmark runs with no files. Discover files after cloning, sort for determinism, and fail fast if none.

```diff
-# Automatically find all C/C++ files in testing/ (and optionally src/, include/)
-TARGET_FILES = glob.glob("testing/test-examples/*.c", recursive=True)
-
-REPEATS = 5
-RESULTS_FILE = "testing/benchmark_results.txt"
+def discover_target_files():
+    files = sorted(glob.glob("testing/test-examples/**/*.c", recursive=True))
+    if not files:
+        raise SystemExit("No C target files found under testing/test-examples/.")
+    return files
+
+REPEATS = 5
+RESULTS_FILE = "testing/benchmark_results.txt"
```
Lines 56-66: Harden subprocess call and avoid hangs. Use iterable unpacking, capture_output, and a timeout; also handle timeouts.

```diff
 def run_hook(config, files):
-    cmd = ["pre-commit", "run", "--config", config, "--files"] + files
+    cmd = ["pre-commit", "run", "--config", config, "--files", *files]
     start = time.perf_counter()
     try:
-        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        subprocess.run(cmd, check=True, capture_output=True, timeout=300)
     except subprocess.CalledProcessError:
         # Still record time even if hook fails
         pass
+    except subprocess.TimeoutExpired:
+        # Record elapsed time even if the hook times out
+        pass
     end = time.perf_counter()
     return end - start
```
Lines 68-81: Restore the nested repo and use safe argument handling. Top-level git restore won't touch testing/test-examples (a nested repo). Reset the nested repo between runs; also use unpacking and capture_output.

```diff
-def safe_git_restore(files):
-    # Only restore files tracked by git
-    tracked = []
-    for f in files:
-        result = subprocess.run(
-            ["git", "ls-files", "--error-unmatch", f],
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
-        )
-        if result.returncode == 0:
-            tracked.append(f)
-    if tracked:
-        subprocess.run(["git", "restore"] + tracked)
+def safe_git_restore(files):
+    # Reset the cloned examples repository (nested Git repo)
+    examples_repo = Path("testing/test-examples")
+    if (examples_repo / ".git").exists():
+        subprocess.run(
+            ["git", "-C", str(examples_repo), "reset", "--hard", "HEAD"],
+            check=False,
+            capture_output=True,
+        )
+    # Restore any files tracked by the top-level repo (if present)
+    tracked: list[str] = []
+    for f in files:
+        result = subprocess.run(
+            ["git", "ls-files", "--error-unmatch", f],
+            capture_output=True,
+            check=False,
+        )
+        if result.returncode == 0:
+            tracked.append(f)
+    if tracked:
+        subprocess.run(["git", "restore", "--", *tracked], check=False)
```
Lines 83-95: Benchmark methodology: don't clean caches each run; warm once per hook. Cleaning per-iteration measures install/network time, not hook performance. Clean once per hook, do an untimed warm-up, then time repeats with deterministic restores.

```diff
-def benchmark():
+def benchmark(target_files):
     results = {}
     for hook in HOOKS:
         times = []
         print(f"\nBenchmarking {hook['name']}...")
-        for i in range(REPEATS):
-            safe_git_restore(TARGET_FILES)
-            subprocess.run(["pre-commit", "clean"])
-            t = run_hook(hook["config"], TARGET_FILES)
+        # Ensure no cross-hook cache effects, but keep caches across repeats
+        subprocess.run(["pre-commit", "clean"], check=False)
+        print("  Warming caches (not timed)...")
+        _ = run_hook(hook["config"], target_files)
+        for i in range(REPEATS):
+            safe_git_restore(target_files)
+            t = run_hook(hook["config"], target_files)
             print(f"  Run {i + 1}: {t:.3f} seconds")
             times.append(t)
         results[hook["name"]] = times
     return results
```
Lines 135-138: Preflight config check and discover files after clone; pass them into benchmark. Validate configs exist, then discover targets post-clone.

```diff
 def main():
     git_clone()
-    results = benchmark()
+    # Verify configs exist
+    for cfg in (h["config"] for h in HOOKS):
+        if not Path(cfg).is_file():
+            raise SystemExit(f"Missing pre-commit config: {cfg}")
+    target_files = discover_target_files()
+    results = benchmark(target_files)
     report(results)
```
🧹 Nitpick comments (5)
testing/benchmark_hook_2.yaml (1)
Lines 1-5: Verify rev and normalize YAML style.

- Please confirm that tag v21.1.0 exists for mirrors-clang-format; if not, pin to an existing tag.
- Minor style nits: align list indentation and use a single space after "-".

Apply for consistency:

```diff
-repos:
-  - repo: https://github.com/pre-commit/mirrors-clang-format
-    rev: v21.1.0
-    hooks:
-      - id: clang-format
+repos:
+- repo: https://github.com/pre-commit/mirrors-clang-format
+  rev: v21.1.0
+  hooks:
+  - id: clang-format
```

testing/benchmark_hooks.py (4)
Line 1: Shebang vs. execution bit. Either make the file executable (chmod +x) or drop the shebang to silence EXE001.
Lines 2-14: Docstring is out of sync with file paths and targets. The usage path and config filenames don't match the code; the target description is outdated.

```diff
-Usage:
-    python benchmark_hooks.py
+Usage:
+    python3 testing/benchmark_hooks.py
@@
-Requirements:
-- pre-commit must be installed and available in PATH
-- Two config files:
-  - testing/pre-commit-config-cpp-linter-hooks.yaml
-  - testing/pre-commit-config-mirrors-clang-format.yaml
-- Target files: testing/main.c (or adjust as needed)
+Requirements:
+- pre-commit installed and on PATH
+- Config files:
+  - testing/benchmark_hook_1.yaml
+  - testing/benchmark_hook_2.yaml
+- Target files: cloned into testing/test-examples/**/*.c by git_clone()
```
Lines 16-20: Import Path for filesystem checks (used in fixes below). Add pathlib.Path to support config/result path checks.

```diff
 import subprocess
 import time
 import statistics
 import glob
+from pathlib import Path
```
Lines 127-132: Ensure the results directory exists before writing. Prevents failures if testing/ is missing.

```diff
-    with open(RESULTS_FILE, "w") as f:
+    Path(RESULTS_FILE).parent.mkdir(parents=True, exist_ok=True)
+    with open(RESULTS_FILE, "w") as f:
```
📜 Review details
📒 Files selected for processing (4)
- .gitignore (1 hunks)
- testing/benchmark_hook_1.yaml (1 hunks)
- testing/benchmark_hook_2.yaml (1 hunks)
- testing/benchmark_hooks.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- testing/benchmark_hook_1.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
🧰 Additional context used
🪛 Ruff (0.12.2)
testing/benchmark_hooks.py
1-1: Shebang is present but file is not executable (EXE001)
42-49: Starting a process with a partial executable path (S607)
57-57: Consider iterable unpacking instead of concatenation (RUF005)
60-60: `subprocess` call: check for execution of untrusted input (S603)
72-72: `subprocess` call: check for execution of untrusted input (S603)
73-73: Starting a process with a partial executable path (S607)
80-80: `subprocess` call: check for execution of untrusted input (S603)
80-80: Consider `["git", "restore", *tracked]` instead of concatenation (RUF005)
90-90: Starting a process with a partial executable path (S607)
102-102: Loop control variable `times` not used within loop body (B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (3.14)
- GitHub Check: test (3.12)
Actionable comments posted: 2
♻️ Duplicate comments (5)
testing/benchmark_hooks.py (5)
Lines 33-35: Discover files recursively, sort for determinism, and fail fast when empty.

```diff
-# Automatically find all C/C++ files in testing/ (and optionally src/, include/)
-TARGET_FILES = glob.glob("testing/test-examples/*.c", recursive=True)
+# Automatically find all C/C++ files under the cloned examples (recursive) and stabilize order
+TARGET_FILES = sorted(
+    glob.glob("testing/test-examples/**/*.c", recursive=True)
+    + glob.glob("testing/test-examples/**/*.cpp", recursive=True)
+    + glob.glob("testing/test-examples/**/*.h", recursive=True)
+    + glob.glob("testing/test-examples/**/*.hpp", recursive=True)
+)
+if not TARGET_FILES:
+    raise SystemExit("No C/C++ target files found under testing/test-examples/.")
```
Lines 57-66: Harden the subprocess call; prevent hangs and simplify args.

```diff
 def run_hook(config, files):
-    cmd = ["pre-commit", "run", "--config", config, "--files"] + files
+    cmd = ["pre-commit", "run", "--config", config, "--files", *files]
     start = time.perf_counter()
     try:
-        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        subprocess.run(cmd, check=True, capture_output=True, timeout=300)
     except subprocess.CalledProcessError:
         # Still record time even if hook fails
         pass
+    except subprocess.TimeoutExpired:
+        # Record time even if a hook hangs/times out
+        pass
     end = time.perf_counter()
     return end - start
```
Lines 69-82: Restore the nested cloned repo, not the top-level worktree. The target files live in testing/test-examples (a separate Git repo), so the current restore is a no-op.

```diff
-def safe_git_restore(files):
-    # Only restore files tracked by git
-    tracked = []
-    for f in files:
-        result = subprocess.run(
-            ["git", "ls-files", "--error-unmatch", f],
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
-        )
-        if result.returncode == 0:
-            tracked.append(f)
-    if tracked:
-        subprocess.run(["git", "restore"] + tracked)
+def safe_git_restore(_files):
+    """Reset the cloned examples repo so each run starts from the same state."""
+    examples_repo = Path("testing/test-examples")
+    if (examples_repo / ".git").exists():
+        subprocess.run(
+            ["git", "-C", str(examples_repo), "reset", "--hard", "HEAD"],
+            check=False,
+            capture_output=True,
+        )
+        subprocess.run(
+            ["git", "-C", str(examples_repo), "clean", "-fd"],
+            check=False,
+            capture_output=True,
+        )
```
Lines 84-96: Don't clean caches each repeat; warm once per hook and then time runs.

```diff
 def benchmark():
     results = {}
     for hook in HOOKS:
         times = []
         print(f"Benchmarking {hook['name']}...")
-        for i in range(REPEATS):
-            safe_git_restore(TARGET_FILES)
-            subprocess.run(["pre-commit", "clean"])
-            t = run_hook(hook["config"], TARGET_FILES)
+        # Ensure no cross-hook cache effects, but keep caches across repeats
+        subprocess.run(["pre-commit", "clean"], check=False)
+        # Warm caches once (untimed) so subsequent runs measure hook execution, not installation
+        _ = run_hook(hook["config"], TARGET_FILES)
+        for i in range(REPEATS):
+            safe_git_restore(TARGET_FILES)
+            t = run_hook(hook["config"], TARGET_FILES)
             print(f"  Run {i + 1}: {t:.3f} seconds")
             times.append(t)
         results[hook["name"]] = times
     return results
```
Lines 147-150: Preflight: verify config files exist before running.

```diff
 def main():
-    git_clone()
-    results = benchmark()
+    # Preflight: ensure referenced configs exist
+    for cfg in (h["config"] for h in HOOKS):
+        if not Path(cfg).is_file():
+            raise SystemExit(f"Missing pre-commit config: {cfg}")
+    git_clone()
+    results = benchmark()
     report(results)
```
🧹 Nitpick comments (4)
testing/benchmark_hooks.py (4)
Lines 16-21: Import Path for the filesystem checks used below.

```diff
 import os
 import subprocess
 import time
 import statistics
 import glob
+from pathlib import Path
```
Lines 40-55: Skip the clone if the repo is already present. Avoids noisy failures and speeds reruns.

```diff
 def git_clone():
+    # Clone once; skip if already present
+    if (Path("testing/test-examples") / ".git").exists():
+        return
     try:
         subprocess.run(
             [
                 "git",
                 "clone",
                 "--depth",
                 "1",
                 "https://github.com/gouravthakur39/beginners-C-program-examples.git",
                 "testing/test-examples",
             ],
             check=True,
         )
     except subprocess.CalledProcessError:
         pass
```
Lines 99-105: Tighten the loop to avoid the unused-variable warning.

```diff
-    for name, times in results.items():
+    for name in results:
         col_widths[0] = max(col_widths[0], len(name))
```
Line 1: Shebang not needed (script is invoked via python). You can remove the shebang or mark the file executable. The current workflow calls it with python directly.

```diff
-#!/usr/bin/env python3
```
📜 Review details
📒 Files selected for processing (3)
- .github/workflows/benchmark.yml (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- testing/benchmark_hooks.py (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.7)
.github/workflows/benchmark.yml
11-11: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
14-14: the runner of "actions/setup-python@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 Ruff (0.12.2)
testing/benchmark_hooks.py
1-1: Shebang is present but file is not executable (EXE001)
43-50: Starting a process with a partial executable path (S607)
58-58: Consider iterable unpacking instead of concatenation (RUF005)
61-61: `subprocess` call: check for execution of untrusted input (S603)
73-73: `subprocess` call: check for execution of untrusted input (S603)
74-74: Starting a process with a partial executable path (S607)
81-81: `subprocess` call: check for execution of untrusted input (S603)
81-81: Consider `["git", "restore", *tracked]` instead of concatenation (RUF005)
91-91: Starting a process with a partial executable path (S607)
103-103: Loop control variable `times` not used within loop body (B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.14)
🔇 Additional comments (1)
.pre-commit-config.yaml (1)
Lines 15-17: Ruff --fix is fine; verify CI behavior. Enabling auto-fixes can cause pre-commit to modify files and exit non-zero on the first pass in CI (e.g., pre-commit run --all-files). Ensure your CI either commits fixes (pre-commit.ci) or treats modifications as a failure intentionally.
```yaml
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
```
Update Actions to supported versions; bump Python.
actionlint flags v2 runners as too old. Upgrade to current majors and avoid EOL Python 3.8.
Apply this diff:
```diff
-      - name: Checkout code
-        uses: actions/checkout@v2
+      - name: Checkout code
+        uses: actions/checkout@v4
@@
-      - name: Set up Python
-        uses: actions/setup-python@v2
+      - name: Set up Python
+        uses: actions/setup-python@v5
         with:
-          python-version: '3.8'
+          python-version: '3.11'
```
Also applies to: 14-16
🧰 Tools
🪛 actionlint (1.7.7)
11-11: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🤖 Prompt for AI Agents
In .github/workflows/benchmark.yml around lines 11 and 14-16, update deprecated
workflow action versions and the Python runner: replace uses:
actions/checkout@v2 with the current major (actions/checkout@v4) and update any
other actions pinned to v2/v3 to their current majors; also change the Python
version from 3.8 to a supported release (e.g., 3.11 or 3.10) in the jobs/steps
that set up Python. Ensure the workflow uses the new action majors consistently
and verify no breaking input changes are required by consulting each action's
release notes before committing.
```python
Usage:
    python benchmark_hooks.py

Requirements:
- pre-commit must be installed and available in PATH
- Two config files:
  - testing/pre-commit-config-cpp-linter-hooks.yaml
  - testing/pre-commit-config-mirrors-clang-format.yaml
- Target files: testing/main.c (or adjust as needed)
"""
```
Fix docstring: wrong paths and usage.
The guidance doesn’t match the actual file path and config filenames.
```diff
-Usage:
-    python benchmark_hooks.py
+Usage:
+    python testing/benchmark_hooks.py
@@
-- Two config files:
-  - testing/pre-commit-config-cpp-linter-hooks.yaml
-  - testing/pre-commit-config-mirrors-clang-format.yaml
-- Target files: testing/main.c (or adjust as needed)
+- Two config files:
+  - testing/benchmark_hook_1.yaml
+  - testing/benchmark_hook_2.yaml
+- Target files are auto-discovered under testing/test-examples/ after cloning.
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In testing/benchmark_hooks.py around lines 5 to 14, the module docstring
contains incorrect usage text and wrong config file paths/names; update the
docstring to show the actual invocation to run this script (reflect whether it
should be run as python testing/benchmark_hooks.py or python -m
testing.benchmark_hooks), replace the two listed config filenames and their
paths with the real config files present in the repo, and correct the example
target file path (or note it is adjustable) so the docstring accurately matches
the repository layout and how to run the script.
Summary by CodeRabbit
New Features
Documentation
Chores