Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4eeb5928a5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| [ | ||
| @"(^|/)(test|tests)/", | ||
| @"[Tt]ests?\.cs$", | ||
| @"[Ss]pec[^/]*\.cs$" |
There was a problem hiding this comment.
Restrict spec-file inclusion regex to real test names
The inclusion regex "[Ss]pec[^/]*\.cs$" matches any filename containing spec, so non-test files like src/Inspector.cs or src/Aspect.cs are treated as test files and can incorrectly trigger DET-TEST-* rules. Because this inclusion list is reused for test-only signals, this introduces systematic false positives in production code paths; anchor the pattern to actual test naming conventions (for example, basename ending in Spec/Specs) instead of substring matching.
Useful? React with 👍 / 👎.
| WhyItMatters: "Plain HTTP transmits data without encryption; an accidental http:// in a configuration or request can silently downgrade a previously TLS-protected call.", | ||
| SuggestedAction: "Replace http:// with https:// or move the URL to configuration and enforce TLS at the transport layer.", | ||
| Kind: DeterministicSignalKind.AddedLineRegex, | ||
| Pattern: @"[""']http://(?!localhost\b|127\.0\.0\.1\b)[^""'\s]{3,}[""']", |
There was a problem hiding this comment.
Narrow localhost exemption in insecure-http detector
The DET-SEC-012 pattern exempts localhost using localhost\b, which also excludes hosts like http://localhost.example.com (the . creates a word boundary). Those are not loopback URLs, so they should still be flagged as insecure http:// usage; with the current regex this class of downgrade is silently missed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR expands GauntletCI’s deterministic change-risk detection (including new added-line and removed-line regex signals) and introduces an experimental benchmark evidence generator that evaluates the curated fixture corpus and exports reproducible metrics as JSON/CSV, with CI automation to produce artifacts.
Changes:
- Extend
DeterministicAnalysisRunnerto support removed-line regex signals, add file-path inclusion filters, and add multiple new deterministic signal specs with corresponding unit tests. - Add
GauntletCI.BenchmarkReporter(and tests) to generate benchmark evidence reports (latest.json/latest.csv) from curated fixtures. - Add a scheduled/manual GitHub Actions workflow to build/run the reporter and upload evidence artifacts; update solution + README to document the evidence system.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/GauntletCI.Core/Evaluation/DeterministicAnalysisRunner.cs |
Adds removed-line signal scanning + new signal specs and filtering logic. |
tests/GauntletCI.Core.Tests/DeterministicAnalysisRunnerTests.cs |
Adds coverage for the newly introduced deterministic signals. |
src/GauntletCI.BenchmarkReporter/Program.cs |
Implements the benchmark reporter CLI, report building, and JSON/CSV formatting. |
src/GauntletCI.BenchmarkReporter/GauntletCI.BenchmarkReporter.csproj |
New reporter project wired to core + test visibility. |
tests/GauntletCI.BenchmarkReporter.Tests/* |
New test project validating CLI option parsing. |
.github/workflows/benchmark-evidence.yml |
Automates report generation and uploads evidence as workflow artifacts. |
docs/benchmarks/latest.json / docs/benchmarks/latest.csv |
Adds example/generated evidence outputs. |
README.md |
Documents the experimental evidence system and local usage. |
GauntletCI.slnx |
Adds new reporter + test projects to the solution. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (line.StartsWith("diff --git ", StringComparison.Ordinal)) | ||
| { | ||
| currentFile = ParseFilePath(line); | ||
| insideHunk = false; | ||
| currentNewLine = 0; | ||
| currentFileExcluded = signal.ExcludedFilePathPatterns.Any(exclusion => exclusion.IsMatch(currentFile)); | ||
| currentTrackingLine = 0; | ||
| currentFileExcluded = signal.ExcludedFilePathPatterns.Any(e => e.IsMatch(currentFile)); | ||
| currentFileIncluded = signal.IncludedFilePathPatterns.Count == 0 || | ||
| signal.IncludedFilePathPatterns.Any(e => e.IsMatch(currentFile)); | ||
| continue; |
There was a problem hiding this comment.
ParseFilePath relies on splitting the diff --git header on spaces and taking parts[3]. Git diffs can contain quoted/escaped paths with spaces, which will break this parsing and cause findings to be attributed to unknown (and inclusion/exclusion filters to misbehave). Consider parsing diff --git headers with a dedicated regex/state machine that supports quoted paths/escaped spaces, or otherwise extracting the b/<path> token robustly.
| case "--repo-root" when i + 1 < args.Length: | ||
| repoRoot = Path.GetFullPath(args[++i]); | ||
| break; | ||
| case "--fixtures-root" when i + 1 < args.Length: | ||
| fixturesRoot = Path.GetFullPath(args[++i]); | ||
| break; | ||
| case "--output-dir" when i + 1 < args.Length: | ||
| outputDirectory = Path.GetFullPath(args[++i]); | ||
| break; |
There was a problem hiding this comment.
--fixtures-root and --output-dir are normalized with Path.GetFullPath(...), which resolves relative paths against the current working directory, not --repo-root. This makes invocations like --repo-root /repo --output-dir docs/benchmarks behave unexpectedly when run outside the repo. Consider resolving relative option values against repoRoot (e.g., only call GetFullPath after combining with repoRoot when the provided path is not already absolute).
| - cron: "0 5 * * 1" | ||
|
|
||
| permissions: | ||
| contents: read |
There was a problem hiding this comment.
The workflow sets an explicit permissions block with only contents: read. When permissions are explicitly set, unspecified scopes default to none, and actions/upload-artifact typically needs actions: write to upload artifacts. Consider adding actions: write (and keeping contents: read) to avoid artifact upload failures under the restricted token.
| contents: read | |
| contents: read | |
| actions: write |
| if (line.StartsWith(otherPrefix) && !line.StartsWith(otherHeaderPrefix, StringComparison.Ordinal)) | ||
| { | ||
| // Other side's line — don't advance our tracking line number. | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Diffs can include marker lines like \ No newline at end of file within hunks. These lines are not real file content, but the current loop treats them as context and increments currentTrackingLine, which can shift reported line numbers for subsequent matches. Consider special-casing these \\-prefixed marker lines so they do not advance the tracked line number.
Extend DeterministicAnalysisRunner with two new capabilities: - RemovedLineRegex signal kind: scans removed diff lines (-) using the same pattern engine, enabling detection of safety regressions. - IncludedFilePathPatterns filter: optional allowlist so signals can be scoped to test files only (e.g. [Ignore]/[Skip], try-catch in tests). New AddedLine signals (17): DET-CONC-003 [ThreadStatic] attribute added (GCI016) DET-DISP-002 GC.SuppressFinalize without Dispose (GCI007) DET-PERF-004 MethodImplOptions.NoInlining added (GCI011) DET-API-004 sealed class/record added (GCI004) DET-TEST-002 [Ignore]/[Skip] in test files (GCI005) DET-TEST-003 try-catch inside test method (GCI005) DET-SEC-009 [SuppressUnmanagedCodeSecurity] added (GCI012) DET-SEC-011 [AllowAnonymous] added (GCI012) DET-SEC-012 http:// URL replacing https:// (GCI012) DET-SEC-013 Process.Start introduced (GCI012) DET-QUAL-001 [SuppressMessage] added (GCI013) DET-QUAL-002 #pragma warning disable added (GCI013) DET-QUAL-004 dynamic keyword introduced (GCI008) DET-QUAL-005 goto statement introduced (GCI008) DET-QUAL-006 [CLSCompliant(false)] added (GCI004) DET-INFRA-002 Timeout.Infinite introduced (GCI016) DET-INFRA-004 Environment.Exit/FailFast introduced (GCI003) New RemovedLine signals (18): DET-CONC-004 lock() removed (GCI016) DET-CONC-005 Parallel.ForEach removed (GCI016) DET-CONC-006 ConcurrentDictionary/Bag/Queue/Stack removed (GCI016) DET-CONC-007 ImmutableArray/List/Dictionary etc. removed (GCI016) DET-CONC-008 volatile keyword removed (GCI016) DET-INFRA-003 CancellationToken removed (GCI016) DET-MUTAT-001 readonly keyword removed (GCI016) DET-REL-004 finally block removed (GCI007) DET-REL-005 retry/Polly policy removed (GCI007) DET-INFRA-001 circuit breaker removed (GCI007) DET-DISP-001 using statement removed (GCI007) DET-PERF-003 ArrayPool usage removed (GCI011) DET-API-002 public member removed (GCI004) DET-API-003 [Obsolete] attribute removed (GCI004) DET-TEST-001 [Fact]/[Test] method removed (GCI005, test files only) DET-SEC-008 [Authorize] attribute removed (GCI012) DET-SEC-010 [SecurityCritical] attribute removed (GCI012) DET-QUAL-003 #nullable enable removed (GCI006) Also fix IsCommentLike to not suppress C# preprocessor directives (#pragma, #nullable, #if, #region, etc.) as comment-like lines. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add GauntletCI.BenchmarkReporter.Tests to GauntletCI.slnx so CI picks up the reporter parser tests (was silently excluded before) - Fix --repo-root and --fixtures-root help text to match actual behavior (walks up from CWD; fixtures default is repo-relative) - Add SPDX license header to BenchmarkReporter/Program.cs for consistency with all other src/ files - Add --no-build to dotnet run in benchmark-evidence.yml to avoid redundant recompilation after the explicit dotnet build step Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4eeb592 to
83de346
Compare
Fix 6 issues flagged by Copilot and Codex reviewers:
1. Spec-file inclusion regex false positives (Codex P2)
[Ss]pec[^/]*\.cs$ matched Inspector.cs, Aspect.cs etc. because
[^/]* allows arbitrary chars between 'spec' and '.cs'. Changed to
(^|/)[^/]*[Ss]pecs?\.cs$ so only basenames ending in Spec/Specs
are treated as test files. Applied to both AddedLine and
RemovedLine spec builders.
2. Localhost exemption too broad in DET-SEC-012 (Codex P2)
localhost\b treated localhost.example.com as loopback because \b
fires on the t/. boundary. Changed to localhost(?![.\w]) so the
exemption only fires when localhost is not followed by a dot or
word character.
3. ParseFilePath breaks on paths with spaces (Copilot)
Replaced space-split heuristic with a regex using a backreference
(diff --git a/(.+) b/\1) that correctly identifies the split point
regardless of spaces in paths. Falls back to LastIndexOf(' b/')
for renames/copies where the two paths differ.
4. \ No newline at end of file advances line counter (Copilot)
Lines starting with \ are git meta-markers, not real content.
Added an early continue so they no longer shift subsequent finding
line numbers.
5. actions/upload-artifact blocked by missing permission (Copilot)
Added actions: write to the benchmark-evidence.yml permissions
block so artifact upload does not fail under the restricted token.
6. --fixtures-root / --output-dir resolve against CWD not repo-root (Copilot)
Deferred Path.GetFullPath resolution until after all args are
parsed, then resolve relative paths against repoRoot instead of
the current working directory.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| !snippet.StartsWith("#else", StringComparison.OrdinalIgnoreCase) && | ||
| !snippet.StartsWith("#endif", StringComparison.OrdinalIgnoreCase) && | ||
| !snippet.StartsWith("#region", StringComparison.OrdinalIgnoreCase) && | ||
| !snippet.StartsWith("#endregion", StringComparison.OrdinalIgnoreCase) && | ||
| !snippet.StartsWith("#define", StringComparison.OrdinalIgnoreCase)) |
There was a problem hiding this comment.
IsCommentLike treats any #... line as a comment unless it matches a short allowlist of C# preprocessor directives. This will incorrectly classify valid directives like #elif, #undef, #error, #warning, and #line as comments, preventing signals from matching those lines and skewing line-number tracking.
Consider either expanding the allowlist to include the full set of common C# directives, or changing the heuristic to only treat # as comment-like for known non-C# file types (since many diffs here are C#).
| !snippet.StartsWith("#else", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#endif", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#region", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#endregion", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#define", StringComparison.OrdinalIgnoreCase)) | |
| !snippet.StartsWith("#elif", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#else", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#endif", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#region", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#endregion", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#define", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#undef", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#error", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#warning", StringComparison.OrdinalIgnoreCase) && | |
| !snippet.StartsWith("#line", StringComparison.OrdinalIgnoreCase)) |
| if (line.StartsWith(linePrefix) && !line.StartsWith(ownHeaderPrefix, StringComparison.Ordinal)) | ||
| { | ||
| string snippet = line[1..].Trim(); | ||
| if (!currentFileExcluded && | ||
| currentFileIncluded && | ||
| !IsCommentLike(snippet) && | ||
| !signal.ExcludedSnippetPatterns.Any(exclusion => exclusion.IsMatch(snippet)) && | ||
| !signal.ExcludedSnippetPatterns.Any(e => e.IsMatch(snippet)) && | ||
| signal.Pattern.IsMatch(snippet)) | ||
| { | ||
| return new AddedLineMatch(currentFile, currentNewLine, snippet.Length > 160 ? snippet[..160] : snippet); | ||
| return new LineMatch(currentFile, currentTrackingLine, snippet.Length > 160 ? snippet[..160] : snippet); | ||
| } | ||
|
|
||
| currentNewLine++; | ||
| currentTrackingLine++; | ||
| continue; | ||
| } | ||
|
|
||
| if (line.StartsWith('-') && !line.StartsWith("---", StringComparison.Ordinal)) | ||
| if (line.StartsWith(otherPrefix) && !line.StartsWith(otherHeaderPrefix, StringComparison.Ordinal)) | ||
| { |
There was a problem hiding this comment.
Header detection for diff file markers is currently !line.StartsWith("+++"/"---"), which will also exclude real added/removed content lines whose content begins with +++ or --- (e.g., ++++i in C#/C++ or string literals starting with those sequences). This can cause missed signal matches.
To avoid false negatives, make the header check more specific (e.g., match "+++ b/" / "--- a/" or "+++ " / "--- " prefixes) instead of any 3-character prefix.
This pull request introduces a new benchmarking and reporting system for GauntletCI, focused on generating, publishing, and automating reproducible benchmark evidence from the real-fixture corpus. It adds a new reporter tool, corresponding tests, workflow automation, and updates documentation and solution structure to support these features.
Benchmarking and Reporting System
GauntletCI.BenchmarkReporterfor generating benchmark evidence reports, including its project file (.csproj) and integration into the main solution (GauntletCI.slnx). [1] [2]GauntletCI.BenchmarkReporter.Testswith initial unit tests for command-line options parsing. [1] [2] [3]Automation and Artifacts
.github/workflows/benchmark-evidence.yml) to automate benchmark evidence generation and artifact upload, supporting both scheduled and manual runs.docs/benchmarks/latest.csvto the repository, demonstrating the format and content of generated benchmark evidence.Documentation
README.mdwith a new section describing the experimental evidence system, instructions for local report generation, artifact formats, and automation details.