Skip to content

Conversation

@zhenghao104
Copy link
Contributor

@zhenghao104 zhenghao104 commented Feb 3, 2026

MavenWithFallbackDetector

Overview

The MavenWithFallbackDetector is an experimental Maven detector that combines Maven CLI detection with static pom.xml parsing fallback. It provides resilient Maven dependency detection even when the Maven CLI fails (e.g., due to authentication errors with private repositories).

Key Features

  • Primary detection: Uses Maven CLI (mvn dependency:tree) for full transitive dependency resolution
  • Automatic fallback: Falls back to static pom.xml parsing when Maven CLI fails
  • Partial failure handling: Can use Maven CLI results for successful projects while falling back to static parsing for failed ones
  • Authentication error detection: Identifies and reports authentication failures with guidance
  • Nested pom.xml filtering: Optimizes multi-module projects by only processing root pom.xml files with Maven CLI
  • Telemetry: Reports detection method, fallback reasons, and component counts
  • Timeout protection: 5-minute internal timeout as a safety guardrail for the prepare detection phase

Detection Flow

┌─────────────────────────────────────────────────────────────────────────┐
│                        OnPrepareDetectionAsync                          │
│              (wrapped with 5-minute internal timeout)                   │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
                    ┌───────────────────────────────┐
                    │ Check CD_MAVEN_DISABLE_CLI    │
                    │ environment variable          │
                    └───────────────────────────────┘
                                    │
                    ┌───────────────┴───────────────┐
                    │                               │
                    ▼                               ▼
            [Disabled=true]                 [false/unset]
                    │                               │
                    ▼                               ▼
        ┌───────────────────┐           ┌───────────────────────┐
        │ Use Static Parser │           │ Check Maven CLI exists │
        │ for ALL pom.xml   │           └───────────────────────┘
        └───────────────────┘                       │
                                    ┌───────────────┴───────────────┐
                                    │                               │
                                    ▼                               ▼
                            [CLI not found]                 [CLI available]
                                    │                               │
                                    ▼                               ▼
                        ┌───────────────────┐       ┌───────────────────────────┐
                        │ Use Static Parser │       │ RemoveNestedPomXmls       │
                        │ for ALL pom.xml   │       │ (filter to root POMs only)│
                        └───────────────────┘       └───────────────────────────┘
                                                                │
                                                                ▼
                                                ┌───────────────────────────────┐
                                                │ Run mvn dependency:tree       │
                                                │ for each root pom.xml         │
                                                │ (sequentially via ActionBlock)│
                                                └───────────────────────────────┘
                                                                │
                                                                ▼
                                                ┌───────────────────────────────┐
                                                │ Read bcde.mvndeps files       │
                                                │ immediately after generation  │
                                                └───────────────────────────────┘
                                                                │
                                                                ▼
                                                ┌───────────────────────────────┐
                                                │ Check results per directory:  │
                                                │ - Success: deps file present  │
                                                │ - Failure: deps file missing  │
                                                └───────────────────────────────┘
                                                                │
                                ┌───────────────────────┬───────┴───────────────┐
                                │                       │                       │
                                ▼                       ▼                       ▼
                        [All succeeded]         [All failed]            [Partial]
                                │                       │                       │
                                ▼                       ▼                       ▼
                    ┌─────────────────┐     ┌─────────────────────┐ ┌─────────────────────┐
                    │ Return          │     │ AnalyzeMvnCliFailure│ │ AnalyzeMvnCliFailure│
                    │ bcde.mvndeps    │     │ Return ALL pom.xml  │ │ Return bcde.mvndeps │
                    │ only            │     │ in failed dirs for  │ │ + pom.xml in failed │
                    │                 │     │ static parsing      │ │ dirs for static     │
                    │ Method: MvnCli  │     │ Method: StaticOnly  │ │ Method: Mixed       │
                    └─────────────────┘     └─────────────────────┘ └─────────────────────┘

Core Components

1. Environment Variable Control

internal const string DisableMvnCliEnvVar = "CD_MAVEN_DISABLE_CLI";

CD_MAVEN_DISABLE_CLI:

  • Set to true to bypass Maven CLI entirely and use only static parsing

Useful when:

  • Maven CLI is known to fail in the environment
  • Faster scans are preferred (at the cost of transitive dependencies)
  • Debugging static parser behavior

2. Internal Timeout Protection

The detector has an internal 5-minute timeout (PrepareDetectionTimeout) as a safety guardrail:

private static readonly TimeSpan PrepareDetectionTimeout = TimeSpan.FromMinutes(5);

Why? The OnPrepareDetectionAsync phase doesn't have the same guardrails as OnFileFoundAsync. This timeout prevents hangs in the experimental detector. If the timeout is reached:

  1. The operation is cancelled
  2. Telemetry is recorded (TimedOut = true)
  3. Falls back to static pom.xml parsing

3. Cancellation Token Propagation

The detector properly propagates cancellation tokens through the entire call chain, combining the framework's cancellation token with the internal timeout:

┌───────────────────────────────────────────────────────────────────────────┐
│ Orchestrator (4-min experimental detector timeout CancellationToken)      │
└─────────────────────────────┬─────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────────────────────┐
│ MavenWithFallbackDetector.OnPrepareDetectionAsync                         │
│   └── CreateLinkedTokenSource(cancellationToken, 5-min internal timeout)  │
│       (Combines framework token with internal timeout - first wins)       │
└─────────────────────────────┬─────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────────────────────┐
│ MavenCommandService.GenerateDependenciesFileAsync()                       │
│   └── CancellationTokenSource.CreateLinkedTokenSource(cancellationToken)  │
│       (Links with optional file-level timeout from env var)               │
└─────────────────────────────┬─────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────────────────────┐
│ CommandLineInvocationService.ExecuteCommandAsync(cancellationToken)       │
│   └── cancellationToken.Register(() => process.Kill())                    │
│       (Kills Maven process if any token is cancelled)                     │
└───────────────────────────────────────────────────────────────────────────┘

Key points:

  • The 4-minute detector timeout from the orchestrator is combined with the 5-minute internal timeout
  • Whichever triggers first will cancel the operation
  • If cancelled, the Maven process is killed immediately via Process.Kill()
  • Additional file-level timeouts can be configured via MvnCLIFileLevelTimeoutSeconds environment variable

4. Sequential Maven CLI Execution

Maven CLI invocations are processed sequentially using ActionBlock with default settings:

// Process pom.xml files sequentially to match MvnCliComponentDetector behavior.
// Sequential execution avoids Maven local repository lock contention and
// reduces memory pressure from concurrent Maven JVM processes.
var processPomFile = new ActionBlock<ProcessRequest>(async processRequest => { ... });

Why sequential execution?

  • Avoids Maven local repository lock contention when multiple mvn processes access ~/.m2/repository simultaneously
  • Reduces memory pressure from concurrent Maven JVM processes (each can consume 256MB-1GB+)
  • Matches the behavior of MvnCliComponentDetector for consistent performance

5. RemoveNestedPomXmls

Filters pom.xml files to keep only root-level ones for Maven CLI processing.

Why? In multi-module Maven projects:

/project/
  pom.xml              ← ROOT (parent pom) - KEEP
  /module-a/
    pom.xml            ← NESTED (child) - SKIP
  /module-b/
    pom.xml            ← NESTED (child) - SKIP

Running mvn dependency:tree on the root pom automatically processes all child modules. Processing each nested pom separately would be redundant and slow.

Algorithm:

  1. Collect all pom.xml files and their directories
  2. Sort directories by path length (shortest first)
  3. For each pom.xml, check if any shorter path is a parent directory
  4. If a parent directory contains a pom.xml → skip this file (it's nested)
  5. If no parent pom.xml found → keep this file (it's a root)

6. Maven CLI Execution

Uses IMavenCommandService.GenerateDependenciesFileAsync() to run:

mvn dependency:tree -B -DoutputFile=bcde.mvndeps -DoutputType=text -f<pom.xml>

This generates a bcde.mvndeps file containing the full dependency tree.

Immediate file reading: Unlike MvnCliComponentDetector which scans for deps files after all CLI executions, this detector reads the deps file content immediately after each successful CLI execution. This enables per-file success/failure tracking.

7. Failure Detection

After each Maven CLI execution, the detector checks the result:

  • CLI succeeded AND deps file exists with content → Success, add deps file to results
  • CLI succeeded BUT deps file missing/empty → Treat as failure, log warning
  • CLI failed → Failure, capture error output for analysis

8. Static Parsing Fallback with Directory Optimization

When Maven CLI fails for a directory, the detector scans for all pom.xml files in that directory and its subdirectories:

private IEnumerable<ProcessRequest> GetAllPomFilesInDirectories(HashSet<string> directories)
{
    // Normalize directories once for efficient lookup
    var normalizedDirs = directories
        .Select(d => d.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar) + Path.DirectorySeparatorChar)
        .ToList();

    return this.ComponentStreamEnumerableFactory
        .GetComponentStreams(...)
        .Where(componentStream =>
        {
            var normalizedFileDir = fileDir.TrimEnd(...) + Path.DirectorySeparatorChar;
            // Pre-normalized comparison for efficiency
            return normalizedDirs.Any(fd =>
                normalizedFileDir.Equals(fd, StringComparison.OrdinalIgnoreCase) ||
                normalizedFileDir.StartsWith(fd, StringComparison.OrdinalIgnoreCase));
        })
        ...
}

Performance optimization: Directory paths are normalized once upfront, then reused for all file comparisons, avoiding redundant string operations.

9. Static Parsing Limitations

When falling back to static parsing, dependencies are extracted directly from pom.xml:

<dependencies>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.12.0</version>
    </dependency>
</dependencies>

Limitations of static parsing:

  • Only detects direct dependencies (no transitive)
  • Cannot resolve version ranges (e.g., [1.0,2.0))
  • May not resolve all property references (e.g., ${project.version})

10. Authentication Error Analysis

When Maven CLI fails, the detector analyzes error patterns:

private static readonly string[] AuthErrorPatterns =
[
    "401",
    "403",
    "Unauthorized",
    "Access denied",
];

If authentication errors are detected:

  1. Sets fallbackReason = MavenFallbackReason.AuthenticationFailure
  2. Extracts failed endpoint URLs from error messages using regex
  3. Logs guidance for resolving authentication issues
  4. Records failed endpoints in telemetry (up to 10)

Detection Methods

Method When Used What's Detected
MvnCliOnly Maven CLI succeeds for all pom.xml files Full transitive dependencies
StaticParserOnly Maven CLI disabled, unavailable, or fails completely Direct dependencies only
Mixed Maven CLI succeeds for some, fails for others Transitive (where CLI worked) + Direct (where CLI failed)

Fallback Reasons

Reason Description
None No fallback needed (MvnCli succeeded)
MvnCliDisabledByUser User set CD_MAVEN_DISABLE_CLI=true
MavenCliNotAvailable mvn command not found in PATH
AuthenticationFailure Maven CLI failed with 401/403 errors
OtherMvnCliFailure Maven CLI failed for other reasons (including timeout)

Telemetry

The detector records the following telemetry:

Key Description
DetectionMethod MvnCliOnly, StaticParserOnly, or Mixed
FallbackReason Why fallback occurred (if applicable)
MvnCliComponentCount Components detected via Maven CLI
StaticParserComponentCount Components detected via static parsing
TotalComponentCount Total components detected
MavenCliAvailable Whether Maven CLI was found
OriginalPomFileCount Number of root pom.xml files processed
FailedEndpoints URLs of repositories that had auth failures (up to 10)
TimedOut true if internal 5-minute timeout was reached
PrepareDetectionError Exception type if unexpected error occurred

Version Property Resolution

Static parsing supports resolving version properties:

<properties>
    <commons.version>3.12.0</commons.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>${commons.version}</version>  <!-- Resolved to 3.12.0 -->
    </dependency>
</dependencies>

The resolver:

  1. Checks <properties> section in current document
  2. Checks other loaded pom.xml documents (for multi-module projects)
  3. If unresolved, skips the dependency with a debug log

Usage

As Part of Component Detection

The detector is registered as an experimental detector (IExperimentalDetector) and runs automatically when Component Detection scans a directory containing pom.xml files.

Parallel Execution with MvnCliComponentDetector

Note: This detector uses the same output filename (bcde.mvndeps) and Maven local repository as MvnCliComponentDetector. If both detectors run in parallel on the same pom.xml files, they may interfere with each other. The detector is intended to replace MvnCliComponentDetector, not run alongside it.

Disabling Maven CLI (Static Parsing Only)

Set the environment variable to use only static parsing:

# Bash
export CD_MAVEN_DISABLE_CLI=true

# PowerShell
$env:CD_MAVEN_DISABLE_CLI = "true"

# Azure DevOps Pipeline
variables:
  CD_MAVEN_DISABLE_CLI: 'true'

Experiment Configuration

The detector is paired with MavenWithFallbackExperiment which compares results against the standard MvnCliComponentDetector:

  • Control group: MvnCliComponentDetector
  • Experiment group: MavenWithFallbackDetector

Enable experiments via:

export CD_DETECTOR_EXPERIMENTS=true

Performance Characteristics

Comparison with Existing Detectors

Aspect MavenComponentDetector MvnCliComponentDetector MavenWithFallbackDetector
Approach Static XML parsing Maven CLI only CLI + static fallback
Speed ⚡ Fastest 🐌 Slowest 🔄 Variable
Accuracy Direct deps only Full transitive Best effort
Requirements None Maven CLI + network None (graceful degradation)
Failure handling N/A No fallback Automatic fallback
Parallelism Per-file Sequential Sequential

Performance Optimizations

  1. Sequential CLI execution: Uses ActionBlock with default settings (sequential) to avoid Maven repository lock contention
  2. Directory normalization: Pre-normalizes directory paths once for efficient lookup during fallback scanning
  3. Immediate file reading: Reads deps file content immediately after CLI success, avoiding a second directory scan

Unit Tests

The following unit tests are implemented in MavenWithFallbackDetectorTests.cs:

Maven CLI Scenarios

Test Description
WhenMavenCliNotAvailable_FallsBackToStaticParsing_Async Verifies fallback to static parsing when Maven CLI is not installed
WhenMavenCliNotAvailable_DetectsMultipleDependencies_Async Verifies static parsing detects multiple dependencies correctly
WhenMavenCliSucceeds_UsesMvnCliResults_Async Verifies Maven CLI results are used when available
WhenMavenCliSucceeds_PreservesTransitiveDependencies_Async Verifies transitive dependency relationships are preserved
WhenMavenCliProducesNoOutput_FallsBackToStaticParsing_Async Verifies fallback when Maven CLI runs but produces no output

Static Parser Edge Cases

Test Description
StaticParser_IgnoresDependenciesWithoutVersion_Async Verifies dependencies without version tags are skipped
StaticParser_IgnoresDependenciesWithVersionRanges_Async Verifies version ranges (e.g., [1.0,2.0)) are skipped
StaticParser_ResolvesPropertyVersions_Async Verifies ${property} versions are resolved from <properties>
StaticParser_IgnoresDependenciesWithUnresolvablePropertyVersions_Async Verifies unresolvable properties are skipped

Empty/Missing File Scenarios

Test Description
WhenNoPomXmlFiles_ReturnsSuccessWithNoComponents_Async Verifies graceful handling of no pom.xml files
WhenPomXmlHasNoDependencies_ReturnsSuccessWithNoComponents_Async Verifies handling of pom.xml with no dependencies

Environment Variable Control

Test Description
WhenDisableMvnCliTrue_UsesStaticParsing_Async Verifies CD_MAVEN_DISABLE_CLI=true bypasses Maven CLI entirely
WhenDisableMvnCliEnvVarIsFalse_UsesMvnCliNormally_Async Verifies CD_MAVEN_DISABLE_CLI=false allows Maven CLI to run
WhenDisableMvnCliEnvVarNotSet_UsesMvnCliNormally_Async Verifies Maven CLI runs when env var is not set
WhenDisableMvnCliEnvVarSetToInvalidValue_UsesMvnCliNormally_Async Verifies Maven CLI runs when env var has invalid value

Nested Pom.xml Handling

Test Description
WhenMvnCliSucceeds_NestedPomXmlsAreFilteredOut_Async Verifies nested pom.xml files are filtered when Maven CLI is used
WhenMvnCliFailsCompletely_AllNestedPomXmlsAreRestoredForStaticParsing_Async Verifies all nested pom.xml files are restored when Maven CLI fails completely
WhenMvnCliPartiallyFails_NestedPomXmlsRestoredOnlyForFailedDirectories_Async Verifies nested pom.xml files are restored only for failed directories

Error Analysis and Telemetry

Test Description
WhenMvnCliFailsWithAuthError_LogsFailedEndpointAndSetsTelemetry_Async Verifies auth errors (401/403) are detected and telemetry is set
WhenMvnCliFailsWithNonAuthError_SetsFallbackReasonToOther_Async Verifies non-auth errors set telemetry to OtherMvnCliFailure

File Structure

src/Microsoft.ComponentDetection.Detectors/maven/
├── MavenWithFallbackDetector.cs      # Main detector implementation
├── MavenWithFallbackDetector.md      # This documentation
├── MvnCliComponentDetector.cs        # Standard Maven CLI detector
├── MavenComponentDetector.cs         # Static pom.xml parser detector
├── MavenCommandService.cs            # Maven CLI execution service
├── IMavenCommandService.cs           # Service interface
└── ...

src/Microsoft.ComponentDetection.Orchestrator/Experiments/Configs/
└── MavenWithFallbackExperiment.cs    # Experiment configuration

test/Microsoft.ComponentDetection.Detectors.Tests/
└── MavenWithFallbackDetectorTests.cs # Unit tests

@codecov
Copy link

codecov bot commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 96.75793% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.7%. Comparing base (6748194) to head (27ccd95).

Files with missing lines Patch % Lines
...ction.Detectors/maven/MavenWithFallbackDetector.cs 91.7% 32 Missing and 7 partials ⚠️
....Detectors.Tests/MavenWithFallbackDetectorTests.cs 99.1% 0 Missing and 6 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1628     +/-   ##
=======================================
+ Coverage   90.4%   90.7%   +0.2%     
=======================================
  Files        441     444      +3     
  Lines      38314   39685   +1371     
  Branches    2347    2414     +67     
=======================================
+ Hits       34674   36006   +1332     
- Misses      3159    3187     +28     
- Partials     481     492     +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces MavenWithFallbackDetector, an experimental Maven detector that provides resilient dependency detection by combining Maven CLI execution with static pom.xml parsing fallback. The detector automatically falls back to static parsing when Maven CLI fails (e.g., due to authentication errors or missing CLI), ensuring components are still detected even in challenging build environments.

Changes:

  • Added MavenWithFallbackDetector with dual detection strategy (Maven CLI primary, static parsing fallback) and comprehensive telemetry
  • Registered new experiment configuration comparing the new detector against the standard MvnCliComponentDetector
  • Added 13 comprehensive unit tests covering CLI scenarios, static parser edge cases, environment variable control, and nested pom.xml handling

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
src/Microsoft.ComponentDetection.Detectors/maven/MavenWithFallbackDetector.cs Main detector implementation with Maven CLI detection, static XML parsing fallback, nested pom filtering, and telemetry tracking
src/Microsoft.ComponentDetection.Orchestrator/Experiments/Configs/MavenWithFallbackExperiment.cs Experiment configuration to compare new detector against existing MvnCliComponentDetector
src/Microsoft.ComponentDetection.Orchestrator/Extensions/ServiceCollectionExtensions.cs Registered new detector and experiment in DI container
test/Microsoft.ComponentDetection.Detectors.Tests/MavenWithFallbackDetectorTests.cs Comprehensive test suite with 13 tests covering all detection scenarios and edge cases

@github-actions
Copy link

github-actions bot commented Feb 3, 2026

👋 Hi! It looks like you modified some files in the Detectors folder.
You may need to bump the detector versions if any of the following scenarios apply:

  • The detector detects more or fewer components than before
  • The detector generates different parent/child graph relationships than before
  • The detector generates different devDependencies values than before

If none of the above scenarios apply, feel free to ignore this comment 🙂

Copilot AI review requested due to automatic review settings February 4, 2026 00:17
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Copilot AI review requested due to automatic review settings February 4, 2026 01:27
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Copilot AI review requested due to automatic review settings February 4, 2026 21:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

"Could not authenticate",
"Access denied",
"Forbidden",
"status code: 401",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't this already covered by line 106 and 107?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trimmed it to

private static readonly string[] AuthErrorPatterns =
[
"401",
"403",
"Unauthorized",
"Access denied",
];

Copilot AI review requested due to automatic review settings February 5, 2026 20:15
@zhenghao104 zhenghao104 requested a review from Copilot February 9, 2026 19:34
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Comment on lines 115 to +118
services.AddSingleton<IMavenCommandService, MavenCommandService>();
services.AddSingleton<IMavenStyleDependencyGraphParserService, MavenStyleDependencyGraphParserService>();
services.AddSingleton<IComponentDetector, MvnCliComponentDetector>();
services.AddSingleton<IComponentDetector, MavenWithFallbackDetector>();
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registering MavenWithFallbackDetector alongside MvnCliComponentDetector means both detectors will run concurrently during a normal scan (experimental detectors still execute even when their results are discarded). Since both detectors operate on pom.xml and generate/delete the same bcde.mvndeps output, this can create file races and corrupt the control detector’s results even when experiments are disabled. Consider gating registration behind the experiment flag, or ensure the experimental detector uses a distinct output filename/local repo path so it cannot interfere with MvnCliComponentDetector.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the semaphor implementation, this should be ok. The experiment config will also be in CD internal repository.

Comment on lines 283 to 295
var pomFile = processRequest.ComponentStream;
var pomDir = Path.GetDirectoryName(pomFile.Location);
var depsFileName = this.mavenCommandService.BcdeMvnDependencyFileName;
var depsFilePath = Path.Combine(pomDir, depsFileName);

// Generate dependency file using Maven CLI.
// Note: If both MvnCliComponentDetector and this detector are enabled,
// they may run Maven CLI on the same pom.xml independently.
var result = await this.mavenCommandService.GenerateDependenciesFileAsync(
processRequest,
depsFileName,
cancellationToken);

Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depsFileName is set to BcdeMvnDependencyFileName (currently bcde.mvndeps) and later ProcessMvnCliResult deletes processRequest.ComponentStream.Location. If MvnCliComponentDetector is also running (it is registered), the two detectors will overwrite/delete each other’s deps files and break detection. Use a detector-specific output filename (and ideally a detector-specific local repository path via the new overload) and only delete files that this detector uniquely owns.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the semaphor implementation, this should be ok. The experiment config will also be in CD internal repository.

Copilot AI review requested due to automatic review settings February 9, 2026 23:16
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Comment on lines 78 to 83
// Check if another detector already generated the deps file for this location
if (this.completedLocations.TryGetValue(pomFile.Location, out var cachedResult))
{
this.logger.LogDebug("{DetectorPrefix}: Skipping duplicate \"dependency:tree\" for {PomFileLocation}, already generated", DetectorLogPrefix, pomFile.Location);
return cachedResult;
}
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The completedLocations cache is keyed only by pomFile.Location and the cached result is returned without verifying the side effect (that outputFileName was generated and still exists). With parallel detectors deleting bcde.mvndeps, a cached “success” can be returned while the file is already gone, causing downstream callers to incorrectly fall back.

Include outputFileName in the cache key and/or cache the deps file content (or validate file existence before returning cached success).

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added file existence check to ensures that if a previous consumer deleted bcde.mvndeps, the next caller will re-run Maven CLI instead of getting a stale "success" result.

Comment on lines 84 to 87

var result = await this.GenerateDependenciesFileCoreAsync(processRequest, outputFileName, cancellationToken);
this.completedLocations.TryAdd(pomFile.Location, result);
return result;
Copy link

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completedLocations is described as tracking locations that completed successfully, but the implementation caches all results (including failures) and never evicts entries. Since MavenCommandService is registered as a singleton, this can prevent retries of transient failures in later scans within the same process and can grow memory usage over time.

Only cache successful results and/or add an eviction/clear strategy scoped to a scan run.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to only cache successful result

Copilot AI review requested due to automatic review settings February 10, 2026 00:07
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Comment on lines +23 to +34
/// <summary>
/// Per-location semaphores to prevent concurrent Maven CLI executions for the same pom.xml.
/// This allows multiple detectors (e.g., MvnCliComponentDetector and MavenWithFallbackDetector)
/// to safely share the same output file without race conditions.
/// </summary>
private readonly ConcurrentDictionary<string, SemaphoreSlim> locationLocks = new();

/// <summary>
/// Tracks locations where dependency generation has completed successfully.
/// Used to skip duplicate executions when multiple detectors process the same pom.xml.
/// </summary>
private readonly ConcurrentDictionary<string, MavenCliResult> completedLocations = new();
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The per-location locationLocks and completedLocations dictionaries grow monotonically and are never pruned, and the SemaphoreSlim instances are never disposed. Since MavenCommandService is registered as a singleton, repeated scans (or very large repos) can cause unbounded memory growth.

Consider using a bounded/expiring cache and removing/disposing per-location semaphores once generation is complete (or switching to a keyed lock implementation that doesn't retain entries indefinitely).

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a clean up method in OnDetectionFinishedAsync to clear cache

Copilot AI review requested due to automatic review settings February 10, 2026 00:18
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment on lines +522 to +531
// Try to delete the deps file
try
{
File.Delete(processRequest.ComponentStream.Location);
}
catch
{
// Ignore deletion errors
}

Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProcessMvnCliResult deletes the generated deps file from disk. If MvnCliComponentDetector is also enabled, it enumerates bcde.mvndeps files only after all CLI runs complete; deleting the file here can make the other detector miss it (especially with the new MavenCommandService “skip duplicate” cache). If these detectors can run together, consider not deleting the file here (let a single owner delete), or ensure the two detectors are mutually exclusive via restrictions/experiments.

Suggested change
// Try to delete the deps file
try
{
File.Delete(processRequest.ComponentStream.Location);
}
catch
{
// Ignore deletion errors
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File.Delete will not throw on non existent file, so this should be ok

Comment on lines +738 to +741
return componentStreams
.ToList()
.SelectMany(allRequests =>
{
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RemoveNestedPomXmls buffers the entire pom.xml request stream via ToList(), which can add significant memory pressure on repos with many pom.xml files. MvnCliComponentDetector’s RemoveNestedPomXmls implementation streams and avoids materializing everything. Consider reusing the existing streaming approach (or otherwise avoiding ToList()) so OnPrepareDetectionAsync scales better.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FernandoRojo Would this be an issue?

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/Microsoft.ComponentDetection.Detectors/maven/MavenCommandService.cs:120

  • GenerateDependenciesFileCoreAsync creates a linked CancellationTokenSource (cliFileTimeout) but never disposes it. This can leak registrations/timers over many invocations. Wrap it in a using (or try/finally with Dispose) so it’s always cleaned up.
        var cliFileTimeout = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        var timeoutSeconds = -1;
        if (this.envVarService.DoesEnvironmentVariableExist(MvnCLIFileLevelTimeoutSecondsEnvVar)
                    && int.TryParse(this.envVarService.GetEnvironmentVariable(MvnCLIFileLevelTimeoutSecondsEnvVar), out timeoutSeconds)
                    && timeoutSeconds >= 0)
        {
            cliFileTimeout.CancelAfter(TimeSpan.FromSeconds(timeoutSeconds));
            this.logger.LogInformation("{DetectorPrefix}: {TimeoutVar} var was set to {TimeoutSeconds} seconds.", DetectorLogPrefix, MvnCLIFileLevelTimeoutSecondsEnvVar, timeoutSeconds);
        }

Comment on lines 23 to 88
@@ -43,7 +57,57 @@ public async Task<bool> MavenCLIExistsAsync()
return await this.commandLineInvocationService.CanCommandBeLocatedAsync(PrimaryCommand, AdditionalValidCommands, MvnVersionArgument);
}

public async Task GenerateDependenciesFileAsync(ProcessRequest processRequest, CancellationToken cancellationToken = default)
public async Task<MavenCliResult> GenerateDependenciesFileAsync(ProcessRequest processRequest, CancellationToken cancellationToken = default)
{
var pomFile = processRequest.ComponentStream;
var pomDir = Path.GetDirectoryName(pomFile.Location);
var depsFilePath = Path.Combine(pomDir, this.BcdeMvnDependencyFileName);

// Check the cache before acquiring the semaphore to allow fast-path returns
// even when cancellation has been requested.
if (this.completedLocations.TryGetValue(pomFile.Location, out var cachedResult)
&& cachedResult.Success
&& File.Exists(depsFilePath))
{
this.logger.LogDebug("{DetectorPrefix}: Skipping duplicate \"dependency:tree\" for {PomFileLocation}, already generated", DetectorLogPrefix, pomFile.Location);
return cachedResult;
}

// Use semaphore to prevent concurrent Maven CLI executions for the same pom.xml.
// This allows MvnCliComponentDetector and MavenWithFallbackDetector to safely share the output file.
var semaphore = this.locationLocks.GetOrAdd(pomFile.Location, _ => new SemaphoreSlim(1, 1));

await semaphore.WaitAsync(cancellationToken);
try
{
// Re-check the cache after acquiring the semaphore in case another caller
// completed while we were waiting.
if (this.completedLocations.TryGetValue(pomFile.Location, out cachedResult)
&& cachedResult.Success
&& File.Exists(depsFilePath))
{
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The locationLocks/completedLocations dictionaries are keyed by pomFile.Location using the default (case-sensitive) comparer. On Windows/macOS this can lead to duplicate locks/cache entries for the same file path with different casing, causing extra CLI executions and unbounded growth. Consider normalizing to a full path and using StringComparer.OrdinalIgnoreCase for these dictionaries.

Copilot uses AI. Check for mistakes.
Comment on lines +66 to +67
public class MavenWithFallbackDetector : FileComponentDetector, IExperimentalDetector
{
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MavenWithFallbackDetector implements IExperimentalDetector, which the orchestrator still executes by default (results are just discarded unless explicitly enabled). Because this detector runs Maven CLI and touches shared files, running it by default can introduce side effects even when the experiment isn’t enabled. Consider also implementing IDefaultOffComponentDetector (and enabling it only via experiment/DetectorArgs) to avoid unintended execution and side effects.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants