Skip to content

Conversation

@JamieMagee
Copy link
Member

@JamieMagee JamieMagee commented Nov 14, 2025

Key changes:

  • Add IArtifactComponentFactory interface with implementations for Linux, npm, and pip components
  • Add IArtifactFilter interface with Mariner2ArtifactFilter to handle distro-specific artifact filtering
  • Refactor LinuxScanner to use factories and filters via dependency injection
  • Rename LinuxComponents to Components in LayerMappedLinuxComponents for generic support
  • Update telemetry to track component types alongside names and versions
  • Add test coverage for multi-type artifact scanning

See my local tests of mcr.microsoft.com/azurelinux/base/nodejs:20 and mcr.microsoft.com/azurelinux/base/python:3 here: https://gist.github.com/JamieMagee/c758b7f8bc37e8c3f3e4e95417b8d87a

@JamieMagee JamieMagee requested a review from a team as a code owner November 14, 2025 00:09
@JamieMagee JamieMagee force-pushed the users/jamagee/application-level-component-containers branch from 06947d5 to 41c23cb Compare November 14, 2025 00:12
@github-actions
Copy link

github-actions bot commented Nov 14, 2025

👋 Hi! It looks like you modified some files in the Detectors folder.
You may need to bump the detector versions if any of the following scenarios apply:

  • The detector detects more or fewer components than before
  • The detector generates different parent/child graph relationships than before
  • The detector generates different devDependencies values than before

If none of the above scenarios apply, feel free to ignore this comment 🙂

Key changes:
- Add IArtifactComponentFactory interface with implementations for Linux, npm, and pip components
- Add IArtifactFilter interface with Mariner2ArtifactFilter to handle distro-specific artifact filtering
- Refactor LinuxScanner to use factories and filters via dependency injection
- Rename LinuxComponents to Components in LayerMappedLinuxComponents for generic support
- Update telemetry to track component types alongside names and versions
- Add test coverage for multi-type artifact scanning
@JamieMagee JamieMagee force-pushed the users/jamagee/application-level-component-containers branch from 41c23cb to e05aa16 Compare November 14, 2025 00:17
@codecov
Copy link

codecov bot commented Nov 14, 2025

Codecov Report

❌ Patch coverage is 76.49123% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.2%. Comparing base (d8c0daa) to head (e05aa16).

Files with missing lines Patch % Lines
...etection.Detectors/linux/LinuxContainerDetector.cs 50.0% 26 Missing and 1 partial ⚠️
...ComponentDetection.Detectors/linux/LinuxScanner.cs 80.3% 9 Missing and 4 partials ⚠️
...n.Detectors/linux/Factories/NpmComponentFactory.cs 62.0% 6 Missing and 5 partials ⚠️
...Detectors/linux/Factories/LinuxComponentFactory.cs 66.6% 4 Missing and 2 partials ⚠️
...n.Detectors/linux/Factories/PipComponentFactory.cs 62.5% 4 Missing and 2 partials ⚠️
....Detectors/linux/Filters/Mariner2ArtifactFilter.cs 88.4% 2 Missing and 1 partial ⚠️
...inux/Exceptions/MissingContainerDetailException.cs 0.0% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1529     +/-   ##
=======================================
- Coverage   90.3%   90.2%   -0.1%     
=======================================
  Files        418     423      +5     
  Lines      35300   35464    +164     
  Branches    2188    2209     +21     
=======================================
+ Hits       31877   32005    +128     
- Misses      2980    3002     +22     
- Partials     443     457     +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot finished reviewing on behalf of JamieMagee November 14, 2025 00:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds support for detecting application-level components (npm and pip packages) within Linux container images, expanding the LinuxContainerDetector beyond system packages. The implementation introduces a factory pattern to create typed components from Syft artifacts and a filter pattern to handle distribution-specific edge cases.

Key Changes:

  • Introduced IArtifactComponentFactory interface with implementations for Linux (apk/deb/rpm), npm, and pip components
  • Added IArtifactFilter interface with Mariner2ArtifactFilter to handle distro-specific workarounds
  • Refactored LinuxScanner to use dependency injection for factories and filters, replacing hardcoded logic

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/Microsoft.ComponentDetection.Detectors/linux/LinuxScanner.cs Refactored to use factory pattern for component creation and filter pattern for artifact filtering; updated telemetry tracking to support multiple component types
src/Microsoft.ComponentDetection.Detectors/linux/LinuxContainerDetector.cs Updated to support Npm and Pip component types; changed category from Linux to All; improved timeout handling and added documentation
src/Microsoft.ComponentDetection.Detectors/linux/ILinuxScanner.cs Added XML documentation and renamed parameter from dockerLayers to containerLayers for clarity
src/Microsoft.ComponentDetection.Detectors/linux/Factories/IArtifactComponentFactory.cs New interface defining factory contract for creating TypedComponents from Syft artifacts
src/Microsoft.ComponentDetection.Detectors/linux/Factories/ArtifactComponentFactoryBase.cs New abstract base class providing shared logic for extracting license and author information
src/Microsoft.ComponentDetection.Detectors/linux/Factories/LinuxComponentFactory.cs New factory for creating LinuxComponent instances from system packages (apk, deb, rpm)
src/Microsoft.ComponentDetection.Detectors/linux/Factories/NpmComponentFactory.cs New factory for creating NpmComponent instances from npm artifacts
src/Microsoft.ComponentDetection.Detectors/linux/Factories/PipComponentFactory.cs New factory for creating PipComponent instances from Python artifacts
src/Microsoft.ComponentDetection.Detectors/linux/Filters/IArtifactFilter.cs New interface for filtering or transforming Syft artifacts before component creation
src/Microsoft.ComponentDetection.Detectors/linux/Filters/Mariner2ArtifactFilter.cs New filter extracting Mariner 2.0 workaround logic from LinuxScanner into a dedicated, reusable component
src/Microsoft.ComponentDetection.Detectors/linux/Exceptions/MissingContainerDetailException.cs Added XML documentation and fixed string interpolation bug (${imageId} → {imageId})
src/Microsoft.ComponentDetection.Detectors/linux/Contracts/ImageScanningResult.cs Added XML documentation for properties
src/Microsoft.ComponentDetection.Contracts/BcdeModels/LayerMappedLinuxComponents.cs Renamed LinuxComponents property to Components and changed type from IEnumerable<LinuxComponent> to IEnumerable<TypedComponent> to support multiple component types
src/Microsoft.ComponentDetection.Orchestrator/Extensions/ServiceCollectionExtensions.cs Registered new factory and filter implementations in DI container
test/Microsoft.ComponentDetection.Detectors.Tests/LinuxScannerTests.cs Updated existing tests to use new constructor; added comprehensive test for multi-type artifact scanning
test/Microsoft.ComponentDetection.Detectors.Tests/LinuxContainerDetectorTests.cs Updated mock data to use new Components property name


public IEnumerable<string> Categories => [Enum.GetName(typeof(DetectorClass), DetectorClass.Linux)];
/// <inheritdoc/>
public IEnumerable<string> Categories => [Enum.GetName(typeof(DetectorClass), DetectorClass.All)];
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The category has been changed from DetectorClass.Linux to DetectorClass.All. According to the DetectorClass enum documentation, All is described as "Default value, which indicates all classes should be run. Not used as an actual category."

Since this detector now supports multiple component types (Linux, Npm, Pip), consider either:

  1. Using multiple specific categories: [Enum.GetName(typeof(DetectorClass), DetectorClass.Linux), Enum.GetName(typeof(DetectorClass), DetectorClass.Npm), Enum.GetName(typeof(DetectorClass), DetectorClass.Pip)]
  2. Keeping DetectorClass.Linux as the primary category since it's fundamentally a Linux container scanner
  3. If DetectorClass.All is intentional for multi-type detectors, this may warrant updating the enum documentation to clarify that usage pattern
Suggested change
public IEnumerable<string> Categories => [Enum.GetName(typeof(DetectorClass), DetectorClass.All)];
public IEnumerable<string> Categories => [
Enum.GetName(typeof(DetectorClass), DetectorClass.Linux),
Enum.GetName(typeof(DetectorClass), DetectorClass.Npm),
Enum.GetName(typeof(DetectorClass), DetectorClass.Pip)
];

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +197
Name = c.Component switch
{
LinuxComponent lc => lc.Name,
NpmComponent nc => nc.Name,
PipComponent pc => pc.Name,
_ => "Unknown",
},
Version = c.Component switch
{
LinuxComponent lc => lc.Version,
NpmComponent nc => nc.Version,
PipComponent pc => pc.Version,
_ => "Unknown",
},
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The switch expressions for extracting Name and Version from components are not maintainable. When new component types are added (e.g., MavenComponent, RubyGemComponent), these switch expressions must be manually updated, which is error-prone.

Consider using reflection or a common interface/base class property to extract Name and Version generically. Alternatively, you could use the component's Id property or implement a visitor pattern to handle different component types more extensibly.

Copilot uses AI. Check for mistakes.
foreach (var factory in componentFactories)
{
foreach (var artifactType in factory.SupportedArtifactTypes)
{
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The factory lookup dictionary construction could silently overwrite entries if multiple factories claim to support the same artifact type. While this isn't currently an issue (apk/deb/rpm, npm, and python don't overlap), consider adding validation to detect and report conflicts:

foreach (var factory in componentFactories)
{
    foreach (var artifactType in factory.SupportedArtifactTypes)
    {
        if (this.factoryLookup.ContainsKey(artifactType))
        {
            throw new InvalidOperationException($"Multiple factories registered for artifact type '{artifactType}'");
        }
        this.factoryLookup[artifactType] = factory;
    }
}

This would make the code more robust against future configuration errors.

Suggested change
{
{
if (this.factoryLookup.ContainsKey(artifactType))
{
throw new InvalidOperationException($"Multiple factories registered for artifact type '{artifactType}'");
}

Copilot uses AI. Check for mistakes.
result.Should().ContainSingle();
var package = result.First();
var package = result.First() as LinuxComponent;
package.Name.Should().Be("test");
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable package may be null at this access because of this assignment.

Copilot uses AI. Check for mistakes.
result.Should().ContainSingle();
var package = result.First();
var package = result.First() as LinuxComponent;
package.Name.Should().Be("test");
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable package may be null at this access because of this assignment.

Copilot uses AI. Check for mistakes.
result.Should().ContainSingle();
var package = result.First();
var package = result.First() as LinuxComponent;
package.Name.Should().Be("busybox");
Copy link

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable package may be null at this access because of this assignment.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants