Skip to content

Conversation

@vpetersson
Copy link
Contributor

When processing requirements.txt files, use pipdeptree to discover transitive dependencies that are not explicitly listed. This creates more complete SBOMs by including the full dependency tree.

Features:

  • New _dependency_expansion module with Protocol + Registry pattern
  • PipdeptreeExpander discovers transitives from installed packages
  • Step 1.4 in CLI pipeline between generation and hash enrichment
  • Audit trail records each discovered dependency with parent info
  • Graceful skip when packages not installed in environment

Closes #161

When processing requirements.txt files, use pipdeptree to discover
transitive dependencies that are not explicitly listed. This creates
more complete SBOMs by including the full dependency tree.

Features:
- New _dependency_expansion module with Protocol + Registry pattern
- PipdeptreeExpander discovers transitives from installed packages
- Step 1.4 in CLI pipeline between generation and hash enrichment
- Audit trail records each discovered dependency with parent info
- Graceful skip when packages not installed in environment

Closes #161

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 3, 2026 13:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds pipdeptree integration to discover and include transitive dependencies that are not explicitly listed in requirements.txt files, creating more complete SBOMs.

Changes:

  • Implements a Protocol + Registry pattern for dependency expansion with PipdeptreeExpander as the first implementation
  • Adds Step 1.4 to the CLI pipeline between SBOM generation and hash enrichment
  • Records discovered dependencies in the audit trail with parent package information

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/test_dependency_expansion.py Comprehensive test suite for dependency expansion subsystem
sbomify_action/cli/main.py Integrates Step 1.4 transitive dependency discovery into pipeline
sbomify_action/_dependency_expansion/registry.py Registry for managing dependency expander instances
sbomify_action/_dependency_expansion/protocol.py Protocol definition for dependency expander plugins
sbomify_action/_dependency_expansion/models.py Data models for discovered dependencies and expansion results
sbomify_action/_dependency_expansion/expanders/pipdeptree.py pipdeptree-based expander for Python dependencies
sbomify_action/_dependency_expansion/expanders/init.py Expander implementations module exports
sbomify_action/_dependency_expansion/enricher.py Orchestrates dependency expansion and SBOM enrichment
sbomify_action/_dependency_expansion/init.py Module initialization and public API exports
pyproject.toml Adds pipdeptree>=2.0.0 dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Make ExpansionResult.source a required field (no default value)
- Add supports_dependency_expansion() function using registry pattern
- Replace hardcoded requirements.txt check with registry-based lookup
- Update test to explicitly pass source parameter

Refs #163

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vpetersson vpetersson requested a review from Copilot February 4, 2026 07:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Change step_num type hints from int to int | float in console.py
  and main.py to accurately reflect substep usage (1.4, 1.5)
- Add comment documenting the substep pattern (1.x = substeps of Step 1)

Refs #163

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vpetersson vpetersson requested a review from Copilot February 4, 2026 08:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

vpetersson and others added 2 commits February 4, 2026 10:07
- Add comment explaining pipdeptree package_name vs key fallback
- Document SPDX ID format assumptions and collision handling
- Rename 'seen' to 'seen_package_versions' in test for clarity

Refs #163

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
We require pipdeptree>=2.0.0, so just use the current field name
(package_name) without fallback to older versions.

Refs #163

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 4, 2026 09:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Simplify version extraction condition (split always produces 2 parts)
- Use hyphens instead of stripping separators for SPDX IDs
- Add assertion to verify SBOM is unmodified when expansion skipped

Refs #163

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vpetersson vpetersson requested a review from Copilot February 11, 2026 19:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +121 to +155
bom = Bom.from_json(sbom_data)
original_count = len(bom.components) if bom.components else 0

# Build set of existing PURLs for deduplication
existing_purls: set[str] = set()
if bom.components:
for comp in bom.components:
if comp.purl:
existing_purls.add(str(comp.purl).lower())

# Add new components
added_count = 0
audit = get_audit_trail()

for dep in discovered:
# Skip if already exists
if dep.purl.lower() in existing_purls:
continue

# Create new component
try:
purl_obj = PackageURL.from_string(dep.purl)
except Exception as e:
logger.warning(f"Invalid PURL {dep.purl}: {e}")
continue

component = Component(
type=ComponentType.LIBRARY,
name=dep.name,
version=dep.version,
purl=purl_obj,
)

bom.components.add(component)
added_count += 1
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bom.components is treated as optional (if bom.components checks), but later bom.components.add(component) assumes it’s always initialized. If the SBOM has no components and Bom.from_json(...) yields components=None, this will raise an AttributeError. Initialize bom.components to an empty collection before calling .add(...) (or use the library’s supported way to ensure the components collection exists).

Copilot uses AI. Check for mistakes.
Comment on lines +122 to +135
# Skip options like -r, -e, --index-url, etc.
if line.startswith("-"):
continue

# Handle environment markers (e.g., requests; python_version >= "3.6")
if ";" in line:
line = line.split(";")[0].strip()

# Parse the requirement line
name, version = self._parse_requirement_line(line)
if name:
deps[name] = version

return deps
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parser will mis-handle valid requirements.txt entries that don’t start with - but also aren’t simple name[extras]specifier forms, such as PEP 508 direct references (pkg @ https://...), URL/VCS requirements (git+https://...), or local paths. Those can end up being passed into pipdeptree --packages as “names”, causing pipdeptree to return nothing or fail. Consider using packaging.requirements.Requirement to parse lines and either (a) extract .name while ignoring direct-reference location, or (b) explicitly skip unsupported requirement formats with a debug log.

Copilot uses AI. Check for mistakes.
from ...tool_checks import check_tool_available
from ..models import DiscoveredDependency, normalize_python_package_name

_PIPDEPTREE_AVAILABLE, _PIPDEPTREE_PATH = check_tool_available("pipdeptree")
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PIPDEPTREE_PATH is computed but never used; both subprocess calls hardcode "pipdeptree". If check_tool_available returns a resolved path (or if PATH differs between checks and runtime), this can cause false positives/negatives. Use _PIPDEPTREE_PATH as the executable in both can_expand() and _run_pipdeptree() (or drop _PIPDEPTREE_PATH entirely if it’s not needed).

Suggested change
_PIPDEPTREE_AVAILABLE, _PIPDEPTREE_PATH = check_tool_available("pipdeptree")
_PIPDEPTREE_AVAILABLE, _ = check_tool_available("pipdeptree")

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +63
result = subprocess.run(
["pipdeptree", "--json-tree", "--warn", "silence"],
capture_output=True,
text=True,
timeout=30,
)
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PIPDEPTREE_PATH is computed but never used; both subprocess calls hardcode "pipdeptree". If check_tool_available returns a resolved path (or if PATH differs between checks and runtime), this can cause false positives/negatives. Use _PIPDEPTREE_PATH as the executable in both can_expand() and _run_pipdeptree() (or drop _PIPDEPTREE_PATH entirely if it’s not needed).

Copilot uses AI. Check for mistakes.
packages: Comma-separated list of package names to filter to.
If None, returns all packages in the environment.
"""
cmd = ["pipdeptree", "--json-tree", "--warn", "silence"]
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_PIPDEPTREE_PATH is computed but never used; both subprocess calls hardcode "pipdeptree". If check_tool_available returns a resolved path (or if PATH differs between checks and runtime), this can cause false positives/negatives. Use _PIPDEPTREE_PATH as the executable in both can_expand() and _run_pipdeptree() (or drop _PIPDEPTREE_PATH entirely if it’s not needed).

Suggested change
cmd = ["pipdeptree", "--json-tree", "--warn", "silence"]
cmd = [_PIPDEPTREE_PATH, "--json-tree", "--warn", "silence"]

Copilot uses AI. Check for mistakes.
logger.debug(f"Discovered {len(dependencies)} transitive dependencies")
return dependencies
except Exception as e:
logger.warning(f"Failed to expand {lock_file.name}: {e}")
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When catching a broad exception during expansion, logger.warning(...) drops the traceback, which makes diagnosing failures harder (especially since failures are intentionally non-fatal). Prefer logger.exception(...) here (or logger.warning(..., exc_info=True)) to preserve stack traces in logs.

Suggested change
logger.warning(f"Failed to expand {lock_file.name}: {e}")
logger.warning(f"Failed to expand {lock_file.name}: {e}", exc_info=True)

Copilot uses AI. Check for mistakes.
Comment on lines +113 to +121
def _enrich_cyclonedx(
self,
sbom_path: Path,
sbom_data: dict[str, Any],
discovered: list[DiscoveredDependency],
source: str,
) -> ExpansionResult:
"""Add discovered dependencies to CycloneDX SBOM."""
bom = Bom.from_json(sbom_data)
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR adds substantial new enrichment behavior (CycloneDX/SPDX mutation, deduplication by PURL/name+version, component/package insertion, SPDXID generation/collision handling), but the added tests mainly cover parsing/registry/skip paths. Add tests that exercise _enrich_cyclonedx and _enrich_spdx end-to-end (e.g., empty/no-components input, deduplication when SBOM already contains the PURL, and verifying audit trail entries are created).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add piptree for requirements.txt processing

1 participant