
Archived license files not included in (all) reports when using the ARTIFACT source code origin #10607

@sasa-boros-cp

Describe the bug

License files that are contained in source artifacts and matched by a custom pattern are not included in the generated notice, even though the archiver finds them during the scan. This affects not only the plain-text notice, but also the PDF and HTML notices.

Steps to reproduce the behavior:

  1. Provide a custom licenseFilePatterns configuration, e.g. one that includes **/LICENSE*
  2. Run a scan on a repository whose dependencies have source artifacts that contain LICENSE files, e.g. in META-INF (the default scan already has the archiver enabled)
  3. Run a report on the scan result
  4. Check the notice file: it does not include the license file information
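
To confirm step 2 independently of ORT, the following minimal sketch lists the entries of a downloaded sources JAR that match the configured patterns. It is only an approximation: it uses Python's fnmatch, whose glob semantics differ from ORT's own matcher (for example, `*` also matches `/` here), and ignoring case is an assumption. The function name `matching_entries` is hypothetical.

```python
import fnmatch
import zipfile

# Patterns copied from licenseFilenames in the config.yml of this report.
PATTERNS = [
    "copying*", "COPYING*", "copyright", "licence*", "license*",
    "*.licence", "*.license", "**/LICENCE*", "**/LICENSE*",
    "unlicence", "unlicense",
]


def matching_entries(jar_path, patterns=PATTERNS):
    """Return the file entries of the archive whose path matches any pattern.

    Assumption: matching is done case-insensitively with fnmatch, which only
    approximates the glob matching that ORT performs.
    """
    with zipfile.ZipFile(jar_path) as jar:
        # Skip directory entries, which end with a slash.
        names = [name for name in jar.namelist() if not name.endswith("/")]
    lowered = [pattern.lower() for pattern in patterns]
    return [
        name for name in names
        if any(fnmatch.fnmatchcase(name.lower(), pattern) for pattern in lowered)
    ]
```

Calling `matching_entries("tika-core-3.2.0-sources.jar")` on a locally downloaded copy of the artifact should show whether a file such as META-INF/LICENSE is a candidate for archiving at all.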

Expected behavior

License files found by the archiver should be included in the final notice.

Console/log output

During scanning:

DEBUG org.ossreviewtoolkit.model.utils.FileArchiver - Adding 'META-INF/LICENSE' to archive.

This shows that the LICENSE file was found and added to the archive.

During reporting:

INFO  org.ossreviewtoolkit.model.utils.FileArchiver - Unarchived data for ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/org/apache/tika/tika-core/3.2.0/tika-core-3.2.0-sources.jar, hash=Hash(value=328e6b6cbc9be6d5b31e38a34a3a90f9981b280a, algorithm=SHA-1))) to '/tmp/ort-LicenseInfoResolver-archive15339033700059155317' in 4.312064ms.

I can see the unarchiving happening, but no files are picked up.

Environment

  • ORT version: ort-minimal:61.0.0
  • Java version: 21
  • OS: linux/amd64 (Docker)

config.yml:

ort:
  forceOverwrite: true
  licenseFilePatterns:
    licenseFilenames: [  "copying*",
                         "COPYING*",
                         "copyright",
                         "licence*",
                         "license*",
                         "*.licence",
                         "*.license",
                         "**/LICENCE*",
                         "**/LICENSE*",
                         "unlicence",
                         "unlicense"]
    patentFilenames: [ 'patents' ]
    otherLicenseFilenames: [ 'readme*' ]
  analyzer:
    skipExcluded: true
    enabled_package_managers: [ GradleInspector ]
  downloader:
    sourceCodeOrigins: [ARTIFACT, VCS]
    skipExcluded: true
  scanner:
    archive:
      enabled: true
      fileStorage:
        localFileStorage:
          directory: ${user.home}/.ort/scanner/archive
          compression: false
    # We don't want excluded stuff to be scanned, e.g. dev or test dependencies
    skipExcluded: true
    storages:
      local:
        backend:
          localFileStorage:
            directory: ${user.home}/.ort/scanner/results
            compression: false
      
    storageReaders: [local]
    storageWriters: [local]
  reporter:
    config:
      SpdxDocument:
        options:
          output.file.formats: json
          file.information.enabled: true
      CycloneDx:
        options:
          output.file.formats: json

Additional context

I am running ORT in Docker, with a common volume shared by all ORT workflow steps. I can see that the archiver stores the files in the right directory during the scan, and that unarchiving happens during the report step. However, in the template, package.licenseFiles is empty, so no license file is included in the final notice.
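
The check on the archive directory can be sketched as follows. This assumes the archiver stores its output as ZIP files somewhere under the configured directory (${user.home}/.ort/scanner/archive in the config above); the exact layout is not relied upon, and the function name `list_archived_files` is hypothetical.

```python
import os
import zipfile


def list_archived_files(archive_dir):
    """Map each ZIP file found under archive_dir to the entry names it contains.

    Assumption: archived license files are stored as ZIP archives; files that
    are not valid ZIP archives are ignored.
    """
    found = {}
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            path = os.path.join(root, name)
            if zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as archive:
                    found[path] = archive.namelist()
    return found
```

Running `list_archived_files(os.path.expanduser("~/.ort/scanner/archive"))` inside the container, with the shared volume mounted, would show which archives exist and whether META-INF/LICENSE is among their entries.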

The Docker commands I am using:

Scan:

docker run \
  --rm \
  --name ort-scan-backend \
  --platform linux/amd64 \
  -v "$PWD"/:/workingdir \
  -v "$repo_abs_path":/repo \
  -v "$PWD"/.gitconfig:/home/ort/.gitconfig \
  -v ~/.netrc:/home/ort/.netrc:ro \
  -v ~/.gradle/gradle.properties:/home/ort/.gradle/gradle.properties:ro \
  -v "$ORT_DOCKER_VOLUME":/home/ort/.ort \
  -e ORT_CONFIG_DIR=/workingdir/config-backend \
  "$ORT_DOCKER_IMAGE" --debug scan \
  -i /repo/ort/results/backend/analyzer-result.yml \
  -o /repo/ort/results/backend \
  | tee scan-backend.log

Report:

docker run \
  --rm \
  --name ort-report-backend \
  --platform linux/amd64 \
  -v "$PWD"/:/workingdir \
  -v "$repo_abs_path":/repo \
  -v "$PWD"/.gitconfig:/home/ort/.gitconfig \
  -v ~/.netrc:/home/ort/.netrc:ro \
  -v ~/.gradle/gradle.properties:/home/ort/.gradle/gradle.properties:ro \
  -v "$ORT_DOCKER_VOLUME":/home/ort/.ort \
  -e ORT_CONFIG_DIR=/workingdir/config-backend \
  "$ORT_DOCKER_IMAGE" --debug report \
  -f PlainTextTemplate,StaticHtml,WebApp,PdfTemplate,HtmlTemplate,CycloneDx,SpdxDocument \
  -O PlainTextTemplate=template.path=/workingdir/config-general/notice.txt.ftl,/workingdir/config-general/notice.json.ftl \
  --license-classifications-file /workingdir/config-general/license-classifications.yml \
  -i /repo/ort/results/backend/evaluation-result.yml \
  -o /repo/ort/results/backend/reports/ \
  | tee report-backend.log

Metadata

  • Labels: reporter