feat: update syft license concept to complex struct #1743

Merged
merged 108 commits into from
May 15, 2023

@spiffcs spiffcs commented Apr 17, 2023

Summary

Syft currently represents licenses for packages as part of an individual package's metadata (uncommon type). It also limits the expression of licenses to single string values. This metadata approach, combined with the limited expressiveness of a string representation, makes it hard for downstream tooling to parse, read, and evaluate license compliance for individual packages. Not every string is guaranteed to be a valid SPDX expression, and not every string at this point is guaranteed to be a valid license.

This PR attempts to make the following changes to update the underlying license model to have more expressive capabilities, while also providing some guarantees surrounding the license values themselves:

  • Licenses are updated from string -> pkg.License struct with the following fields:
    • Value required
      • declared value pulled from the original source of the discovered package's license
      • This will make it easier for us to find where/how junk licenses are being pulled from
    • SPDXExpression optional
      • If it's possible to construct, this field will always contain a valid SPDX expression for downstream consumption
      • Downstream consumers can use this field as the basis for parsing SPDX expressions to grab the individual licenses
    • Type required
      • SPDX concluded vs declared
    • URL optional
      • URL source of the declared/concluded license if an online query was used
    • Location optional
      • syft's internal location representation showing the evidence of WHERE a license was discovered
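
As a rough illustration of the field list above, the struct might look something like the following Go sketch. The field names follow the bullets, but the `LicenseType` constants and the shape of `Location` are assumptions made for this example and are not taken from the PR's code.

```go
package main

import "fmt"

// LicenseType distinguishes a declared license from a concluded one.
// The constant names and string values are assumptions for this sketch.
type LicenseType string

const (
	Declared  LicenseType = "declared"
	Concluded LicenseType = "concluded"
)

// Location is a hypothetical stand-in for syft's internal location type.
type Location struct {
	Path string
}

// License mirrors the fields described in the summary.
type License struct {
	Value          string      // required: raw value as found in the source
	SPDXExpression string      // optional: empty when no valid expression could be built
	Type           LicenseType // required: declared vs concluded
	URL            string      // optional: set only when an online query was used
	Location       *Location   // optional: where the license evidence was found
}

func main() {
	l := License{
		Value:          "MIT License",
		SPDXExpression: "MIT",
		Type:           Declared,
		Location:       &Location{Path: "/app/package.json"},
	}
	fmt.Printf("%s -> %q (%s) found at %s\n", l.Value, l.SPDXExpression, l.Type, l.Location.Path)
}
```

Keeping `Value` separate from `SPDXExpression` is what lets a junk or non-SPDX string be traced back to its origin while downstream consumers rely only on the validated field.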

Implementation Notes

  • Validate SPDX expressions and make licenses immutable on package creation
  • Remove Licenses data from individual metadata types and make it a first class field on Package
  • Use https://github.com/github/go-spdx for SPDX expression/license validation for the SPDXExpression field
  • Update catalogers to no longer use metadata as a pass-through and instead build more complex constructors for individual package catalogers
  • Update SPDX && CycloneDX presenters to no longer need to validate/mutate License data (this should be handled at the individual package level)
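
The first two notes (validate on creation, licenses as a first-class field on the package) could be sketched roughly as below. This is not the PR's implementation: `validExpression` is a deliberately tiny, hypothetical stand-in for the go-spdx validation, the `known` table is hand-written, and using unexported fields with getters is just one assumed way to get immutability.

```go
package main

import (
	"fmt"
	"strings"
)

// known is a hand-written stand-in for a real SPDX license list (assumption).
var known = map[string]bool{"MIT": true, "Apache-2.0": true, "GPL-2.0-only": true}

// validExpression accepts a single known ID or known IDs joined by AND/OR.
// It is a toy check for this sketch, not a real SPDX expression parser.
func validExpression(expr string) bool {
	fields := strings.Fields(expr)
	for i, f := range fields {
		if i%2 == 1 {
			if f != "AND" && f != "OR" {
				return false
			}
			continue
		}
		if !known[f] {
			return false
		}
	}
	// A well-formed chain has an odd number of tokens: ID (OP ID)*.
	return len(fields)%2 == 1
}

// License keeps its fields unexported so values cannot change after creation.
type License struct {
	value          string
	spdxExpression string
}

// NewLicense validates once, at construction time.
func NewLicense(value string) License {
	l := License{value: value}
	if validExpression(value) {
		l.spdxExpression = value
	}
	return l
}

func (l License) Value() string          { return l.value }
func (l License) SPDXExpression() string { return l.spdxExpression }

// Package carries licenses as a first-class field rather than inside metadata.
type Package struct {
	Name     string
	Licenses []License
}

func main() {
	p := Package{Name: "example", Licenses: []License{
		NewLicense("MIT OR Apache-2.0"),
		NewLicense("Custom EULA"),
	}}
	for _, l := range p.Licenses {
		fmt.Printf("%s: value=%q spdx=%q\n", p.Name, l.Value(), l.SPDXExpression())
	}
}
```

Because validation happens in the constructor, presenters that later emit SPDX or CycloneDX output can read the expression as-is instead of re-validating it.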

Schema Diff Notes

62,64d61
<         "license": {
<           "type": "string"
<         },
96d92
<         "license",
139,141d134
<           "type": "string"
<         },
<         "license": {
190d182
<         "license",
607,612d598
<         "licenses": {
<           "items": {
<             "type": "string"
<           },
<           "type": "array"
<         },
753a740,765
>     "License": {
>       "properties": {
>         "value": {
>           "type": "string"
>         },
>         "spdx-expression": {
>           "type": "string"
>         },
>         "type": {
>           "type": "string"
>         },
>         "url": {
>           "type": "string"
>         },
>         "location": {
>           "$ref": "#/$defs/Location"
>         }
>       },
>       "type": "object",
>       "required": [
>         "value",
>         "spdx-expression",
>         "type",
>         "url"
>       ]
>     },
984,989d995
<         "licenses": {
<           "items": {
<             "type": "string"
<           },
<           "type": "array"
<         },
1008d1013
<         "licenses",
1055c1060
<             "type": "string"
---
>             "$ref": "#/$defs/License"
1263a1269,1274
>         },
>         "license": {
>           "items": {
>             "type": "string"
>           },
>           "type": "array"
1272,1277d1282
<           "items": {
<             "type": "string"
<           },
<           "type": "array"
<         },
<         "license": {
1492,1494d1496
<         "license": {
<           "type": "string"
<         },
1527d1528
<         "license",
1625,1627d1625
<         "license": {
<           "type": "string"
<         },
1650d1647
<         "license",

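To make the new `License` definition in the schema diff more concrete, here is a small Go sketch that serializes a value with the same property names. The JSON keys come from the diff; the Go type, the `encode` helper, and the `path` key on `Location` are assumptions for illustration. Since the diff lists `value`, `spdx-expression`, `type`, and `url` under `required`, the sketch always emits them, even when empty, and omits only `location`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is a hypothetical stand-in; the real Location schema is not shown in the diff.
type Location struct {
	Path string `json:"path"`
}

// License uses the property names from the schema diff.
type License struct {
	Value          string    `json:"value"`
	SPDXExpression string    `json:"spdx-expression"`
	Type           string    `json:"type"`
	URL            string    `json:"url"`
	Location       *Location `json:"location,omitempty"` // not in "required"
}

// encode is a small helper for this example.
func encode(l License) (string, error) {
	b, err := json.Marshal(l)
	return string(b), err
}

func main() {
	out, err := encode(License{Value: "MIT", SPDXExpression: "MIT", Type: "declared"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```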
spiffcs added 23 commits March 21, 2023 12:46
Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
* main: (35 commits)
  Fix kernel cataloger test fixtures (#1742)
  feat: Support scanning license files in golang packages over the network (#1630)
  Add package-to-file location evidence relationships (#1698)
  Add Linux Kernel cataloger (#1694)
  Add annotations for evidence on package locations (#1723)
  add format make target (#1733)
  Update tests to not fail on Mac M1's. (#1730)
  chore(deps): update bootstrap tools to latest versions (#1728)
  Add support for nar files. (#1727)
  add highlevel details about catalogers (#1726)
  chore(deps): bump golang.org/x/net from 0.8.0 to 0.9.0 (#1722)
  chore(deps): update stereoscope to e95d60a265e384df29b7a139f5c5402d6ad72e06 (#1721)
  feat: gradle lockfile support (#1719)
  chore(deps): bump github.com/docker/docker (#1715)
  chore(deps): bump golang.org/x/mod from 0.9.0 to 0.10.0 (#1713)
  chore(deps): bump golang.org/x/term from 0.6.0 to 0.7.0 (#1714)
  chore(deps): bump github.com/spf13/cobra from 1.6.1 to 1.7.0 (#1716)
  chore(deps): bump peter-evans/create-pull-request from 4 to 5 (#1712)
  chore: update tools-golang to v0.5.0 (#1717)
  Add Nix cataloger (#1696)
  ...


github-actions bot commented Apr 17, 2023

Benchmark Test Results

Benchmark results from the latest changes vs base branch
goos: linux
goarch: amd64
pkg: github.com/anchore/syft/test/integration
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
                                                          │ ./.tmp/benchmark-b0a90fb.txt │
                                                          │            sec/op            │
ImagePackageCatalogers/alpmdb-cataloger-2                                   11.90m ±  1%
ImagePackageCatalogers/apkdb-cataloger-2                                    634.4µ ±  1%
ImagePackageCatalogers/binary-cataloger-2                                   215.4µ ±  1%
ImagePackageCatalogers/dpkgdb-cataloger-2                                   546.0µ ±  1%
ImagePackageCatalogers/dotnet-deps-cataloger-2                              1.215m ±  3%
ImagePackageCatalogers/go-module-binary-cataloger-2                         92.21µ ±  0%
ImagePackageCatalogers/java-cataloger-2                                     12.91m ± 19%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                     91.64µ ±  0%
ImagePackageCatalogers/javascript-package-cataloger-2                       369.4µ ±  1%
ImagePackageCatalogers/nix-store-cataloger-2                                256.5µ ±  3%
ImagePackageCatalogers/php-composer-installed-cataloger-2                   734.0µ ±  1%
ImagePackageCatalogers/portage-cataloger-2                                  412.4µ ±  1%
ImagePackageCatalogers/python-package-cataloger-2                           3.143m ±  1%
ImagePackageCatalogers/r-package-cataloger-2                                178.2µ ±  1%
ImagePackageCatalogers/rpm-db-cataloger-2                                   469.3µ ±  2%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                             847.6µ ±  0%
ImagePackageCatalogers/sbom-cataloger-2                                     118.0µ ±  1%
geomean                                                                     577.6µ

                                                          │ ./.tmp/benchmark-b0a90fb.txt │
                                                          │             B/op             │
ImagePackageCatalogers/alpmdb-cataloger-2                                   5.129Mi ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                                    175.3Ki ± 0%
ImagePackageCatalogers/binary-cataloger-2                                   32.08Ki ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                                   169.1Ki ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2                              404.2Ki ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2                         10.06Ki ± 0%
ImagePackageCatalogers/java-cataloger-2                                     2.829Mi ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                     8.750Ki ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2                       101.1Ki ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                                49.15Ki ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2                   186.5Ki ± 0%
ImagePackageCatalogers/portage-cataloger-2                                  120.0Ki ± 0%
ImagePackageCatalogers/python-package-cataloger-2                           1.005Mi ± 0%
ImagePackageCatalogers/r-package-cataloger-2                                53.38Ki ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                                   181.2Ki ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                             144.4Ki ± 0%
ImagePackageCatalogers/sbom-cataloger-2                                     14.21Ki ± 0%
geomean                                                                     132.3Ki

                                                          │ ./.tmp/benchmark-b0a90fb.txt │
                                                          │          allocs/op           │
ImagePackageCatalogers/alpmdb-cataloger-2                                    87.75k ± 0%
ImagePackageCatalogers/apkdb-cataloger-2                                     4.087k ± 0%
ImagePackageCatalogers/binary-cataloger-2                                     896.0 ± 0%
ImagePackageCatalogers/dpkgdb-cataloger-2                                    3.000k ± 0%
ImagePackageCatalogers/dotnet-deps-cataloger-2                               6.338k ± 0%
ImagePackageCatalogers/go-module-binary-cataloger-2                           281.0 ± 0%
ImagePackageCatalogers/java-cataloger-2                                      39.81k ± 0%
ImagePackageCatalogers/graalvm-native-image-cataloger-2                       228.0 ± 0%
ImagePackageCatalogers/javascript-package-cataloger-2                        1.405k ± 0%
ImagePackageCatalogers/nix-store-cataloger-2                                  895.0 ± 0%
ImagePackageCatalogers/php-composer-installed-cataloger-2                    4.079k ± 0%
ImagePackageCatalogers/portage-cataloger-2                                   2.267k ± 0%
ImagePackageCatalogers/python-package-cataloger-2                            16.43k ± 0%
ImagePackageCatalogers/r-package-cataloger-2                                  928.0 ± 0%
ImagePackageCatalogers/rpm-db-cataloger-2                                    3.989k ± 0%
ImagePackageCatalogers/ruby-gemspec-cataloger-2                              2.447k ± 0%
ImagePackageCatalogers/sbom-cataloger-2                                       394.0 ± 0%
geomean                                                                      2.590k

spiffcs added 4 commits April 19, 2023 13:23
* main:
  Add sections of interest for Gemfile.lock cataloger (#1749)
  fix: update cache.fingerprint file to java-builds dir (#1748)
  Add ALPM Metadata to CYCLONEDX and SPDX output formats (#1747)
  chore: bump stereoscope to latest version (#1741)
  chore(deps): update bootstrap tools to latest versions (#1744)
  chore(deps): bump github.com/docker/docker (#1746)
  Create consul binary classifier (#1738)
  chore(deps): update bootstrap tools to latest versions (#1740)

@spiffcs spiffcs force-pushed the 1577-license-revamp branch from b7e1847 to a0190d2 Compare May 12, 2023 19:08
spiffcs added 5 commits May 15, 2023 13:21
* main:
  fix: cyclonedx depends-on relationship inverted (#1816)
  fix: retain sbom cataloger relationships (#1509)

@wagoodman wagoodman left a comment


fantastic addition! 🙌

@spiffcs spiffcs merged commit 42fa9e4 into main May 15, 2023
@spiffcs spiffcs deleted the 1577-license-revamp branch May 15, 2023 20:23
spiffcs added a commit that referenced this pull request May 18, 2023
* main: (32 commits)
  chore(deps): bump github.com/google/go-containerregistry (#1823)
  chore(deps): bump github.com/sirupsen/logrus from 1.9.0 to 1.9.1 (#1822)
  chore(deps): bump github.com/docker/docker (#1824)
  fix: update field plurality of 8.0.0 schema before release (#1820)
  fix: update cataloger to check for expressions before split (#1819)
  feat: update syft license concept to complex struct (#1743)
  fix: cyclonedx depends-on relationship inverted (#1816)
  fix: retain sbom cataloger relationships (#1509)
  feat: warn if parsing newer SBOM (#1810)
  feat: Add R cataloger (#1790)
  update cosign to v2 release (different go module) (#1805)
  fix: Reduce log spam on unknown relationship type (#1797)
  chore(deps): update bootstrap tools to latest versions (#1807)
  chore(deps): bump golang.org/x/net from 0.9.0 to 0.10.0 (#1802)
  chore(deps): bump github.com/docker/docker (#1795)
  chore(deps): bump github.com/google/go-containerregistry (#1796)
  chore(deps): update bootstrap tools to latest versions (#1792)
  Print package list when extra packages found (#1791)
  chore(deps): update bootstrap tools to latest versions (#1786)
  chore(deps): bump golang.org/x/term from 0.7.0 to 0.8.0 (#1787)
  ...

GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
This PR makes the following changes to update the underlying license model to have more expressive capabilities.
It also provides some guarantees surrounding the license values themselves.

- Licenses are updated from string -> pkg.LicenseSet, which contains pkg.License values with the following fields:
- original `Value` read by syft
- If it is possible to construct one, a license will always have a valid SPDX expression for downstream consumption
- the above is run against a generated list of SPDX license IDs to try to find the correct ID
- SPDX concluded vs declared is added to the new struct
- URL source for license is added to the new struct
- Location source is added to the new struct to show where the expression was pulled from
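
The lookup against a generated list of SPDX license IDs, mentioned in the commit message above, might work along these lines. This is only a sketch: the `spdxIDs` table is a tiny hand-written stand-in for the generated list, and `lookupSPDXID` with its case-insensitive, whitespace-trimming match is an assumed strategy, not necessarily what syft does.

```go
package main

import (
	"fmt"
	"strings"
)

// spdxIDs maps a lowercased key to the canonical SPDX ID.
// Hand-written stand-in for the generated list (assumption).
var spdxIDs = map[string]string{
	"mit":          "MIT",
	"apache-2.0":   "Apache-2.0",
	"bsd-3-clause": "BSD-3-Clause",
}

// lookupSPDXID tries to resolve a raw license string to a canonical ID.
// The boolean reports whether a match was found.
func lookupSPDXID(raw string) (string, bool) {
	id, ok := spdxIDs[strings.ToLower(strings.TrimSpace(raw))]
	return id, ok
}

func main() {
	for _, raw := range []string{" mit ", "APACHE-2.0", "Custom EULA"} {
		if id, ok := lookupSPDXID(raw); ok {
			fmt.Printf("%q resolves to %s\n", raw, id)
		} else {
			fmt.Printf("%q has no SPDX ID; only the original value is kept\n", raw)
		}
	}
}
```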
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support describing license properties and SPDX expression assertions
2 participants