
Commit e080f83

Merge pull request #2825 from nexB/2098-top-level-packages
Add Package Instances #2691

This PR adds the PackageInstance class and functions to group package manifests and package data as top level package instances. Existing package data are ported to this new approach.

Reference: #2098
Reference: #2691
Reference: #2692
Reference: #2693
Reference: #2843
Reference: #2652
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
2 parents 1dc4b61 + 376abc6 commit e080f83

1,155 files changed: +47981 -34234 lines changed


CHANGELOG.rst

Lines changed: 39 additions & 14 deletions
@@ -14,7 +14,7 @@ Important API changes:
   instead under the ``venv`` subdirectory.

 - Main package API function `get_package_infos` is now deprecated, and is
-  replaced by `get_package_manifests`.
+  replaced by `get_package_data`.

 - The data structure of the JSON output has changed for copyrights, authors
   and holders: we now use proper name for attributes and not a generic "value".
@@ -27,10 +27,18 @@ Important API changes:
   as an option.

 - The data structure of the JSON output has changed for packages: we now
-  return "package_manifests" package information at the manifest file-level
-  rather than "packages". There is a a new top-level "packages" attribute
-  that contains each package instance that can be aggregating data from
-  multiple manifests for a single package instance.
+  return "package_data" package information at the manifest file-level
+  rather than "packages". This has all the data attributes of a "package_data"
+  field plus others: "package_uuid", "package_data_files" and "files".
+
+- There is a a new top-level "packages" attribute that contains package
+  instances that can be aggregating data from multiple manifests.
+
+- There is a a new top-level "dependencies" attribute that contains each dependency
+  instance, these can be standalone or releated to a package.
+
+- There is a new resource-level attribute "for_packages" which refers to packages
+  through package_uuids (pURL + uuid string).

 - The data structure for HTML output has been changed to include emails and
   urls under the "infos" object. Now HTML template will output holders,
@@ -136,17 +144,31 @@ Package detection:
 - Yocto/BitBake .bb recipes.

 - Major changes in packages detection and reporting, codebase-level attribute `packages`
-  with one or more "package_manifests" and files for the packages are reported.
+  with one or more `package_data` and files for the packages are reported.
   The specific changes made are:

-  - The resource level attribute `packages` has been renamed to `package_manifests`,
-    as these are really package manifests that are being detected.
+  - The resource level attribute `packages` has been renamed to `package_data`,
+    as these are really package data that are being detected, and can be manifests,
+    lockfiles or other package data. This has all the data attributes of a `package_data`
+    field plus others: `package_uuid`, `package_data_files` and `files`.
+

   - A new top-level attribute `packages` has been added which contains package
-    instances created from package_manifests detected in the codebase.
+    instances created from `package_data` detected in the codebase.
+
+  - A new codebase level attribute `dependencies` has been added which contains dependency
+    instances created from lockfiles detected in the codebase.

-  - A new codebase level attribute `packages` has been added which contains package
-    instances created from package_manifests detected in the codebase.
+  - The package attribute `root_path` has been deleted from `package_data` in favour
+    of the new format where there is no root conceptually, just a list of files for each
+    package.
+
+  - There is a new resource-level attribute `for_packages` which refers to packages
+    through package_uuids (pURL + uuid string).
+
+  - The package_data attribute `dependencies` (which is a list of DependentPackages),
+    now has a new attribute `resolved_package` having a package data mapping.
+    Also the `requirement` attribute here is renamed to `extracted_requirement`.


 Outputs:
@@ -159,16 +181,19 @@ Outputs:
 Output version
 --------------

-Scancode Data Output Version is now 2.0.0.
+Scancode Data Output Version is now 3.0.0.

 Changes:

-- rename resource level attribute `packages` to `package_manifests`.
+- rename resource level attribute `packages` to `package_data`.
 - add top-level attribute `packages`.
-
+- add top-level attribute `dependencies`.
+- add resource-level attribute `for_packages`.
+- remove `package-data` attribute `root_path`.

 Documentation Update
 ~~~~~~~~~~~~~~~~~~~~~~~~
+
 - Various documentations have been updated to reflects API changes and
   correct minor documentation issues.
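
Note: the changelog entries above describe the new output layout only in prose. The sketch below shows how the pieces might fit together. Only the attribute names (`packages`, `dependencies`, `package_data`, `for_packages`, `package_uuid`) come from the changelog. The sample values, the exact `package_uuid` string format, and the helper `files_by_package` are hypothetical and are not part of this commit.

```python
from collections import defaultdict

# Hypothetical, heavily trimmed scan result. Only the attribute names
# come from the changelog; every value here is made up for illustration.
scan = {
    "packages": [
        {
            "type": "bower",
            "name": "example",
            "package_uuid": "pkg:bower/example@1.0.0?uuid=0000",
        },
    ],
    "dependencies": [],
    "files": [
        {
            "path": "example/bower.json",
            "package_data": [{"type": "bower", "name": "example"}],
            "for_packages": ["pkg:bower/example@1.0.0?uuid=0000"],
        },
        {
            "path": "example/index.js",
            "package_data": [],
            "for_packages": ["pkg:bower/example@1.0.0?uuid=0000"],
        },
        {"path": "README", "package_data": [], "for_packages": []},
    ],
}


def files_by_package(scan_result):
    """Group resource paths by the package_uuid values in ``for_packages``."""
    grouped = defaultdict(list)
    for resource in scan_result.get("files", []):
        for package_uuid in resource.get("for_packages", []):
            grouped[package_uuid].append(resource["path"])
    return dict(grouped)


grouped_paths = files_by_package(scan)
```

In this reading, a resource no longer carries a package root. It points at zero or more top-level package instances through `for_packages`, which matches the changelog's description of "just a list of files for each package".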

src/formattedcode/output_csv.py

Lines changed: 3 additions & 3 deletions
@@ -194,7 +194,7 @@ def collect_keys(mapping, key_group):
             collect_keys(url_info, 'url')
             yield url_info

-        for package in scanned_file.get('package_manifests', []):
+        for package in scanned_file.get('package_data', []):
             flat = flatten_package(package, path)
             collect_keys(flat, 'package')
             yield flat
@@ -229,7 +229,7 @@ def get_package_columns(_columns=set()):
     if _columns:
         return _columns

-    from packagedcode.models import Package
+    from packagedcode.models import PackageData

     # exclude some columns for now that contain list of items
     excluded_columns = {
@@ -252,7 +252,7 @@ def get_package_columns(_columns=set()):
         'notice_url',
     ]

-    fields = Package.fields() + extra_columns
+    fields = PackageData.fields() + extra_columns
     _columns = set(f for f in fields if f not in excluded_columns)
     return _columns

src/formattedcode/output_cyclonedx.py

Lines changed: 1 addition & 1 deletion
@@ -268,7 +268,7 @@ def from_package(cls, package):
             properties.append(
                 CycloneDxProperty(
                     name='WARNING',
-                    value=f'WARNING: component skipped in CycloneDX output: {self!r}'
+                    value=f'WARNING: component skipped in CycloneDX output: {package!r}'
                 )
             )
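
Note: the hunk header shows this line sits inside `from_package(cls, package)`, which takes `cls` rather than `self`. Assuming it is a classmethod (the decorator is not visible in the hunk), the old f-string would raise `NameError` as soon as the warning branch ran. A minimal sketch of that failure mode follows. The `Component` class and the `raises_name_error` helper are hypothetical stand-ins, not the project's code.

```python
class Component:
    """Hypothetical stand-in for the CycloneDX component class."""

    @classmethod
    def warning_before(cls, package):
        # ``self`` is not bound in a classmethod, so evaluating this
        # f-string raises NameError at call time.
        return f'WARNING: component skipped in CycloneDX output: {self!r}'

    @classmethod
    def warning_after(cls, package):
        # The fix formats the ``package`` argument, which is in scope.
        return f'WARNING: component skipped in CycloneDX output: {package!r}'


def raises_name_error(func, *args):
    """Return True if calling ``func(*args)`` raises NameError."""
    try:
        func(*args)
    except NameError:
        return True
    return False


before_fails = raises_name_error(Component.warning_before, 'pkg')
after_message = Component.warning_after('pkg')
```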

src/formattedcode/output_html.py

Lines changed: 2 additions & 2 deletions
@@ -154,7 +154,7 @@ def generate_output(results, version, template):

     LICENSES = 'licenses'
     COPYRIGHTS = 'copyrights'
-    PACKAGES = 'package_manifests'
+    PACKAGES = 'package_data'

     # Create a flattened data dict keyed by path
     for scanned_file in results:
@@ -207,7 +207,7 @@ def generate_output(results, version, template):
     files = {
         'license_copyright': converted,
         'infos': converted_infos,
-        'package_manifests': converted_packages
+        'package_data': converted_packages
     }

     return template.generate(files=files, licenses=licenses, version=version)

src/formattedcode/templates/html/template.html

Lines changed: 2 additions & 2 deletions
@@ -233,7 +233,7 @@
 </table>
 {% endif %}

-{% if files.package_manifests %}
+{% if files.package_data %}
 <table>
 <caption>Package Information</caption>
 <thead>
@@ -245,7 +245,7 @@
 </tr>
 </thead>
 <tbody>
-{% for path, data in files.package_manifests.items() %}
+{% for path, data in files.package_data.items() %}
 {% for row in data %}
 <tr>
 <td>{{ path }}</td>

src/packagedcode/__init__.py

Lines changed: 39 additions & 26 deletions
@@ -18,6 +18,7 @@
 from packagedcode import debian
 from packagedcode import conda
 from packagedcode import cocoapods
+from packagedcode import cran
 from packagedcode import freebsd
 from packagedcode import golang
 from packagedcode import haxe
@@ -40,15 +41,15 @@

 # Note: the order matters: from the most to the least specific
 # Package classes MUST be added to this list to be active
-PACKAGE_MANIFEST_TYPES = [
+PACKAGE_DATA_CLASSES = [
     rpm.RpmManifest,
     debian.DebianPackage,

     models.JavaJar,
     jar_manifest.JavaManifest,
     models.JavaEar,
     models.JavaWar,
-    maven.MavenPomPackage,
+    maven.PomXml,
     jar_manifest.IvyJar,
     models.JBossSar,
     models.Axis2Mar,
@@ -102,38 +103,50 @@
     build.BuckPackage,
     build.AutotoolsPackage,
     conda.Condayml,
-    win_pe.WindowsExecutableManifest,
+    win_pe.WindowsExecutable,
     readme.ReadmeManifest,
     build.MetadataBzl,
     msi.MsiInstallerPackage,
     windows.MicrosoftUpdateManifest,
     pubspec.PubspecYaml,
     pubspec.PubspecLock,
-    build_gradle.BuildGradle,
+    cran.DescriptionFile,
+    build_gradle.BuildGradle
 ]

-PACKAGE_MANIFESTS_BY_TYPE = {
-    (
-        cls.package_manifest_type
-        if isinstance(cls, models.PackageManifest)
-        else cls.default_type
-    ): cls
-    for cls in PACKAGE_MANIFEST_TYPES
+
+PACKAGE_INSTANCE_CLASSES = [
+    rpm.RpmPackage,
+    maven.MavenPackage,
+    npm.NpmPackage,
+    phpcomposer.PhpPackage,
+    haxe.HaxePackage,
+    cargo.RustPackage,
+    cocoapods.CocoapodsPackage,
+    opam.OpamPackage,
+    bower.BowerPackage,
+    freebsd.FreebsdPackage,
+    rubygems.RubyPackage,
+    pypi.PythonPackage,
+    golang.GoPackage,
+    nuget.NugetPackage,
+    chef.ChefPackage,
+    win_pe.WindowsPackage,
+    pubspec.PubspecPackage,
+    cran.CranPackage
+]
+
+
+PACKAGE_DATA_BY_TYPE = {
+    cls.default_type: cls
+    for cls in PACKAGE_DATA_CLASSES
+}
+
+
+PACKAGE_INSTANCES_BY_TYPE = {
+    cls.default_type: cls
+    for cls in PACKAGE_INSTANCE_CLASSES
 }
-# We cannot have two package classes with the same type
-if len(PACKAGE_MANIFESTS_BY_TYPE) != len(PACKAGE_MANIFEST_TYPES):
-    seen_types = {}
-    for pmt in PACKAGE_MANIFEST_TYPES:
-        manifest = pmt()
-        assert manifest.package_manifest_type
-        seen = seen_types.get(manifest.package_manifest_type)
-        if seen:
-            msg = ('Invalid duplicated packagedcode.Package types: '
-                   '"{}:{}" and "{}:{}" have the same type.'
-                   .format(manifest.package_manifest_type, manifest.__name__, seen.package_manifest_type, seen.__name__,))
-            raise Exception(msg)
-        else:
-            seen_types[manifest.package_manifest_type] = manifest


 def get_package_class(scan_data, default=models.Package):
@@ -159,7 +172,7 @@ def get_package_class(scan_data, default=models.Package):
     if not ptype:
         # basic type for default package types
         return default
-    ptype_class = PACKAGE_MANIFESTS_BY_TYPE.get(ptype)
+    ptype_class = PACKAGE_DATA_BY_TYPE.get(ptype)
     return ptype_class or default

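
Note: this hunk builds both `*_BY_TYPE` mappings with a plain dict comprehension keyed on `default_type` and drops the old duplicate-type check. A dict comprehension keeps only the last class for a repeated key, so any data classes that share a `default_type` would collapse to one entry without an error. The sketch below illustrates that behaviour and one possible guard. The class names and the `index_by_type` helper are hypothetical and are not part of this commit.

```python
class RpmLike:
    default_type = 'rpm'


class BowerLike:
    default_type = 'bower'


class BowerLockLike:
    # Deliberately shares a type with BowerLike to show the collision.
    default_type = 'bower'


def index_by_type(classes):
    """Map default_type to class, raising on a duplicated type."""
    by_type = {}
    for cls in classes:
        seen = by_type.get(cls.default_type)
        if seen is not None:
            raise ValueError(
                f'Duplicated package type {cls.default_type!r}: '
                f'{seen.__name__} and {cls.__name__}'
            )
        by_type[cls.default_type] = cls
    return by_type


# A plain comprehension silently keeps the last class for each type.
collapsed = {
    cls.default_type: cls
    for cls in [RpmLike, BowerLike, BowerLockLike]
}

# The guarded version accepts unique types and rejects duplicates.
unique = index_by_type([RpmLike, BowerLike])
try:
    index_by_type([RpmLike, BowerLike, BowerLockLike])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```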

src/packagedcode/about.py

Lines changed: 3 additions & 3 deletions
@@ -29,7 +29,7 @@
 # TODO: Override get_package_resource so it returns the Resource that the ABOUT file is describing

 @attr.s()
-class AboutPackage(models.Package):
+class AboutPackageData(models.PackageData):

     default_type = 'about'

@@ -44,13 +44,13 @@ def get_package_root(self, manifest_resource, codebase):


 @attr.s()
-class Aboutfile(AboutPackage, models.PackageManifest):
+class Aboutfile(AboutPackageData, models.PackageDataFile):

     file_patterns = ('*.ABOUT',)
     extensions = ('.ABOUT',)

     @classmethod
-    def is_manifest(cls, location):
+    def is_package_data_file(cls, location):
         """
         Return True if the file at ``location`` is likely a manifest of this type.
         """

src/packagedcode/alpine.py

Lines changed: 3 additions & 3 deletions
@@ -27,7 +27,7 @@


 @attr.s()
-class AlpinePackage(models.Package, models.PackageManifest):
+class AlpinePackage(models.PackageData, models.PackageDataFile):
     extensions = ('.apk', 'APKBUILD')
     default_type = 'alpine'

@@ -40,7 +40,7 @@ def compute_normalized_license(self):
         return detected

     def to_dict(self, _detailed=False, **kwargs):
-        data = models.Package.to_dict(self, **kwargs)
+        data = super().to_dict(**kwargs)
         if _detailed:
             #################################################
             data['installed_files'] = [istf.to_dict() for istf in (self.installed_files or [])]
@@ -891,7 +891,7 @@ def D_dependencies_handler(value, dependencies=None, **kwargs):
         dependency = models.DependentPackage(
             purl=purl,
             scope=scope,
-            requirement=requirement,
+            extracted_requirement=requirement,
             is_resolved=is_resolved,
         )
         if dependency not in dependencies:

src/packagedcode/bower.py

Lines changed: 19 additions & 5 deletions
@@ -30,7 +30,7 @@


 @attr.s()
-class BowerPackage(models.Package):
+class BowerPackageData(models.PackageData):

     default_type = 'bower'

@@ -43,13 +43,13 @@ def compute_normalized_license(self):


 @attr.s()
-class BowerJson(BowerPackage, models.PackageManifest):
+class BowerJson(BowerPackageData, models.PackageDataFile):

     file_patterns = ('bower.json', '.bower.json')
     extensions = ('.json',)

     @classmethod
-    def is_manifest(cls, location):
+    def is_package_data_file(cls, location):
         """
         Return True if the file at ``location`` is likely a manifest of this type.
         """
@@ -114,7 +114,7 @@ def recognize(cls, location):
                 models.DependentPackage(
                     purl=PackageURL(type='bower', name=dep_name).to_string(),
                     scope='dependencies',
-                    requirement=requirement,
+                    extracted_requirement=requirement,
                     is_runtime=True,
                     is_optional=False,
                 )
@@ -126,7 +126,7 @@ def recognize(cls, location):
                 models.DependentPackage(
                     purl=PackageURL(type='bower', name=dep_name).to_string(),
                     scope='devDependencies',
-                    requirement=requirement,
+                    extracted_requirement=requirement,
                     is_runtime=False,
                     is_optional=True,
                 )
@@ -145,6 +145,20 @@ def recognize(cls, location):
         )


+@attr.s()
+class BowerPackage(BowerPackageData, models.Package):
+    """
+    A Bower Package that is created out of one/multiple bower package
+    manifests and package-like data, with it's files.
+    """
+
+    @property
+    def manifests(self):
+        return [
+            BowerJson
+        ]
+
+
 def compute_normalized_license(declared_license):
     """
     Return a normalized license expression string detected from a list of
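
Note: the bower.py hunks show the three-layer pattern this commit applies per ecosystem: a `...PackageData` base, a file-level class mixing in `PackageDataFile`, and a top-level `...Package` that lists its manifest classes. The sketch below mirrors that layering with plain classes. The `PackageData`, `PackageDataFile` and `Package` stand-ins are hypothetical simplifications of `packagedcode.models`, and the filename matching in `is_package_data_file` is an assumption, since its real body is not shown in the diff.

```python
import fnmatch
import posixpath


class PackageData:
    """Stand-in for models.PackageData: data parsed from one file."""
    default_type = None

    def __init__(self, name=None, version=None):
        self.name = name
        self.version = version


class PackageDataFile:
    """Stand-in for models.PackageDataFile: recognizes a kind of file."""
    file_patterns = ()

    @classmethod
    def is_package_data_file(cls, location):
        # Assumption: match on the file name only, using file_patterns.
        filename = posixpath.basename(location)
        return any(
            fnmatch.fnmatchcase(filename, pattern)
            for pattern in cls.file_patterns
        )


class Package:
    """Stand-in for models.Package: a top-level package instance."""

    @property
    def manifests(self):
        return []


class BowerPackageData(PackageData):
    default_type = 'bower'


class BowerJson(BowerPackageData, PackageDataFile):
    file_patterns = ('bower.json', '.bower.json')


class BowerPackage(BowerPackageData, Package):
    @property
    def manifests(self):
        # The file-level classes this package instance is built from.
        return [BowerJson]


package = BowerPackage(name='example', version='1.0.0')
recognized = [
    manifest_class.is_package_data_file('project/bower.json')
    for manifest_class in package.manifests
]
```

The point of the split is that `BowerJson` describes one file, while `BowerPackage` can aggregate several such files into a single instance, as the commit message describes.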
