Commit 2c9b4ab

Merge pull request #2909 from nexB/syspacfiles
Add system packages support in the new packages model
2 parents a1f3c12 + 8e07301 commit 2c9b4ab

1,264 files changed

+108563
-42924
lines changed

docs/source/contribute/contrib_dev.rst

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,30 @@ These are enabled by adding a ``--test-suite`` option to the pytest command.
109109
extensive data-driven and data validations (for package, copyright and license
110110
detection)
111111

112+
In some cases we need to regenerate test data when the expected behaviour or result data
113+
structures change, and we have an environment variable to regenerate test data.
114+
`SCANCODE_REGEN_TEST_FIXTURES` is present in `scancode_config` and this can be
115+
set to regenerate test data for specific tests like this:
112116

117+
``SCANCODE_REGEN_TEST_FIXTURES=yes pytest -vvs tests/packagedcode/test_package_models.py``
118+
119+
This command regenerates test data only for the tests in `test_package_models.py`,
120+
and we can further narrow down the tests to regenerate by using more pytest options like `--lf` and
121+
`-k test_instances`.
122+
123+
If test data is regenerated, it is important to review the diff for test files and
124+
carefully go through all of it to make sure there are no unintended changes there,
125+
and then commit all the regenerated test data.
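
The regeneration workflow described above can be sketched in Python. This is a minimal illustration only, not the toolkit's actual test helper: the function name `check_json_result` and the way the flag is read from the environment are assumptions; only the `SCANCODE_REGEN_TEST_FIXTURES` variable name comes from the documentation text.

```python
import json
import os

# Assumption: the flag is read straight from the environment here; the real
# lookup in scancode_config may work differently.
REGEN = os.environ.get("SCANCODE_REGEN_TEST_FIXTURES", "").lower() in ("yes", "true", "1")


def check_json_result(result, expected_file, regen=REGEN):
    """
    Compare ``result`` with the JSON stored in ``expected_file``.
    When ``regen`` is true, first overwrite the expected file with ``result``
    so that the comparison passes and the new fixture can be reviewed in a diff.
    (Hypothetical helper, for illustration only.)
    """
    if regen:
        with open(expected_file, "w") as out:
            json.dump(result, out, indent=2)
    with open(expected_file) as inp:
        expected = json.load(inp)
    assert result == expected
```

With a helper of this shape, running the tests with the variable set rewrites the expected files, which is why reviewing the resulting diff before committing matters.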
126+
127+
To help debug in scancode, we use logging. There are different environment variables
128+
you need to set to turn on logging. In packagedcode:
129+
130+
``SCANCODE_DEBUG_PACKAGE=yes pytest -vvs tests/packagedcode/ --lf``
131+
132+
Or set the ``TRACE`` variable to ``True``. This enables the ``logger_debug`` functions,
133+
which log variables and show code execution paths by printing the logs
134+
in the terminal. If debugging full scans run by click, you have to raise exceptions
135+
in addition to setting ``TRACE`` to enable logging.
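
A ``TRACE``-gated ``logger_debug`` function could look roughly like the sketch below. The factory `make_logger_debug` is a hypothetical name introduced here so the two modes can be shown side by side; the actual modules may define ``TRACE`` and ``logger_debug`` differently.

```python
import logging
import sys


def make_logger_debug(trace, name=__name__):
    """
    Return a ``logger_debug`` callable.
    When ``trace`` is false the callable does nothing; when true it joins its
    arguments into one message, logs it at DEBUG level to stdout and returns it.
    (Hypothetical factory, for illustration only.)
    """
    if not trace:
        def logger_debug(*args):
            return None
        return logger_debug

    logger = logging.getLogger(name)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(logging.DEBUG)

    def logger_debug(*args):
        # Strings are logged as-is; any other value is logged with repr().
        message = " ".join(a if isinstance(a, str) else repr(a) for a in args)
        logger.debug(message)
        return message

    return logger_debug
```

Keeping the disabled variant as a no-op means tracing calls can stay in the code without a measurable cost when ``TRACE`` is off.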
113136

114137
.. _scancode_toolkit_development_thirdparty_libraries:
115138

setup-mini.cfg

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,8 @@ install_requires =
6969
chardet >= 3.0.0
7070
click >= 6.7, !=7.0
7171
colorama >= 0.3.9
72-
commoncode >= 30.0.0
72+
commoncode >= 30.1.1
73+
container-inspector >= 30.0.0
7374
debian-inspector >= 30.0.0
7475
dparse2 >= 0.6.0
7576
fasteners
@@ -175,14 +176,13 @@ scancode_scan =
175176
# module for details and doc.
176177
scancode_post_scan =
177178
summary = summarycode.summarizer:ScanSummary
178-
summary2 = summarycode.summarizer2:ScanSummary
179-
summary-keeping-details = summarycode.summarizer:ScanSummaryWithDetails
180-
summary-key-files = summarycode.summarizer:ScanKeyFilesSummary
181-
summary-by-facet = summarycode.summarizer:ScanByFacetSummary
179+
tallies = summarycode.tallies:Tallies
180+
tallies-with-details = summarycode.tallies:TalliesWithDetails
181+
tallies-key-files = summarycode.tallies:KeyFilesTallies
182+
tallies-by-facet = summarycode.tallies:FacetTallies
182183
license-clarity-score = summarycode.score:LicenseClarityScore
183184
license-policy = licensedcode.plugin_license_policy:LicensePolicy
184185
mark-source = scancode.plugin_mark_source:MarkSource
185-
classify-package = summarycode.classify:PackageTopAndKeyFilesTagger
186186
is-license-text = licensedcode.plugin_license_text:IsLicenseText
187187
filter-clues = cluecode.plugin_filter_clues:RedundantCluesFilter
188188
consolidate = summarycode.plugin_consolidate:Consolidator
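
To illustrate what the renamed plugin entries mean, here is a small sketch that parses entry point declarations of the `name = module:attribute` form with the standard library. The `[options.entry_points]` section header and the helper name `parse_entry_points` are assumptions made for this example; the snippet reuses only the four `tallies` lines added in this hunk.

```python
from configparser import ConfigParser

# Assumption: the entries live under [options.entry_points], the usual
# setuptools section; the hunk above does not show the section header.
SETUP_CFG_SNIPPET = """
[options.entry_points]
scancode_post_scan =
    tallies = summarycode.tallies:Tallies
    tallies-with-details = summarycode.tallies:TalliesWithDetails
    tallies-key-files = summarycode.tallies:KeyFilesTallies
    tallies-by-facet = summarycode.tallies:FacetTallies
"""


def parse_entry_points(cfg_text, group):
    """Return {plugin name: (module, attribute)} for one entry point group."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("options.entry_points", group)
    entries = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, target = line.partition("=")
        module, _, attribute = target.strip().partition(":")
        entries[name.strip()] = (module, attribute)
    return entries
```

For example, `parse_entry_points(SETUP_CFG_SNIPPET, "scancode_post_scan")["tallies"]` yields the module `summarycode.tallies` and the class name `Tallies`.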

setup.cfg

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -70,6 +70,7 @@ install_requires =
7070
click >= 6.7, !=7.0
7171
colorama >= 0.3.9
7272
commoncode >= 30.1.1
73+
container-inspector >= 30.0.0
7374
debian-inspector >= 30.0.0
7475
dparse2 >= 0.6.0
7576
fasteners
@@ -184,7 +185,6 @@ scancode_post_scan =
184185
license-clarity-score = summarycode.score:LicenseClarityScore
185186
license-policy = licensedcode.plugin_license_policy:LicensePolicy
186187
mark-source = scancode.plugin_mark_source:MarkSource
187-
classify-package = summarycode.classify:PackageTopAndKeyFilesTagger
188188
is-license-text = licensedcode.plugin_license_text:IsLicenseText
189189
filter-clues = cluecode.plugin_filter_clues:RedundantCluesFilter
190190
consolidate = summarycode.plugin_consolidate:Consolidator

src/formattedcode/output_csv.py

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
# See https://github.com/nexB/scancode-toolkit for support or download.
77
# See https://aboutcode.org for more information about nexB OSS projects.
88
#
9+
import attr
910
import csv
1011

1112
import saneyaml
@@ -231,6 +232,8 @@ def get_package_columns(_columns=set()):
231232

232233
from packagedcode.models import PackageData
233234

235+
package_data_fields = [field.name for field in attr.fields(PackageData)]
236+
234237
# exclude some columns for now that contain list of items
235238
excluded_columns = {
236239
# list of strings
@@ -252,7 +255,7 @@ def get_package_columns(_columns=set()):
252255
'notice_url',
253256
]
254257

255-
fields = PackageData.fields() + extra_columns
258+
fields = package_data_fields + extra_columns
256259
_columns = set(f for f in fields if f not in excluded_columns)
257260
return _columns
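
The switch from `PackageData.fields()` to `attr.fields(PackageData)` relies on the `attrs` library's introspection. A minimal sketch follows, assuming a trimmed-down stand-in class; the real `packagedcode.models.PackageData` has many more fields, and `get_columns` is a hypothetical simplification of `get_package_columns`.

```python
import attr


@attr.s
class PackageData:
    # Hypothetical, trimmed-down stand-in for packagedcode.models.PackageData.
    type = attr.ib(default=None)
    name = attr.ib(default=None)
    version = attr.ib(default=None)
    dependencies = attr.ib(default=attr.Factory(list))


def get_columns(extra_columns=(), excluded_columns=frozenset()):
    """Return the set of column names, minus the excluded ones."""
    # attr.fields() returns the class's Attribute objects in definition order.
    field_names = [field.name for field in attr.fields(PackageData)]
    fields = field_names + list(extra_columns)
    return set(f for f in fields if f not in excluded_columns)
```

Because the names are read from the class definition itself, the column list follows the model without a separate `fields()` classmethod to maintain.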
258261

src/packagedcode/README.rst

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -31,19 +31,24 @@ Taking Python as a main example a package can exist in multiple forms:
3131
file type with metadata such as Windows DLLs. Additional markers may also include
3232
"namespaces" such as Java or Python imports, C/C++ namespace declarations.
3333

34-
2. **parse and collect the package manifest(s)** metadata. For Python, this means
34+
2. **parse and collect the package datafile or manifest(s)** metadata. For Python, this means
3535
extracting name, version, authorship, declared licensing and declared dependencies as
3636
found in any of the package descriptor files (e.g. a `setup.py` file,
3737
`requirements` file(s) or any of the `*-dist-info` or `*-egg-info` dir files such as
38-
a `metadata.json`). Other package formats have their own metadata that may be more or
38+
a `metadata.json`). Other package datafile formats have their own metadata that may be more or
3939
less comprehensive in the breadth and depth of information they offer (e.g.
4040
`.nuspec`, `package.json`, `bower.json`, Godeps, etc...). These metadata include the
4141
declared dependencies (and in some cases the fully resolved dependencies too such as
4242
with Gemfile.lock). Finally, all the different packages formats and data are
4343
normalized and stored in a common data structure abstracting the small differences of
4444
naming and semantics that may exist between all the different package formats.
4545

46-
Once collected, these data are then injected in the `packages` section of the scan.
46+
Once collected, these data are then injected in the `package_data` section of a file scan
47+
for each recognized package datafile.
48+
49+
3. **assemble multiple package datafiles** as top level packages.
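
As a rough illustration of the assembly step, the sketch below groups per-file package data into top-level packages. This is not the toolkit's assembly logic: the grouping key `(type, name, version)`, the dictionary layout and the function name `assemble_packages` are all assumptions chosen to keep the example small.

```python
from collections import defaultdict


def assemble_packages(datafiles):
    """
    Group package data mappings that share (type, name, version) into one
    top-level package, recording the path of every contributing datafile.
    For other fields, the first non-None value seen is kept.
    (Hypothetical grouping rule, for illustration only.)
    """
    grouped = defaultdict(lambda: {"datafile_paths": []})
    for data in datafiles:
        key = (data["type"], data["name"], data.get("version"))
        package = grouped[key]
        package["datafile_paths"].append(data["path"])
        for field, value in data.items():
            if field != "path" and value is not None:
                package.setdefault(field, value)
    return list(grouped.values())
```

For instance, a `setup.py` and a `PKG-INFO` file that both describe the same name and version would end up as a single package listing two datafile paths.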
50+
51+
4752

4853
What code in `packagedcode` is not meant to do:
4954
