New plugin architecture #885

Merged
merged 122 commits into from
Feb 7, 2018
065d0a3
Support multiple options for pre-scan plugins #787
yashdsaraf Nov 16, 2017
6095fb1
Added MakeHuman Exception License
SaravananOffl Nov 14, 2017
4f0475c
formatted LICENSE file and removed rules
SaravananOffl Nov 14, 2017
5a304fd
Refactor post scan plugins so they can have command line options
Nov 23, 2017
f41bca6
Improve documentation of PostScanPlugin.process_results()
Nov 24, 2017
403b1b2
Rename get_click_options() to get_options()
Nov 24, 2017
5f4c626
Allow post-scan plugins to take options other than boolean flags
Nov 24, 2017
612cf68
Use a common BasePlugin class for all plugins #787
pombredanne Dec 10, 2017
b765432
Inline fileutils imports #787
pombredanne Jan 2, 2018
7f9b74c
Use proper import in prep for Python 3 #787
pombredanne Jan 2, 2018
cff89ad
Update help text test to match latest code #787
pombredanne Jan 2, 2018
75908e7
Use class method for get_plugin_options #787
pombredanne Jan 2, 2018
586d392
Use scans_cache_class not scan_cache_class #787
pombredanne Jan 2, 2018
66ec6dc
Extract function to get_cache_dir #787
pombredanne Jan 2, 2018
81780ea
Ensure all tests pass #787
pombredanne Jan 2, 2018
d9f8ec1
Remove junk print statements
pombredanne Jan 3, 2018
4f143b7
Streamline core CLI and plugins processing
pombredanne Jan 3, 2018
7ef1cba
Use resource_iter everywhere #787
pombredanne Jan 5, 2018
a6446c9
Do not use on-disk file log. Improve pre-scan plugins #787
pombredanne Jan 5, 2018
0c3625b
Add new extract_zip_raw test function
pombredanne Jan 5, 2018
9884c9f
Bump attrs and add typing library #787
pombredanne Jan 5, 2018
f8eabb3
Make info a regular scan and other CLI loop reorg #787
pombredanne Jan 6, 2018
40835fd
Remove diag arg from get_resources() #787
pombredanne Jan 6, 2018
3be4577
Add function to skip first or last iterable item #787
pombredanne Jan 10, 2018
b5e627c
Change interruptible returned values #787
pombredanne Jan 11, 2018
e9cd4e5
Ensure api function return a list #787
pombredanne Jan 11, 2018
59bef89
Use latest attrs and typing #787
pombredanne Jan 11, 2018
683ca26
Add file base_name to info scans #787
pombredanne Jan 12, 2018
0279764
Replace path_to_bytes/unicode by fsen/decode #787
pombredanne Jan 12, 2018
4ba3b77
Improve base plugins design #787
pombredanne Jan 12, 2018
9613d65
Refine scan utilities
pombredanne Jan 12, 2018
e891d07
Use fsencode/fsdecode throughout #787
pombredanne Jan 12, 2018
261fcb9
Improve Codebase and Resource implementation #787
pombredanne Jan 12, 2018
45cf069
Use a Codebase in pre-scan and post-scan plugins #787
pombredanne Jan 12, 2018
b1949e9
Resources are now topdown and sorted by name in outputs #787
pombredanne Jan 12, 2018
da79dc2
Improve plugins and codebase handling in cli #787
pombredanne Jan 12, 2018
024072c
Ensure SPDX RDF tests are always sorted #787
pombredanne Jan 12, 2018
c5014c7
Simplify Plugin init arguments
pombredanne Jan 12, 2018
1095bf3
Update test data #787
pombredanne Jan 16, 2018
60dc3fd
Update test to use new format options #787
pombredanne Jan 16, 2018
4d0d7e3
Update functions doc #787
pombredanne Jan 16, 2018
7428299
Add new timed function decorator to time execution #787
pombredanne Jan 16, 2018
6f1ddd2
Improve handling of re._MAXCACHE
pombredanne Jan 17, 2018
c6a0317
Restore correct SPDX output tests results #787
pombredanne Jan 17, 2018
f5c7a97
Add new plugin system for output #787 #789
pombredanne Jan 17, 2018
b160cd7
Add new output_filter stage and plugins #787
pombredanne Jan 17, 2018
a19d625
Add new housekeeping stage and plugins #787
pombredanne Jan 17, 2018
861e9a1
Add new scan stage and plugins #787 #552 #698
pombredanne Jan 17, 2018
5b5c85c
Add new output_filter stage and plugins #787
pombredanne Jan 17, 2018
c7ec297
Update pre and post-scan plugins to new architecture
pombredanne Jan 17, 2018
60bdb24
New plugin and CLi architecture #787 #552
pombredanne Jan 17, 2018
72bfdbd
Fix typo in configure message
pombredanne Jan 18, 2018
f66f13d
Update release script output options #787
pombredanne Jan 18, 2018
d69e959
Improve CLI options handling #787
pombredanne Jan 18, 2018
43cd8cb
Rename Pluggy impl spec from output to output_impl #787
pombredanne Jan 19, 2018
d7fb3ed
Fix spacing #787
pombredanne Jan 19, 2018
04159b9
Catch errors in the execution of scan stages #787
pombredanne Jan 19, 2018
a1c7053
Refine comment on temp file usage
pombredanne Jan 19, 2018
ab4a3a0
Remove housekeeping stage which is not needed #787
pombredanne Jan 19, 2018
66047ac
Implement new cache and temp_dir #685 #357
pombredanne Jan 23, 2018
2d4aa6b
Cosmetic
pombredanne Jan 23, 2018
73cea2a
Make license reindex use default cache for now #685 #357
pombredanne Jan 23, 2018
6285f0e
Add test for compute_counts
pombredanne Jan 23, 2018
533ff36
Use correct location for version file #685 #357
pombredanne Jan 23, 2018
43ba290
Use proper plugin name for only-findings
pombredanne Jan 23, 2018
f2fe20f
Use proper name for setup stage #787
pombredanne Jan 23, 2018
cefb0f6
Use proper name for setup stage #787
pombredanne Jan 23, 2018
92de6c6
Simplify only-findings codebase.walk #685 #357
pombredanne Jan 23, 2018
0b30649
Correct codebase.compute_counts #685 #357
pombredanne Jan 23, 2018
fae1e76
Do not reindex licenses during configure #685 #357
pombredanne Jan 24, 2018
01a87b6
Prefix all temp dirs with "scancode-" configure #685 #357
pombredanne Jan 24, 2018
45a080a
Use cache_dir and SCANCODE_DEV_MODE correctly #685 #357
pombredanne Jan 24, 2018
2c94a70
Add setup() to plugin_license #685 #357
pombredanne Jan 24, 2018
aeef8f9
Remove SCANCODE_DEBUG_LICENSE env var. Not used
pombredanne Jan 24, 2018
cdd4179
Use cache_dir and SCANCODE_DEV_MODE correctly #685 #357
pombredanne Jan 24, 2018
d00b75d
Improve license matching thresholds #889
pombredanne Jan 24, 2018
b34989c
Fix scan logging #787
pombredanne Jan 24, 2018
622c5a0
Compute stage and scan timing correctly #787
pombredanne Jan 24, 2018
69d2bc6
Ensure that the new --timing CLI options works #787
pombredanne Jan 24, 2018
ac28e69
Use posix paths for tests on all OSes #787
pombredanne Jan 24, 2018
9230d81
Make scancode_config._create_dir work on Win #685 #357
pombredanne Jan 24, 2018
017bc81
Add extra timeout for failing windows tests #685 #357
pombredanne Jan 24, 2018
0efe66c
Add extra timeout for failing windows tests #685 #357
pombredanne Jan 24, 2018
911e8cb
Add extra debug infor for failing windows tests #685 #357
pombredanne Jan 24, 2018
2efcd99
Ensure that to_dict works with not-set values #787
pombredanne Jan 25, 2018
ca2621f
Use less deep path on Windows
pombredanne Jan 25, 2018
8d77d0a
Shorten cpoyright test file names for Windows #787
pombredanne Jan 25, 2018
3639ac9
Correct test failures #787 #685 #357
pombredanne Jan 26, 2018
1d8a43d
Make multiprocessing working on Windows #787 #685 #357
pombredanne Jan 26, 2018
e7aabf1
Use NOQA tags consistently. Cleanup up dead code
pombredanne Jan 26, 2018
481f9e1
Improve CLI help text
pombredanne Jan 26, 2018
5c6eae5
Improve scan test run calls. Cleanup and format code
pombredanne Jan 26, 2018
e8d0df7
Rename test files to work on Windows
pombredanne Jan 26, 2018
9981c63
Do not use expectedFailure on XPASS tests
pombredanne Jan 26, 2018
352507b
Remove Windows-specific test timeouts #787
pombredanne Jan 26, 2018
228143a
Bump help for Windows Python version
pombredanne Jan 26, 2018
93ee491
Reset expectedFailure for Windows tests
pombredanne Jan 26, 2018
21f86a6
Add debug print for Windows test failure
pombredanne Jan 26, 2018
2938852
Avoid using multiple processes in tests #787
pombredanne Jan 26, 2018
715e39d
Update Windows test expectations for failures
pombredanne Jan 26, 2018
cf278a8
Fix typoe in docstring
pombredanne Jan 26, 2018
f66a1fd
Ensure some failing tests run verbosely
pombredanne Jan 26, 2018
74797bc
Fix Windows unicode path test expectations
pombredanne Jan 26, 2018
1a4284b
Fix tracing output
pombredanne Jan 26, 2018
0ad11fe
Re-enable license cache warmup
pombredanne Jan 26, 2018
403e80c
Remove unused plugin test mode code
pombredanne Jan 26, 2018
877c03e
Correct bug in display of non-ascii progressbar
pombredanne Jan 26, 2018
cec427d
Allow scan attributes to be direct Resource attributes #787
pombredanne Feb 1, 2018
29d289b
Correct formatting of CSV outputs
pombredanne Feb 1, 2018
01c635f
Do not regen in tests
pombredanne Feb 1, 2018
cfa3f56
Correct API tests
pombredanne Feb 1, 2018
37fc25d
Fix comment grammar
pombredanne Feb 1, 2018
dc1c997
Add new simple splitext function working from a name
pombredanne Feb 1, 2018
fc2176a
Correct splitext_name function name in doctests
pombredanne Feb 2, 2018
8b4e46d
Remove obsolete TODO
pombredanne Feb 5, 2018
b8faffd
Correct codebasse cache handling at boundaries
pombredanne Feb 6, 2018
e7a06ee
Add convenience methods to Resource
pombredanne Feb 6, 2018
5c9f963
Do not trace copyrights with nltk by default
pombredanne Feb 6, 2018
439ad76
Correct brokrn archives error reporting
pombredanne Feb 6, 2018
141f445
Improve user feedback for missing plugins #787
pombredanne Feb 6, 2018
cf31564
Update test expectation for Windows
pombredanne Feb 6, 2018
5784b20
Update test expectation for macOS
pombredanne Feb 6, 2018
Refactor post scan plugins so they can have command line options
Signed-off-by: Haiko Schol <ext-haiko.schol@here.com>
Haiko Schol authored and pombredanne committed Jan 17, 2018
commit 5a304fd1a49e85e96ff1355f1e1b43d2adae9531
4 changes: 2 additions & 2 deletions setup.py
@@ -222,8 +222,8 @@ def read(*names, **kwargs):
         # becomes the ScanCode CLI boolean flag used to enable a
         # given post_scan plugin
         'scancode_post_scan': [
-            'only-findings = scancode.plugin_only_findings:process_only_findings',
-            'mark-source = scancode.plugin_mark_source:process_mark_source',
+            'only-findings = scancode.plugin_only_findings:OnlyFindings',
+            'mark-source = scancode.plugin_mark_source:MarkSource',
         ],

         # scancode_pre_scan is an entry point to define pre_scan plugins.
31 changes: 25 additions & 6 deletions src/plugincode/post_scan.py
@@ -38,14 +38,30 @@


 @post_scan_spec
-def post_scan_handler(active_scans, results):
+class PostScanPlugin(object):
     """
-    Process the scanned files and yield the modified results.
-    Parameters:
-     - `active_scans`: a list of scanners names requested in the current run.
-     - `results`: an iterable of scan results for each file or directory.
+    A post-scan plugin layout class to be extended by the post_scan plugins.
     """
-    pass
+
+    def __init__(self, option, user_input):
+        self.option = option
+        self.user_input = user_input
+
+    def process_results(self, results, active_scans):
+        """
+        Process the scan results.
+        results - an iterable of resources
+        active_scans - iterable of scanners that were used to obtain the results (e.g. "copyrights", "licenses")
+        """
+        return results
+
+    @staticmethod
+    def get_click_options():
+        """
+        Return an iterable of `click.Option` objects to be
+        used for calling the plugin.
+        """
+        return ()


 post_scan_plugins = PluginManager('post_scan')
@@ -57,6 +73,9 @@ def initialize():
     NOTE: this defines the entry points for use in setup.py
     """
     post_scan_plugins.load_setuptools_entrypoints('scancode_post_scan')
+    for name, plugin in get_post_scan_plugins().items():
+        if not issubclass(plugin, PostScanPlugin):
+            raise Exception('Invalid post-scan plugin "%(name)s": does not extend "plugincode.post_scan.PostScanPlugin".' % locals())


 def get_post_scan_plugins():
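To illustrate the interface this commit introduces, here is a minimal sketch of a third-party post-scan plugin written against it. It rests on several assumptions: `PostScanPlugin` is re-declared locally as a stand-in so the example runs without ScanCode installed, `Option` is a hypothetical namedtuple standing in for `click.Option`, and the `OnlyFiles` plugin and its `only_files` option are invented names that do not exist in ScanCode.

```python
from collections import namedtuple

# Hypothetical stand-in for click.Option, holding only what this sketch needs.
Option = namedtuple('Option', 'name is_flag')


class PostScanPlugin(object):
    """Local stand-in mirroring the base class added in this commit."""

    def __init__(self, option, user_input):
        self.option = option
        self.user_input = user_input

    def process_results(self, results, active_scans):
        # Default behavior: pass the results through unchanged.
        return results

    @staticmethod
    def get_click_options():
        return ()


class OnlyFiles(PostScanPlugin):
    """Hypothetical plugin (not part of ScanCode): drop directories."""

    def process_results(self, results, active_scans):
        for scanned_file in results:
            if scanned_file.get('type') != 'directory':
                yield scanned_file

    @staticmethod
    def get_click_options():
        return [Option(name='only_files', is_flag=True)]


results = [
    {'path': 'src', 'type': 'directory'},
    {'path': 'src/a.py', 'type': 'file'},
]
plugin = OnlyFiles('only_files', True)
kept = list(plugin.process_results(results, ['info']))
```

Because the plugin is a subclass, it would pass the `issubclass` check that `initialize()` now performs, and its options travel with the class rather than being inferred from the entry point name.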
17 changes: 9 additions & 8 deletions src/scancode/cli.py
@@ -499,14 +499,15 @@ def scancode(ctx,

     has_requested_post_scan_plugins = False

-    for option, post_scan_handler in plugincode.post_scan.get_post_scan_plugins().items():
-        is_requested = kwargs[option.replace('-', '_')]
-        if is_requested:
-            options['--' + option] = True
-            if not quiet:
-                echo_stderr('Running post-scan plugin: %(option)s...' % locals(), fg='green')
-            results = post_scan_handler(active_scans, results)
-            has_requested_post_scan_plugins = True
+    for name, plugin in plugincode.post_scan.get_post_scan_plugins().items():
+        for option in plugin.get_click_options():
+            user_input = kwargs[option.name]
+            if user_input:
+                options['--' + name] = user_input
+                if not quiet:
+                    echo_stderr('Running post-scan plugin: %(option)s...' % locals(), fg='green')
+                results = plugin(option.name, user_input).process_results(results, active_scans)
+                has_requested_post_scan_plugins = True

     if has_requested_post_scan_plugins:
         # FIXME: computing len needs a list and therefore needs loading it all ahead of time
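The dispatch loop in this hunk can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code in `cli.py`: `run_post_scan_plugins` is a hypothetical helper, `Option` is a namedtuple standing in for `click.Option`, and `UpperPaths` is an invented plugin used only to make the flow observable.

```python
from collections import namedtuple

# Hypothetical stand-in for click.Option; only the `name` attribute is used.
Option = namedtuple('Option', 'name')


class UpperPaths(object):
    """Invented plugin exposing the interface from this commit."""

    def __init__(self, option, user_input):
        self.option = option
        self.user_input = user_input

    def process_results(self, results, active_scans):
        for scanned_file in results:
            yield dict(scanned_file, path=scanned_file['path'].upper())

    @staticmethod
    def get_click_options():
        return [Option(name='upper_paths')]


def run_post_scan_plugins(plugins, kwargs, results, active_scans):
    """
    Hypothetical helper: for each plugin, look up the user input for each
    of its options and chain process_results() for the options that were set.
    """
    options = {}
    for name, plugin in plugins.items():
        for option in plugin.get_click_options():
            user_input = kwargs.get(option.name)
            if user_input:
                options['--' + name] = user_input
                results = plugin(option.name, user_input).process_results(
                    results, active_scans)
    return results, options


results, options = run_post_scan_plugins(
    plugins={'upper-paths': UpperPaths},
    kwargs={'upper_paths': True},
    results=[{'path': 'a.txt'}],
    active_scans=['info'],
)
results = list(results)
```

Each requested option wraps the previous `results` iterable, so plugins compose lazily as long as every `process_results()` is itself a generator.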
52 changes: 30 additions & 22 deletions src/scancode/plugin_mark_source.py
@@ -28,43 +28,51 @@

 from os import path

+from click import Option
+
+from plugincode.post_scan import PostScanPlugin
 from plugincode.post_scan import post_scan_impl


 @post_scan_impl
-def process_mark_source(active_scans, results):
+class MarkSource(PostScanPlugin):
     """
     Set the "is_source" flag to true for directories that contain
     over 90% of source files as direct children.
     Has no effect unless the --info scan is requested.
     """

-    # FIXME: this is forcing all the scan results to be loaded in memory
-    # and defeats lazy loading from cache
-    results = list(results)
+    def process_results(self, results, _):
+        # FIXME: this is forcing all the scan results to be loaded in memory
+        # and defeats lazy loading from cache
+        results = list(results)
+
+        # FIXME: we should test for active scans instead, but "info" may not
+        # be present for now. check if the first item has a file info.
+        has_file_info = 'type' in results[0]

-    # FIXME: we should test for active scans instead, but "info" may not
-    # be present for now. check if the first item has a file info.
-    has_file_info = 'type' in results[0]
+        if not has_file_info:
+            # just yield results untouched
+            for scanned_file in results:
+                yield scanned_file
+            return

-    if not has_file_info:
-        # just yield results untouched
+        # FIXME: this is an nested loop, looping twice on results
+        # TODO: this may not recusrively roll up the is_source flag, as we
+        # may not iterate bottom up.
         for scanned_file in results:
+            if scanned_file['type'] == 'directory' and scanned_file['files_count'] > 0:
+                source_files_count = 0
+                for scanned_file2 in results:
+                    if path.dirname(scanned_file2['path']) == scanned_file['path']:
+                        if scanned_file2['is_source']:
+                            source_files_count += 1
+                mark_source(source_files_count, scanned_file)
             yield scanned_file
-        return

-    # FIXME: this is an nested loop, looping twice on results
-    # TODO: this may not recusrively roll up the is_source flag, as we
-    # may not iterate bottom up.
-    for scanned_file in results:
-        if scanned_file['type'] == 'directory' and scanned_file['files_count'] > 0:
-            source_files_count = 0
-            for scanned_file2 in results:
-                if path.dirname(scanned_file2['path']) == scanned_file['path']:
-                    if scanned_file2['is_source']:
-                        source_files_count += 1
-            mark_source(source_files_count, scanned_file)
-        yield scanned_file
+    @staticmethod
+    def get_click_options():
+        return [Option(('--mark-source',), is_flag=True)]


 def mark_source(source_files_count, scanned_file):
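The FIXME about the nested loop could be addressed by counting source children per parent directory once, up front. The sketch below is one possible approach and not the plugin's implementation: `mark_source_dirs` is a hypothetical function, the strict `> 0.9` threshold is an assumption read from the "over 90%" wording in the docstring (the body of `mark_source()` is not shown in this diff), and `files_count` is assumed to be the number of direct children. Like the original, it does not roll the flag up recursively.

```python
from collections import Counter
from os import path


def mark_source_dirs(results, threshold=0.9):
    """
    Hypothetical single-pass variant: yield each result, setting "is_source"
    on directories whose direct children are mostly source files.
    """
    # Still loads everything in memory, as the original plugin does.
    results = list(results)

    # Count source children per parent directory in one pass.
    source_children = Counter(
        path.dirname(scanned_file['path'])
        for scanned_file in results
        if scanned_file.get('is_source')
    )

    for scanned_file in results:
        if scanned_file['type'] == 'directory' and scanned_file['files_count'] > 0:
            count = source_children[scanned_file['path']]
            ratio = count / float(scanned_file['files_count'])
            if ratio > threshold:  # assumed meaning of "over 90%"
                scanned_file['is_source'] = True
        yield scanned_file


results = [
    {'path': 'src', 'type': 'directory', 'files_count': 2, 'is_source': False},
    {'path': 'src/a.py', 'type': 'file', 'files_count': 0, 'is_source': True},
    {'path': 'src/b.py', 'type': 'file', 'files_count': 0, 'is_source': True},
    {'path': 'docs', 'type': 'directory', 'files_count': 2, 'is_source': False},
    {'path': 'docs/a.txt', 'type': 'file', 'files_count': 0, 'is_source': False},
    {'path': 'docs/c.py', 'type': 'file', 'files_count': 0, 'is_source': True},
]
marked = {r['path']: r['is_source'] for r in mark_source_dirs(results)}
```

This replaces the quadratic scan with two linear passes; the counts are taken before any directory is marked, so a directory flagged in this run does not count toward its parent.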
26 changes: 17 additions & 9 deletions src/scancode/plugin_only_findings.py
@@ -25,25 +25,33 @@
 from __future__ import absolute_import
 from __future__ import unicode_literals

+from click import Option
+
+from plugincode.post_scan import PostScanPlugin
 from plugincode.post_scan import post_scan_impl


 @post_scan_impl
-def process_only_findings(active_scans, results):
+class OnlyFindings(PostScanPlugin):
     """
     Only return files or directories with findings for the requested
     scans. Files and directories without findings are omitted (not
     considering basic file information as findings).
     """

-    # FIXME: this is forcing all the scan results to be loaded in memory
-    # and defeats lazy loading from cache. Only a different caching
-    # (e.g. DB) could work here.
-    # FIXME: We should instead use a generator or use a filter function
-    # that pass to the scan results loader iterator
-    for scanned_file in results:
-        if has_findings(active_scans, scanned_file):
-            yield scanned_file
+    def process_results(self, results, active_scans):
+        # FIXME: this is forcing all the scan results to be loaded in memory
+        # and defeats lazy loading from cache. Only a different caching
+        # (e.g. DB) could work here.
+        # FIXME: We should instead use a generator or use a filter function
+        # that pass to the scan results loader iterator
+        for scanned_file in results:
+            if has_findings(active_scans, scanned_file):
+                yield scanned_file
+
+    @staticmethod
+    def get_click_options():
+        return [Option(('--only-findings',), is_flag=True)]


 def has_findings(active_scans, scanned_file):
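The body of `has_findings()` is cut off in this diff, so the following is only a guess at the kind of predicate the docstring describes. `has_any_findings` and `only_findings` are hypothetical names, and two things are assumed: that each scanned file is a dict keyed by scan name, and that the `info` scan is the "basic file information" to ignore.

```python
def has_any_findings(active_scans, scanned_file, ignored=('info',)):
    """
    Hypothetical predicate: True if any active scan other than the
    ignored ones produced a non-empty value for this file.
    """
    return any(
        scanned_file.get(scan)
        for scan in active_scans
        if scan not in ignored
    )


def only_findings(results, active_scans):
    """Lazily yield only the results that have findings."""
    for scanned_file in results:
        if has_any_findings(active_scans, scanned_file):
            yield scanned_file


results = [
    {'path': 'a.c', 'licenses': [{'key': 'mit'}], 'copyrights': []},
    {'path': 'b.c', 'licenses': [], 'copyrights': []},
]
kept = [
    scanned_file['path']
    for scanned_file in only_findings(results, ['licenses', 'copyrights', 'info'])
]
```

Written as a generator, the filter consumes `results` one item at a time and does not by itself force the whole list into memory.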