Provide built-in support for SPDX and scancode license expression validation #56

Open
pombredanne opened this issue Jun 1, 2021 · 5 comments

Comments

@pombredanne
Member

I would like to have a function that takes an expression string as an argument and validates this expression. It could be built on Licensing.parse(), but I would prefer that it return an object that tells me everything about the expression's validity:

  • whether the syntax is valid, with error messages if it is not
  • which license symbols are valid and which are invalid
  • which exceptions are valid and which are invalid
  • which license symbols are obsolete

This function should take as input for license symbols either the ScanCode license DB ( https://scancode-licensedb.aboutcode.org ) or a list of symbols. It should bundle an up-to-date license list from ScanCode and SPDX for easy bootstrapping. For this we need aboutcode-org/scancode-licensedb#7
In addition, it should accept arbitrary LicenseRef- (and possibly DocumentRef- ) symbols in SPDX mode.
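
To make the request concrete, here is a minimal standard-library sketch of what such a validation result could look like. It is not the license_expression implementation: `ValidationInfo`, `check_syntax`, `validate` and the toy grammar (symbols joined by AND/OR, an optional WITH, and parentheses) are all hypothetical names and simplifications chosen for illustration, and exceptions are not tracked separately from licenses.

```python
import re
from dataclasses import dataclass, field

KEYWORDS = {"and", "or", "with"}
PARENS = {"(", ")"}


@dataclass
class ValidationInfo:
    """Hypothetical result object; the field names are illustrative only."""
    original_expression: str
    errors: list = field(default_factory=list)
    valid_symbols: list = field(default_factory=list)
    invalid_symbols: list = field(default_factory=list)
    obsolete_symbols: list = field(default_factory=list)


def check_syntax(tokens):
    """Return an error message, or None if the tokens fit the toy grammar."""
    pos = 0

    def is_symbol(i):
        return (i < len(tokens) and tokens[i] not in PARENS
                and tokens[i].lower() not in KEYWORDS)

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
        elif is_symbol(pos):
            pos += 1
            if pos < len(tokens) and tokens[pos].lower() == "with":
                pos += 1
                if not is_symbol(pos):
                    raise ValueError("WITH must be followed by a symbol")
                pos += 1
        else:
            raise ValueError("expected a license symbol or '('")

    def expr():
        nonlocal pos
        term()
        while pos < len(tokens) and tokens[pos].lower() in ("and", "or"):
            pos += 1
            term()

    try:
        expr()
        if pos != len(tokens):
            raise ValueError(f"unexpected token: {tokens[pos]}")
    except ValueError as exc:
        return str(exc)
    return None


def validate(expression, known_keys, obsolete_keys=()):
    """Collect syntax errors and classify every symbol; never raises."""
    info = ValidationInfo(original_expression=expression)
    tokens = re.findall(r"[()]|[^\s()]+", expression)
    error = check_syntax(tokens)
    if error:
        info.errors.append(error)
    for token in tokens:
        key = token.lower()
        if token in PARENS or key in KEYWORDS:
            continue
        # Keys are assumed to be stored in lowercase in both collections.
        if key in obsolete_keys:
            info.obsolete_symbols.append(token)
        elif key in known_keys:
            info.valid_symbols.append(token)
        else:
            info.invalid_symbols.append(token)
    return info
```

The point of the design is that the caller gets one object back instead of an exception: `validate('foo AND mit', {'mit'})` reports no syntax error, lists `mit` as valid and `foo` as invalid.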

@pombredanne
Member Author

@thatch @JonoYang ping ^

@pombredanne
Member Author

An example:

$ wget https://scancode-licensedb.aboutcode.org/index.json
$ python
>>> import json
>>> lics = json.load(open('index.json'))
>>> lics[0]
{'license_key': '389-exception', 'json': '389-exception.json', 'yml': '389-exception.yml', 'html': '389-exception.html', 'text': '389-exception.LICENSE'}
>>> from license_expression import LicenseSymbol, Licensing
>>> syms = [LicenseSymbol(lic['license_key']) for lic in lics]
>>> ling = Licensing(symbols=syms)
>>> ling.parse('foo AND mit')
AND(LicenseSymbol('foo', is_exception=False), LicenseSymbol('mit', is_exception=False))
>>> ling.parse('foo AND mit', validate=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/licexp/tmp/lib/python3.6/site-packages/license_expression/__init__.py", line 453, in parse
    raise ExpressionError(msg)
license_expression.ExpressionError: Unknown license key(s): foo
>>> e = ling.parse('foo AND mit')
>>> e.symbols
{LicenseSymbol('foo', is_exception=False), LicenseSymbol('mit', is_exception=False)}

@JonoYang
Member

JonoYang commented Jun 1, 2021

@pombredanne When we parse a license expression using Licensing().parse(), should the .parse() method automatically determine whether the expression is an SPDX license expression or a scancode license expression, or should there be a flag that tells .parse() which kind of license expression to expect?

@pombredanne
Member Author

@JonoYang I think the new validation feature should be explicit about which license list is used as a base, and there should be no guessing about whether an expression is from scancode or from SPDX.
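
One way to read "explicit, no guessing" is that the caller must name the license list, and only the SPDX choice enables LicenseRef- and DocumentRef- symbols. The sketch below assumes that reading; `LicenseList`, `LICENSE_REF` and `is_known_symbol` are hypothetical names and not part of license_expression. The pattern follows the SPDX convention that a reference id is made of letters, digits, "." and "-".

```python
import re
from enum import Enum


class LicenseList(Enum):
    """Hypothetical selector for the license list used as a base."""
    SCANCODE = "scancode"
    SPDX = "spdx"


# SPDX user-defined reference: optional "DocumentRef-<id>:" prefix,
# followed by "LicenseRef-<id>".
LICENSE_REF = re.compile(
    r"^(DocumentRef-[A-Za-z0-9.\-]+:)?LicenseRef-[A-Za-z0-9.\-]+$"
)


def is_known_symbol(symbol, license_list, known_keys):
    """Check one symbol against an explicitly chosen license list.

    known_keys is assumed to hold lowercase keys for the chosen list.
    """
    if not isinstance(license_list, LicenseList):
        # No auto-detection: the caller has to say which list applies.
        raise TypeError("license_list must be a LicenseList member")
    if license_list is LicenseList.SPDX and LICENSE_REF.match(symbol):
        return True
    return symbol.lower() in known_keys
```

With this shape, `LicenseRef-my-license` is accepted only when `LicenseList.SPDX` is passed, and the same string is reported as unknown against the ScanCode list unless it is one of its keys.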

@thatch

thatch commented Jun 1, 2021

In addition to validation, could you also provide a normalized (whitespace, case, parens) version of the string passed in?
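
As a rough illustration of the kind of normalization meant here, the following standard-library sketch collapses whitespace, uppercases the AND/OR/WITH keywords and tightens spacing around parentheses. The `normalize` function is a hypothetical helper, not the library's API; it leaves symbol casing untouched (canonical casing would need the license list) and does not remove redundant parentheses, which would need a real parse.

```python
import re

KEYWORDS = {"and", "or", "with"}


def normalize(expression):
    """Return the expression with canonical spacing and keyword case."""
    # Parentheses are their own tokens; everything else splits on whitespace.
    tokens = re.findall(r"[()]|[^\s()]+", expression)
    tokens = [t.upper() if t.lower() in KEYWORDS else t for t in tokens]
    joined = " ".join(tokens)
    # Remove the spaces that the join put just inside parentheses.
    return joined.replace("( ", "(").replace(" )", ")")


print(normalize("mit   and ( gpl-2.0  with classpath-exception-2.0 )"))
# prints: mit AND (gpl-2.0 WITH classpath-exception-2.0)
```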

JonoYang added a commit that referenced this issue Jun 1, 2021
    * Index SPDX license keys instead of scancode license keys
    * Modify code to do lookups using SPDX license keys

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 2, 2021
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 3, 2021
    * Create functions that loads a Licensing object with SPDX licenses

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 3, 2021
    * Refactor validate() to call parse() rather than using the code from parse()

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 4, 2021
    * Return license validation results in ExpressionInfo object

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 4, 2021
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 4, 2021
    * Make helper functions for loading license keys

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 4, 2021
    * Add test for get_license_key_info

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 8, 2021
    * Set original license expression in ExpressionInfo
    * Set vendored licensedb info location as a global
    * Create function that loads license index json
    * Update tests

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 8, 2021
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 8, 2021
    * Add repr to ExpressionInfo class
    * Remove valid_symbols and valid_exception_symbols from ExpressionInfo
    * Update vendored licensedb index
    * Avoid indexing deprecated licenses
    * Update tests

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 9, 2021
    * Add new test that uses license exception as normal license key

Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jun 9, 2021
    * We keep track of invalid license symbols from syntax errors

Signed-off-by: Jono Yang <jyang@nexb.com>
pombredanne added a commit that referenced this issue Jun 10, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 10, 2021
Improve documentation strings and code format.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue May 10, 2022
Add black codestyle test for skeleton