-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate license values #296
base: master
Are you sure you want to change the base?
Conversation
b520163
to
eac6492
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great functionality! I was/am irritated by license values inconsistently named when I'm working on things like ros-infrastructure/rospkg#201. I like the direction of the feature is going. I'm not a maintainer and not so sure about the direction of the implementation though.
The list was created from https://spdx.org/licenses/ with: | ||
cat doc/spdx-3.10-2020-08-03.csv | cut -f 2 | grep -v ^Identifier$ | ||
""" | ||
return lic in ['0BSD', 'AAL', 'Abstyles', 'Adobe-2006', 'Adobe-Glyph', 'ADSL', 'AFL-1.1', 'AFL-1.2', 'AFL-2.0', 'AFL-2.1', 'AFL-3.0', 'Afmparse', 'AGPL-1.0-only', 'AGPL-1.0-or-later', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have an answer, but I'm afraid embedding a (huge) list of license names can be a maintenance burden (I saw you described how you made it, but still would require manual invocation in the future when updating the list is needed. Also as a disclaimer, I'm not a maintainer on this repo). Wish there's a standalone tool that provides such a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, my hope is that this list is only temporary and component owners will start using valid identifiers in package.xml and eventually only the validation function will stay and all the replacements won't be needed. So the only maintenance should be to drop it in some far future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the list isn't complete (and cannot be), because on of the most common issues is using license like LGPL without a version, but here without checking the actual component source we cannot replace it with anything more accurate, all we can do is to show and warning and hope that the owner will improve the package.xml.
I still plan to extend the list to cover most common cases of listing multiple licenses in one tag instead of using multiple of them - just using & as separator - together with slightly different warning message suggesting to split them into multiple tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, my hope is that this list is only temporary and component owners will start using valid identifiers in package.xml and eventually only the validation function will stay and all the replacements won't be needed. So the only maintenance should be to drop it in some far future.
that would be great, but I would say this is not something we can expect (ie, the: "component owners will start using valid identifiers in package.xml and eventually only the validation function will stay and all the replacements won't be needed" part).
As of today, there are 5946 packages in the ros/rosdistro
db alone, with many more not registered anywhere (ie: hosted somewhere publicly but not part of a distribution). Many of those packages are old and will not be updated, but even the more recent ones will have a significant nr which will not be updated.
If you add this list, I expect it to be necessary for at least until Noetic EOLs, and probably also for a few ROS 2 versions (if you intend to add this functionality to a ROS 2 tool as well). We're talking 5+ years here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the far future for removal, but in the meantime we shouldn't need to update the list.
The replacements are just to have some immediate improvements where possible, while showing the corresponding SPDX in warning message. And not to regress from what superflore currently does with regexp replacements.
It's not meant to be perfect and it cannot be without parsing the actual source files with something like scancode-toolkit, because in some cases the valid identifier in package.xml doesn't match with the actual license and catkin_pkg nor superflore could detect that (even to show another warning).
This seems like it might be better suited for inclusion in catkin_lint. Any thoughts on that? |
I've added it to catkin_pkg only because rosdistro CI github action uses it to validate version, so I've used the same mechanism for license validation, but if more people will notice the warnings from catkin_lint then definitely it should be there (as well). |
@shr-project Atm the PR is in draft state and has no description. Therefore I am not going to review the patch. Please add a very detailed description about what the patch is doing, why it is doing so, what the side effects are, why you think those are justified, etc. On a first look this adds new warnings to a lot of existing packages. This on its own for already released distros is an almost guaranteed no-go. |
* catch ambiguous and invalid values while validating package.xml instead of trying to fix the values in superflore ros-infrastructure/superflore#271 Signed-off-by: Martin Jansa <martin.jansa@lge.com>
…AGE-LICENSE Signed-off-by: Martin Jansa <martin.jansa@lge.com>
…statistics * see the license value statistics in: ros-infrastructure/superflore#271 (comment) * with this applied, there are following statistics across all currently used ROS distributions in rosdistro: * License values which were unambiguously mapped to one of SPDX identifiers: 1064 WARNING: The license value "Apache License 2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 741 WARNING: The license value "Apache 2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 77 WARNING: The license value "LGPLv3" is not valid SPDX identifier, please use "LGPL-3.0-only" instead 75 WARNING: The license value "GPLv3" is not valid SPDX identifier, please use "GPL-3.0-only" instead 73 WARNING: The license value "BSD 3-Clause" is not valid SPDX identifier, please use "BSD-3-Clause" instead 34 WARNING: The license value "GPLv2" is not valid SPDX identifier, please use "GPL-2.0-only" instead 34 WARNING: The license value "BSD-3" is not valid SPDX identifier, please use "BSD-3-Clause" instead 26 WARNING: The license value "Apache 2" is not valid SPDX identifier, please use "Apache-2.0" instead 23 WARNING: The license value "Apache License, Version 2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 21 WARNING: The license value "Apache2" is not valid SPDX identifier, please use "Apache-2.0" instead 14 WARNING: The license value "zlib" is not valid SPDX identifier, please use "Zlib" instead 10 WARNING: The license value "APACHE2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 8 WARNING: The license value "GNU Lesser Public License 2.1" is not valid SPDX identifier, please use "LGPL-2.1-only" instead 6 WARNING: The license value "LGPLv2.1" is not valid SPDX identifier, please use "LGPL-2.1-only" instead 6 WARNING: The license value "CC BY-NC-SA 4.0" is not valid SPDX identifier, please use "CC-BY-NC-SA-4.0" instead 6 WARNING: The license value "BSD2" is not valid SPDX identifier, please use "BSD-2-Clause" instead 5 WARNING: The license value "LGPL-2.1" is not valid SPDX identifier, please use "LGPL-2.1-only" instead 5 WARNING: The license value "Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License" is not valid SPDX identifier, please use "CC-BY-NC-ND-4.0" instead 4 WARNING: The license value "zlib License" is not valid SPDX identifier, please use "Zlib" instead 4 WARNING: The license value "LGPL v2.1" is not valid SPDX identifier, please use "LGPL-2.1-only" instead 4 WARNING: The license value "GNU General Public License v2.0" is not valid SPDX identifier, please use "GPL-2.0-only" instead 4 WARNING: The license value "Eclipse Public License 2.0" is not valid SPDX identifier, please use "EPL-2.0" instead 4 WARNING: The license value "Creative Commons BY-NC-ND 3.0" is not valid SPDX identifier, please use "CC-BY-NC-ND-3.0" instead 4 WARNING: The license value "Boost Software License" is not valid SPDX identifier, please use "BSL-1.0" instead 3 WARNING: The license value "Mozilla Public License Version 1.1" is not valid SPDX identifier, please use "MPL-1.1" instead 3 WARNING: The license value "CreativeCommons-by-nc-sa-2.0" is not valid SPDX identifier, please use "CC-BY-NC-SA-2.0" instead 3 WARNING: The license value "Boost Software License, Version 1.0" is not valid SPDX identifier, please use "BSL-1.0" instead 2 WARNING: The license value "LGPL3" is not valid SPDX identifier, please use "LGPL-3.0-only" instead 2 WARNING: The license value "ECL2.0" is not valid SPDX identifier, please use "EPL-2.0" instead 2 WARNING: The license value "CreativeCommons-by-nc-4.0" is not valid SPDX identifier, please use "CC-BY-NC-4.0" instead 2 WARNING: The license value "Boost" is not valid SPDX identifier, please use "BSL-1.0" instead 2 WARNING: The license value "Boost Software License 1.0" is not valid SPDX identifier, please use "BSL-1.0" instead 2 WARNING: The license value "BSL1.0" is not valid SPDX identifier, please use "BSL-1.0" instead 2 WARNING: The license value "BSD 2-Clause License" is not valid SPDX identifier, please use "BSD-2-Clause" instead 2 WARNING: The license value "Apache2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 2 WARNING: The license value "Apache v2.0" is not valid SPDX identifier, please use "Apache-2.0" instead 2 WARNING: The license value "Apache v2" is not valid SPDX identifier, please use "Apache-2.0" instead 2 WARNING: The license value "Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)" is not valid SPDX identifier, please use "Apache-2.0" instead 1 WARNING: The license value "MIT License" is not valid SPDX identifier, please use "MIT" instead 1 WARNING: The license value "LGPL v2.1 or later" is not valid SPDX identifier, please use "LGPL-2.1-or-later" instead 1 WARNING: The license value "LGPL v2" is not valid SPDX identifier, please use "LGPL-2.0-only" instead 1 WARNING: The license value "GPL-2.0" is not valid SPDX identifier, please use "GPL-2.0-only" instead 1 WARNING: The license value "GPL v3" is not valid SPDX identifier, please use "GPL-3.0-only" instead 1 WARNING: The license value "GNU GPL v3.0" is not valid SPDX identifier, please use "GPL-3.0-only" instead 1 WARNING: The license value "CreativeCommons-Attribution-NonCommercial-ShareAlike-4.0-International" is not valid SPDX identifier, please use "CC-BY-NC-SA-4.0" instead 1 WARNING: The license value "CreativeCommons-Attribution-NonCommercial-NoDerivatives-4.0" is not valid SPDX identifier, please use "CC-BY-NC-ND-4.0" instead 1 WARNING: The license value "BSD 3-clause. See license attached" is not valid SPDX identifier, please use "BSD-2-Clause" instead 1 WARNING: The license value "BSD 3-clause Clear License" is not valid SPDX identifier, please use "BSD-2-Clause" instead 1 WARNING: The license value "Apachi 2" is not valid SPDX identifier, please use "Apache-2.0" instead 1 WARNING: The license value "Apache License Version 2.0" is not valid SPDX identifier, please use "Apache-2.0" instead * License texts which were replaced with more common version Biggest issue is clearly the "TODO" string from catkin package template which people forget to update 31 WARNING: The license value "TODO" is not valid SPDX identifier, and it is usually used as "TODO-CATKIN-PACKAGE-LICENSE" 6 WARNING: The license value "proprietary" is not valid SPDX identifier, and it is usually used as "Proprietary" 6 WARNING: The license value "Public domain" is not valid SPDX identifier, and it is usually used as "PD" 5 WARNING: The license value "Public Domain" is not valid SPDX identifier, and it is usually used as "PD" * License texts which weren't mapped to SPDX, usually because the license version wasn't specified or when some more creative form of license description was used Biggest issue is clearly the "BSD" without Clause specification followed by recipes using multiple licenses while not using clear separator between them (e.g. OpenEmbedded supports '&' '|' '(' ')': http://git.openembedded.org/openembedded-core/tree/meta/lib/oe/license.py?id=8e2d0575e4e7036b5f60e632f377a8ab2b96ead8#n42 ) 4711 WARNING: The license value "BSD" cannot be mapped to valid SPDX identifier 81 WARNING: The license value "LGPL" cannot be mapped to valid SPDX identifier 63 WARNING: The license value "GPL" cannot be mapped to valid SPDX identifier 31 WARNING: The license value "TODO" cannot be mapped to valid SPDX identifier 20 WARNING: The license value "United States Government Purpose" cannot be mapped to valid SPDX identifier 20 WARNING: The license value "SwRI Proprietary" cannot be mapped to valid SPDX identifier 18 WARNING: The license value "Apache" cannot be mapped to valid SPDX identifier 16 WARNING: The license value "ASL 2.0" cannot be mapped to valid SPDX identifier 14 WARNING: The license value "EPL" cannot be mapped to valid SPDX identifier 10 WARNING: The license value "GNU Lesser General Public License (LGPL)" cannot be mapped to valid SPDX identifier 8 WARNING: The license value "Proprietary" cannot be mapped to valid SPDX identifier 7 WARNING: The license value "BSD,LGPL,Apache 2.0" cannot be mapped to valid SPDX identifier 7 WARNING: The license value "BSD, LGPL" cannot be mapped to valid SPDX identifier 7 WARNING: The license value "BSD, Apache 2.0" cannot be mapped to valid SPDX identifier 6 WARNING: The license value "proprietary" cannot be mapped to valid SPDX identifier 6 WARNING: The license value "Public domain" cannot be mapped to valid SPDX identifier 6 WARNING: The license value "Creative Commons" cannot be mapped to valid SPDX identifier 6 WARNING: The license value "BSD, GPL" cannot be mapped to valid SPDX identifier 5 WARNING: The license value "Public Domain" cannot be mapped to valid SPDX identifier 4 WARNING: The license value "TBD" cannot be mapped to valid SPDX identifier 4 WARNING: The license value "CC-BY-SA" cannot be mapped to valid SPDX identifier 4 WARNING: The license value "BSD License 2.0" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "N/A" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "HOYA License" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "HEBI C++ Software License (https://www.hebirobotics.com/softwarelicense)" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "GPLv2 with linking exception" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "BSD,LGPL,LGPL (amcl)" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "BSD, some icons are licensed under the GNU Lesser General Public License (LGPL) or Creative Commons Attribution-Noncommercial 3.0 License" cannot be mapped to valid SPDX identifier 3 WARNING: The license value "ALv2" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Yujin Robot" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "TERMS OF USE FOR GUNDAM RESEARCH OPEN SIMULATOR Attribution-NonCommercial-ShareAlike" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Southwest Research Institute Proprietary" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "KHI CAD license (mesh data, see readme)" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "GPL for sigblock" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "GPL because of list.h; other files released under BSD" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Eclipse Distribution License 1.0" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Commercial" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Check author's website" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "Binary Only" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "BSD,GPL because of list.h; other files released under BSD,GPL" cannot be mapped to valid SPDX identifier 2 WARNING: The license value "APLv2" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "specified in-file" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "see license.txt" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "see License.txt" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "free for research or education purpose, all rights maintained by David Applegate, William Cook, Sanjeeb Dash, and Monika Mevenkamp" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "free for academic research, for further licensing contact Wiliam Cook" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "WTF" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "Version 2.0" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "T.D.B" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "Slightech License" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "See license.txt" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "Lesser GPL and Apache License" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "LGPLv2.1, modified BSD" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "LGPL and Apache2" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "LGPL / BSD" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "GPL v2 with linking exception" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "GPL + runtime exception" cannot be mapped to valid SPDX identifier 1 WARNING: The license value "FZI all rights reserved" cannot be mapped to valid SPDX identifier Signed-off-by: Martin Jansa <martin.jansa@lge.com>
…s license * "Check-author\'s-website" license value used by uwsim_bullet breaks the parsing of superflore generated recipe, at least use it without apostrophe Signed-off-by: Martin Jansa <martin.jansa@lge.com>
eac6492
to
de3d007
Compare
It was in draft state, because I wanted to add separate warning for license values which include multiple licenses (instead of using multiple separate license tags). I've added another commit for that now and added description. Please let me know if it makes more sense to you now.
Are developers used to use https://github.com/fkie/catkin_lint ? Would showing warning about strange license texts in components from released distros be OK if done in catkin_lint? |
It's certainly not a hill I'm willing to die on, but in my opinion, the warning and error messages coming out of To be clear: I think this is an excellent thing to do, I'm just not convinced |
I'm working on very similar PR for catkin_lint as well now :). I only work with ROS when working on meta-ros, I've never used ROS as a regular developer, so I don't know what's the usual work flow for people who write package.xml. If it's common that they run catkin_lint (or that Github Actions runs it for them), then I agreed that's definitely the place where it should be validated. |
* when the license value lists multiple licenses (one of currenly used combinations) * show a link to REP-0149 format definition which explicitly says: "For multiple licenses multiple separate tags must be used." but unfortunately it doesn't define how to express e.g. dual-license or other more complicated scheme, other than pointing to <description> tag. "For any explanatory text about licensing caveats, please use the <description> tag." * e.g. OpenEmbedded supports '&' '|' '(' ')' to express more compilcated scheme: http://git.openembedded.org/openembedded-core/tree/meta/lib/oe/license.py?id=8e2d0575e4e7036b5f60e632f377a8ab2b96ead8#n42 ) but that would require change to Package Manifest Format Signed-off-by: Martin Jansa <martin.jansa@lge.com>
de3d007
to
21a07b3
Compare
@cottsay wrote:
@shr-project wrote:
can't speak for everyone, but I have quite a few |
This PR was motivated by this issue in superflore:
ros-infrastructure/superflore#271
superflore tries to unify at least some values to more common license identifier, but because https://www.ros.org/reps/rep-0149.html#license-multiple-but-at-least-one isn't very strict about the values and people got quite creative here and the regular expressions in superflore are in some cases "inventing" extra information like mapping "LGPL" to "LGPL-2" and in worse cases plain wrong mapping "LGPLv3" to "LGPL-2".
Superflore or any other tool, cannot guess what LGPL version developer had in mind when it isn't specified in the package.xml. Because it would need to at least investigate the actual sources for license headers there.
Instead of trying to improve the license value later, we need to warn the developer when the provided value cannot be unambiguously parsed and ask him to improve it in the package.xml in the source repo. To help with that we can show the correct value (of corresponding SPDX identifier) we've assigned from the provided license value.
And to make sure that these assigned SPDX identifiers don't do any guessing or match to too wide range of values (like in superflore) lets map them by exact string from the values currently being used in any component currently listed in rosdistro index. That way these assignments are easy to maintain and not error-prone.
Then we can take the same set of exact assignments from here and add it "temporarily" in superflore, that way the licenses which are currently incorrectly guessed will be fixed while not regressing to completely free form of license values. This is implemented in ros-infrastructure/superflore#279
rosdistro CI validation already checks for license tag existence, we can easily extend it to call this validation as well as implemented in: ros/rosdistro#26601 I've used this validation to easily get a list of all license values from all components currently listed in rosdistro index.
Some commonly used values (and some more creative):
...