Means to specify database license #414

Merged
merged 13 commits into from
Jun 29, 2022

Member

@merkys merkys commented Jun 2, 2022

Fixes #102. After discussions with @rartino in #102 I became convinced that, for now, a single license for all the data in a database should be sufficient. Should parts of a database fall under a different license, they should either be described in the database-wide license document or moved to a separate "sibling" database under a different license.

Contributor

@sauliusg sauliusg left a comment

The stanza says "all data", but shouldn't we also include "all metadata"?
I was slightly inclined to allow specifying a license per record, but I agree that this would be an unnecessary complication for now. We can always upgrade if there is a genuine need.

I would approve the PR if/when the question about metadata is resolved.

@merkys
Member Author

merkys commented Jun 2, 2022

> The stanza says "all data", but shouldn't we also include "all metadata"? I was slightly inclined to allow specifying a license per record, but I agree that this would be an unnecessary complication for now. We can always upgrade if there is a genuine need.
>
> I would approve the PR if/when the question about metadata is resolved.

@sauliusg I have added this in 3a678c3.

@merkys merkys requested a review from sauliusg June 2, 2022 11:49
sauliusg previously approved these changes Jun 2, 2022
@merkys
Member Author

merkys commented Jun 2, 2022

This is an important change (please note the MUST level of inclusion), so I would like to make sure that providers-consortium members review it and let us know if conforming to it would be a problem.

@merkys merkys added the topic/property-standardization, type/proposal, and PR/ready-for-review labels Jun 2, 2022
@merkys
Member Author

merkys commented Jun 3, 2022

I have updated the PR to reflect the workshop discussions: license_is_compatible_with_cc_by_4_0 has been replaced by compatible_licenses, a list of SPDX identifiers. The new property is OPTIONAL.

@blokhin
Member

blokhin commented Jun 3, 2022

In the case of the MPDS, we might have multiple licenses for different sections of our data: CC BY 4.0, commercial/proprietary, per-vendor custom license, etc. We currently use a per-entry custom field, _mpds_data_license, which I would strongly recommend as a standard data_license / entry_license field.

@merkys
Member Author

merkys commented Jun 6, 2022

> In the case of the MPDS, we might have multiple licenses for different sections of our data: CC BY 4.0, commercial/proprietary, per-vendor custom license, etc. We currently use a per-entry custom field, _mpds_data_license, which I would strongly recommend as a standard data_license / entry_license field.

This scenario closely fits the reasoning I presented in #102 in my discussions with @rartino. If MPDS uses a per-entry custom field, why not promote it to the standard? That could probably be introduced in a follow-up PR, though, so as not to block the current one.

Under this PR, the MPDS licensing situation could be handled in two ways:

  1. Separate databases (one per license);
  2. Specify all licenses and their governed domains in the top-level license file.

@blokhin Is either of these solutions suitable for MPDS? If so, maybe per-entry licensing could wait for the follow-up PR?

Contributor

@rartino rartino left a comment

Some input on the overall design.

optimade.rst: 2 outdated review threads (resolved)
@rartino
Contributor

rartino commented Jun 7, 2022

@blokhin Even with your licensing situation, I hope you have nothing against merging this PR, which allows linking to a database-wide license text where you can clarify the possibly complicated licensing situation in MPDS. At the same time, perhaps we can continue the discussion of separate licenses for individual entries in #102? I'll copy your comment there to continue that discussion.

@rartino rartino mentioned this pull request Jun 7, 2022
optimade.rst Outdated
Comment on lines 1021 to 1022
- **license**: A `JSON API links object <http://jsonapi.org/format/1.0/#document-links>`__ giving a pointer to a license covering all the data and metadata provided under this database.
If the license is included in the `SPDX License List <https://spdx.org/licenses/>`__, then the URL MUST point to SPDX-hosted license fulltext, i.e., it MUST start with :field-val:`https://spdx.org/licenses/`, but MUST NOT include the terminating :field-val:`.html`.
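For illustration, the URL rule in the lines above (from this outdated revision of the diff) could be checked on the client side roughly as follows. This is a minimal sketch: `is_spdx_license_url` and `SPDX_PREFIX` are hypothetical names, not part of OPTIMADE or any library, and the check only looks at the shape of the URL, not at whether the identifier actually exists in the SPDX License List.

```python
# Hypothetical helper illustrating the URL rule quoted above; not an official API.
SPDX_PREFIX = "https://spdx.org/licenses/"


def is_spdx_license_url(url: str) -> bool:
    """Return True if `url` has the shape required for SPDX-listed licenses."""
    if not url.startswith(SPDX_PREFIX):
        return False
    identifier = url[len(SPDX_PREFIX):]
    # The URL must name a license and must not carry the terminating ".html".
    return bool(identifier) and not identifier.endswith(".html")
```

Under these assumptions, `https://spdx.org/licenses/CC-BY-4.0` passes, while the same URL with `.html` appended, or a URL hosted elsewhere, does not. The rule only applies when the license is on the SPDX list; other licenses may be linked from anywhere.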
Contributor

There may be more databases besides MPDS that use a per-entry licensing system. So I think it would be good to specify that in that case, the link should point to a document describing how to find the licence for each entry. Otherwise someone may fill in a null value here, because there is no server-wide licence.

Member Author

I would suggest resolving this by explicitly stating that there is no default value/interpretation of null for these licensing properties. If no pointers are provided in license (or *_licenses), this means no rights are granted to the client (as is the case with source code). I do not want to entangle too much logic in this PR: there is a field for a provider to put a link to their license/policies, and it is up to the provider to describe their license/policies in it.

Contributor

Also, with the latest edits, the text for license says that the text it points to may express "(or licensing options if there are multiple)", which I think is the appropriate level of clarification about this.

Member Author

But maybe we need to explicitly state what missing/null value for license means too?

Contributor

@rartino rartino Jun 20, 2022

Sure, it does not hurt to clarify for both fields that null/missing means "no license". But I suppose that, in particular, the case of available_licenses present and license missing should still be interpretable.

@rartino
Contributor

rartino commented Jun 17, 2022

@merkys This seems a bit stalled? To move things forward, if you disagree with my suggestions but don't have time to debate this, perhaps we can simplify the PR back to just a single "free-form" link without any machine-readable version and add the machine-readable version in a separate PR?

merkys and others added 2 commits June 17, 2022 16:56
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
@merkys
Member Author

merkys commented Jun 17, 2022

> @merkys This seems a bit stalled? To move things forward, if you disagree with my suggestions but don't have time to debate this, perhaps we can simplify the PR back to just a single "free-form" link without any machine-readable version and add the machine-readable version in a separate PR?

Yes, I wanted to give some more time for opinions like @blokhin's. But sure, if we can agree on a common subset of this PR, I am fine with stepping back. I would like to have licensing sorted out before the next release, one way or another.

@blokhin
Member

blokhin commented Jun 17, 2022

I'm okay with the database-wide available_licenses / license fields. But what do you think about the optional per-entry license field?

@rartino
Contributor

rartino commented Jun 17, 2022

@blokhin

My opinion on the per-entry field is that it is best kept as a database-specific field, because it should always have a database-specific meaning defined in the global license. As soon as we standardize a license field, rather than _mpds_data_license, both servers and clients can try to give it meaning outside of its database-specific definition in the main license info link. And IMO they shouldn't.

Contributor

@rartino rartino left a comment

These are the adjustments I think are needed to address my other point, i.e., that the link should really be a free-form link with no requirement to be machine-parsable, since available_licenses takes that role.

(I'm happy to hear contradicting opinions on this, though. I really don't mean to "steamroll" this PR, just to make sure it will work in a way that can be handled semi-automatically by my client implementation...)

optimade.rst: 2 outdated review threads (resolved)
merkys and others added 2 commits June 20, 2022 11:01
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
Contributor

@rartino rartino left a comment

I think it would be great if @sauliusg had a chance to take a look at the present version before we merge it, since I think he has a good handle on the formalities around licensing. I also recall @giovannipizzi having some thoughts/input on this at the workshop.

@merkys merkys requested a review from blokhin June 20, 2022 08:16
@rartino
Contributor

rartino commented Jun 29, 2022

This has been sitting approved for a week waiting for further input. In the interest of cleaning up our PRs, I'm going to merge it.

@rartino rartino merged commit f1e656e into Materials-Consortia:develop Jun 29, 2022
@merkys merkys deleted the database-license branch June 29, 2022 06:12
@ml-evs ml-evs added this to the v1.2 milestone Dec 6, 2022
Inclusion of a license identifier in the list is a commitment of the database that the rights are in place to grant clients access to all the data and metadata according to the terms of either of these licenses (at the choice of the client).
If the licensing information provided via the field :field:`license` omits licensing options specified in :field:`available_licenses`, or if it otherwise contradicts them, a client MUST still be allowed to interpret the inclusion of a license in :field:`available_licenses` as a full commitment from the database that the data and metadata is available, without exceptions, under the respective licenses.
If the database cannot make that commitment, e.g., if only part of the data is available under a license, the corresponding license identifier MUST NOT appear in :field:`available_licenses` (but, rather, the field :field:`license` is to be used to clarify the licensing situation.)
An empty list indicates that none of the SPDX licenses apply for the entirety of the database and that the licensing situation is clarified in human readable form in the field :field:`license`.
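One way a client might act on the semantics in the lines above is sketched below. This is only an illustration under stated assumptions: `describe_licensing` is a hypothetical function, its return strings are made up for the example, and treating the absence of both fields as "no rights granted" follows the reading suggested in the review discussion rather than any wording confirmed here.

```python
from typing import Optional, Sequence


def describe_licensing(
    available_licenses: Optional[Sequence[str]],
    license_href: Optional[str],
) -> str:
    """Summarize what a client may assume from the two database-level fields."""
    if available_licenses:
        # Every listed SPDX identifier is a full commitment covering all data
        # and metadata, whatever the human-readable license link says.
        options = " or ".join(available_licenses)
        return f"whole database available under: {options} (client's choice)"
    if license_href:
        # Empty or missing list: no SPDX license covers the entire database;
        # the linked document clarifies the situation for human readers.
        return f"no database-wide SPDX license declared; see {license_href}"
    # Assumption from the discussion: nothing provided means nothing granted.
    return "no license information; assume no rights are granted"
```

For example, `describe_licensing(["CC-BY-4.0"], None)` reports a database-wide commitment even though no human-readable link is given, which matches the point that available_licenses must remain interpretable on its own.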
Member

@ml-evs ml-evs Dec 6, 2022

Without opening a new issue, I just wanted to clarify something here @merkys (and others). The field license is simply a Links object, so when you say "situation is clarified in human readable form in the field license", are you referring to the content served at the URL pointed to in license->href, or the meta section of the Links object?

Contributor

If I remember the discussion correctly, we were indeed referring to the content of the URL to which the licence link points.

Member Author

It is license->href indeed, worth clarifying.
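To make the license->href reading concrete, here is a small sketch of how a client could pull the URL out of the field. It relies on the JSON API 1.0 rule that a link is either a plain URL string or a link object with an `href` member (and optional `meta`); the function name `resolve_license_href` and the example URL are hypothetical.

```python
from typing import Optional, Union


def resolve_license_href(link: Union[str, dict, None]) -> Optional[str]:
    """Return the URL whose content holds the human-readable licensing text."""
    if link is None:
        return None
    if isinstance(link, str):
        # A JSON API link may be given directly as a URL string.
        return link
    # Otherwise it is a link object; the licensing text lives behind `href`,
    # not in the optional `meta` member.
    return link.get("href")
```

Both `resolve_license_href("https://example.org/license")` and `resolve_license_href({"href": "https://example.org/license", "meta": {}})` yield the same URL, and the client would then fetch that URL to read the licensing terms.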

Labels: PR/ready-for-review, topic/property-standardization, type/proposal
Development

Successfully merging this pull request may close these issues.

Data(base) licenses
6 participants