Access Rights metadata in OpenAIRE metadata export is being misapplied #5920

jggautier · 2019-06-06T01:22:15Z

As part of v4.14 (released in May 2019), Dataverse makes available through the UI, API and over OAI-PMH DataCite metadata that complies with OpenAIRE requirements (#4257). Repositories need to follow these requirements in order for their dataset metadata to be made discoverable in OpenAIRE EXPLORE.

The required metadata export called OpenAIRE (in the Dataverse UI) or oai_datacite (over API and OAI-PMH) includes one of four Access Rights terms, which come from the info:eu-repo-Access-Terms vocabulary:

Open access
Restricted access
Closed access
Embargoed access

Dataverse chooses these terms based on whether or not any dataset files are set to restricted and whether or not people are able to request access to those restricted files using Dataverse's request access feature:

openAccess: If no files are set to restricted, the metadata export uses "openAccess"
restrictedAccess: If any of the files in the dataset are set to restricted and the option to request access is enabled (people are allowed to request access using Dataverse's request access feature), the metadata export uses "restrictedAccess"
closedAccess: If any of the files in the dataset are set to restricted and the option to request access is disabled, the metadata export uses "closedAccess"
embargoedAccess: Is not used because, at the time, Dataverse had no way to tell whether a dataset had an embargo
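The rules above can be sketched as a small function. This is only an illustration of the logic described in this comment, not Dataverse's actual export code; the function and parameter names are made up for the example.

```python
def dataset_access_right(any_file_restricted: bool, request_access_enabled: bool) -> str:
    """Pick one Access Rights term for a dataset, following the rules listed above.

    Hypothetical helper: the name and parameters are assumptions for illustration.
    """
    if not any_file_restricted:
        # No restricted files at all.
        return "openAccess"
    if request_access_enabled:
        # Restricted files, but people can use Dataverse's request access feature.
        return "restrictedAccess"
    # Restricted files and the request access feature is disabled.
    # Note that embargoedAccess is never returned.
    return "closedAccess"
```

With this logic, a dataset whose files are restricted and whose access is requested outside of Dataverse (for example through a DUA form) ends up as closedAccess, because the function only looks at Dataverse's own request access setting.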

There are datasets in Dataverse repositories whose files are set to restricted, and people cannot request access through Dataverse's request access feature. The OpenAIRE metadata export for these datasets uses closedAccess, even when the dataset metadata indicates that people can request access by some process that happens outside of Dataverse's request access feature, e.g. submitting a DUA or contacting the author.

This dataset has restricted files and people aren't able to request access through Dataverse's request access feature, so its OpenAIRE metadata indicates that the dataset is closed access. But people are able to request access by filling out a form (Application For The Use of Data), so the dataset isn't really closed access.

Because the metadata says these datasets are closedAccess, once OpenAIRE harvests them they appear and are searchable as closedAccess, grouped with datasets that are more appropriately labelled closedAccess, even though access to their files is only restricted. This may make these datasets harder to find and use, making OpenAIRE EXPLORE less effective for finding datasets published by Dataverse repositories.

We can think of ways for Dataverse to assign Access Rights terms that the Dataverse community considers more appropriate (e.g. Zenodo depositors choose from a drop-down menu). But other data publishers are using these Access Rights terms (or those terms are being applied to the harvested datasets) in a variety of ways that can make the Access Rights filters unhelpful for searching through OpenAIRE EXPLORE. "Open data" already means many different things to different groups. Since these Access Rights terms are used for the benefit of finding data in OpenAIRE EXPLORE, the scope of this issue might involve learning how OpenAIRE might want to improve the definitions and how repositories can use them in more standardized ways.

jggautier · 2019-07-30T14:44:12Z

I wonder if it might be safe to never use "Closed Access", use "Restricted Access" for datasets that have restricted files, and use "Open Access" for all other datasets. Does anyone ever publish datasets whose files can't be accessed at all?

If so, it might help if Dataverse allows depositors to indicate, in a standardized and machine-readable way, that access to restricted files can be requested (even if people need to request access outside of Dataverse's request access feature) or cannot be requested through any means.
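A rough sketch of what that could look like, assuming a hypothetical depositor-supplied field (the `AccessRequest` values and the function name below do not exist in Dataverse; they are invented for this example):

```python
from enum import Enum
from typing import Optional


class AccessRequest(Enum):
    """Hypothetical machine-readable answer to "can access be requested?"."""
    VIA_DATAVERSE = "via_dataverse"          # Dataverse's request access feature
    OUTSIDE_DATAVERSE = "outside_dataverse"  # e.g. a DUA form or contacting the author
    NOT_POSSIBLE = "not_possible"            # access cannot be requested by any means


def proposed_access_right(any_file_restricted: bool,
                          access_request: Optional[AccessRequest] = None) -> str:
    """Illustrative mapping for the idea in this comment, not an agreed design."""
    if not any_file_restricted:
        return "openAccess"
    if access_request is AccessRequest.NOT_POSSIBLE:
        # Only use closedAccess when the depositor explicitly says so.
        return "closedAccess"
    # Assumption: default to restrictedAccess when the depositor says access
    # can be requested, or has not said anything.
    return "restrictedAccess"
```

The design choice here is that closedAccess becomes an explicit statement by the depositor rather than something inferred from the request access setting.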

cmbz · 2024-08-20T15:22:23Z

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

philippconzett · 2024-08-30T05:38:16Z

I only recently became aware of this issue. I think resolving this issue eventually depends on #4391 being resolved first. Thus, to me, it seems the Access Rights terms used by OpenAIRE and others (e.g., BASE Bielefeld) depend on Terms of Use being defined at file-level.

With support for file-level Terms of Use being implemented, I think things would work like this: At the metadata record level, i.e. the registered metadata at dataset or file-level, the metadata should always be licensed with CC0 and thus have the Access Rights term defined as "Open access". At file-level, all of the values can be used, based on the Terms of Use of the individual file in question:

openAccess: If the file is not set to restricted or embargoed, the metadata export at file-level should use "openAccess".
restrictedAccess: If the file is set to restricted and the option to request access is enabled (people are allowed to request access using Dataverse's request access feature), the metadata export at file-level should use "restrictedAccess".
closedAccess: If the file is set to restricted and the option to request access is disabled, the metadata export at file-level should use "closedAccess".
embargoedAccess: If the file is set to embargoed, the metadata export at file-level should use "embargoedAccess".
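As a minimal sketch, the file-level rules above might look like this. The function and parameter names are hypothetical, and checking the embargo first is an assumption, since the list does not say which term wins when a file is both embargoed and restricted.

```python
def file_access_right(restricted: bool,
                      request_access_enabled: bool,
                      embargoed: bool) -> str:
    """Pick an Access Rights term for one file (illustrative only)."""
    if embargoed:
        # Assumption: an embargo takes precedence over the restricted flag.
        return "embargoedAccess"
    if not restricted:
        return "openAccess"
    if request_access_enabled:
        return "restrictedAccess"
    return "closedAccess"
```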

jggautier · 2024-09-09T15:56:44Z

@pdurbin and I talked about this issue in relation to #10737 and #8129. And I agreed that I'd open a new GitHub issue about dc:rights specifically, to help manage these different goals and scopes.

But @philippconzett, what do you think of using this GitHub issue instead, since we're already talking about the use of these "Access Rights terms used by OpenAIRE and others (e.g., BASE Bielefeld)"?

I could re-word this GitHub issue's title so it's clear that the issue is about all uses of these "Access Rights" terms, and edit the first comment for the same reason.

pdurbin · 2024-09-09T20:55:17Z

I wanted to link to something so I went ahead with the idea that this issue represents the unfinished dc:rights work that was originally part of the scope of #8129, which (if all goes well) will be closed by PR #10737.

The next challenge will be to size it, of course, and figure out what the plan is and when. 😅

philippconzett · 2024-09-10T05:58:16Z

@jggautier @pdurbin Thanks for moving this forward. I think both approaches could work, i.e. continuing to use this issue or creating a new one.

jggautier · 2024-09-10T20:43:26Z

Thanks. #4176 is also about changes to what's included in dc:rights and we'll need to consider the points raised there, too.

Next week I'll try to find time to help think about either using this GitHub issue or creating a new one, but with other projects and work travel next week, I'm not sure. I definitely don't have time this week.

jggautier · 2024-10-30T19:03:00Z

So I definitely didn't have time "next week" lol. I'm going to try to sneak some time in today to continue the discussion.

I'm going to keep using this GitHub issue for discussion about how access rights metadata in OpenAIRE metadata is being misapplied.

@philippconzett, I have questions and comments about what you wrote:

> I only recently became aware of this issue. I think resolving this issue eventually depends on #4391 being resolved first. Thus, to me, it seems the Access Rights terms used by OpenAIRE and others (e.g., BASE Bielefeld) depend on Terms of Use being defined at file-level.
>
> With support for file-level Terms of Use being implemented, I think things would work like this: At the metadata record level, i.e. the registered metadata at dataset or file-level, the metadata should always be licensed with CC0 and thus have the Access Rights term defined as "Open access". At file-level, all of the values can be used, based on the Terms of Use of the individual file in question:
>
> openAccess: If the file is not set to restricted or embargoed, the metadata export at file-level should use "openAccess".
>
> restrictedAccess: If the file is set to restricted and the option to request access is enabled (people are allowed to request access using Dataverse's request access feature), the metadata export at file-level should use "restrictedAccess".
>
> closedAccess: If the file is set to restricted and the option to request access is disabled, the metadata export at file-level should use "closedAccess".
>
> embargoedAccess: If the file is set to embargoed, the metadata export at file-level should use "embargoedAccess".

OpenAIRE uses their OpenAIRE standard to determine if a dataset is openAccess, restrictedAccess, closedAccess or embargoedAccess.

It sounds like you're proposing that the OpenAIRE XML exports of datasets would always indicate that the metadata of the dataset is CC0 and "openAccess". Am I understanding that right?

If so, as far as I know, the OpenAIRE standard doesn't have a way to indicate the terms or license of the metadata. As we know, it includes a way to indicate the license or terms of the data that the metadata describes, and I think that's all it can do.

And I think that being able to describe the terms or license of the metadata of the dataset is out of scope here. OpenAIRE's system wants to know the access level of the data in the dataset, using just one of those four access levels. This GitHub issue is about challenges with providing that information to OpenAIRE. The use case I described in this issue's first post assumes that a dataset can be usefully described with just one of the four access levels.

But that model doesn't work when one dataset has data with multiple access levels, right? I think that's the gist of your comments. And if we want to resolve that, then I think it means also working with the OpenAIRE folks so that their systems can support searching for datasets by access level when those datasets have multiple access levels because the datasets' files have multiple access levels.
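To make the mismatch concrete, here is a small sketch with a hypothetical helper name. It takes the per-file access terms of one dataset and returns a single dataset-level term only when all files agree:

```python
from typing import Iterable, Optional


def single_access_level(file_levels: Iterable[str]) -> Optional[str]:
    """Return the one access term shared by all files, or None if there isn't one.

    Hypothetical helper for illustration; not part of Dataverse or OpenAIRE.
    """
    distinct = set(file_levels)
    if len(distinct) == 1:
        return next(iter(distinct))
    # Zero files, or files with different access levels: no single term fits.
    return None
```

A dataset with one open file and one restricted file gives `None` here, which is the case the one-term-per-dataset model can't describe.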

Does all of that make sense? I'd like to make sure before we start thinking about solutions.

jggautier added the Feature: Metadata and Feature: Terms & Licensing labels Jun 6, 2019

jggautier mentioned this issue Jun 6, 2019: Align or merge DataCite metadata exports #5889 (Open)

pdurbin mentioned this issue Aug 6, 2019: Embargo: I want to set an embargo period to control when my data will be accessible. #4052 (Closed)

jggautier mentioned this issue Oct 22, 2020: Improve/update Schema.org JSON-LD export #7349 (Closed)

pdurbin added the Feature: Harvesting label Apr 12, 2022

pdurbin mentioned this issue Apr 13, 2022: Spike: Inventory and prioritize all existing Harvesting related issues IQSS/dataverse-pm#24 (Closed, 3 tasks)

jggautier mentioned this issue Oct 12, 2022: Improving Dataverse's Schema.org JSON-LD schema to enable author names display in Google Dataset Search's #5029 (Closed)

cmbz mentioned this issue Mar 12, 2024: GREI 3: HDV Task - Improve OAI-PMH Harvesting IQSS/dataverse-pm#171 (Open, 56 tasks)

jggautier mentioned this issue Mar 28, 2024: Change Dataverse / Dublin Core mapping to improve OAI-PMH harvesting #8129 (Closed)

cmbz closed this as completed Aug 20, 2024

jggautier mentioned this issue Aug 28, 2024: Remap oai_dc fields dc:type and dc:date #10737 (Merged)

jggautier reopened this Aug 30, 2024

jggautier mentioned this issue Sep 11, 2024: Feature Request: Metadata field for embargoed datasets #10833 (Open)
