Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

10286 add owner info #10322

Merged
merged 30 commits into from
Feb 28, 2024
Merged

10286 add owner info #10322

merged 30 commits into from
Feb 28, 2024

Conversation

sekmiller
Copy link
Contributor

What this PR does / why we need it: Adds info about the owners of an object for use by the SPA to display breadcrumbs

Which issue(s) this PR closes:

Closes #10286 Extend the API to include owner information for dv objects

Special notes for your reviewer:
I added a boolean of "includeOwners" to the json printers of the various objects. Per a suggestion from Guillermo I didn't include the object itself in the owners array.

Suggestions on how to test this: add a query param of "returnOwners=true" to any of the get object apis. You should see an array of each owning object up to the root dv collection.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@coveralls
Copy link

coveralls commented Feb 14, 2024

Coverage Status

coverage: 20.27% (-0.005%) from 20.275%
when pulling 9790d23 on 10286-add-owner-info
into de45c13 on develop.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@GPortas GPortas added pm.GREI-d-2.7.1 NIH, yr2, aim7, task1: R&D UI modules for creating datasets and supporting publishing workflows pm.GREI-d-2.7.2 NIH, yr2, aim7, task2: Implement UI modules for creating datasets and publishing workflows SPA These changes are required for the Dataverse SPA labels Feb 15, 2024
src/main/java/edu/harvard/iq/dataverse/api/Datasets.java Outdated Show resolved Hide resolved
}
if (dvo.isInstanceofDataverse()){
Dataverse in = (Dataverse) dvo;
ownerObject.add("identifier", in.getAlias());
Copy link

@MellyGray MellyGray Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be beneficial to include the version as well to the ownerObject, because for a file, its owner might be the dataset doi:10.5072/FK2/WTBMGC with version=2.0, but its owner is not the same dataset at an earlier version such as 1.0, because the file was added in a later version. I hope I'm making myself clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the logic for including a version number, but since we're omitting the base object from the isPartOf it can only apply to Files, however there's no endpoint for getting a file by its version like there is for dataset - the current result is that if I use the getFileData (@path("{id}")) it gets me the latest available version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MellyGray @GPortas I think I need #10280 to be completed before I can finish this the way it needs to be finished.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sekmiller , I was testing the last changes. identifier includes both the identifier and the version, this can complicate parsing and utilizing these distinct pieces of information separately. I suggest separating the version information from the DOI identifier

{
  "type": DvObjectType.DATASET,
  "identifier": "doi:10.5072/FK2/HEGZLV",
  "version": "DRAFT",  // New dedicated property for version information
  "displayName": "First Dataset",
  "isPartOf": {
    "type": DvObjectType.DATAVERSE,
    "identifier": "root",
    "displayName": "Root"
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I can do that. The reason I had put them together is that the url for the dataset page currently looks like this: http://localhost:8080/dataset.xhtml?persistentId=doi:10.5072/FK2/UE0DKV&version=2.0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, @sekmiller! I see where you're coming from, but we're aiming to follow an API-first approach to decouple the API implementation from the UI use cases, ensuring our API remains flexible for various applications. We don't know who is going to consume this endpoint, so we should keep these two pieces of information separate to allow them to be used independently. Thanks for implementing the suggestion

@MellyGray MellyGray removed their assignment Feb 20, 2024
return response( req -> {
final Dataset retrieved = execCommand(new GetDatasetCommand(req, findDatasetOrDie(id)));
final DatasetVersion latest = execCommand(new GetLatestAccessibleDatasetVersionCommand(req, retrieved));
final JsonObjectBuilder jsonbuilder = json(retrieved);
Boolean includeOwners = returnOwners == null ? false : returnOwners;
Copy link
Contributor

@GPortas GPortas Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can directly use the primitive boolean type: @QueryParam("returnOwners") boolean returnOwners

And thus avoid the null check. If the query param is not set in the URL, false is the default value.

Same for other endpoints in this PR.

@@ -610,10 +610,12 @@ private Dataset parseDataset(String datasetJson) throws WrappedResponse {
@GET
@AuthRequired
@Path("{identifier}")
public Response viewDataverse(@Context ContainerRequestContext crc, @PathParam("identifier") String idtf) {
public Response viewDataverse(@Context ContainerRequestContext crc, @PathParam("identifier") String idtf, @QueryParam("returnOwners") Boolean returnOwners) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with method naming in other places, why not rename the endpoint method from "viewDataverse" to "getDataverse"?

if (dvo.isInstanceofDataset()){
ownerObject.add("type", "DATASET");
}
if (dvo.isInstanceofDataFile()){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an indentation issue in this line


for (DvObject dvo : ownerList){
JsonObjectBuilder ownerObject = jsonObjectBuilder();
if (dvo.isInstanceofDataverse()){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would extract the type check to a separate method to improve readability, and try to use a switch statement instead of these ifs. As far as I know a dvObject can't be of more than one type, and by having these ifs without else ifs it seems that a dvObject can have multiple types which is confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified it here because since we're ignoring the object itself we can only have a Dataverse or a Dataset

This comment has been minimized.

@sekmiller sekmiller assigned GPortas and unassigned sekmiller Feb 26, 2024

This comment has been minimized.

2 similar comments

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

@GPortas GPortas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for applying the requested changes.

Although overall it looks good, I have left a couple of comments.

I have tested the following endpoints and they work as expected:

Files
curl "http://localhost:8080/api/v1/files/5?returnOwners=true"

The payload includes the isPartOf object, which looks good:

"isPartOf":{
  "type":"DATASET",
  "persistentIdentifier":"doi:10.5072/FK2/JSKMZS",
  "identifier":2,
  "version":"1.0",
  "displayName":"test",
  "isPartOf":{
     "type":"DATAVERSE",
     "identifier":"root",
     "displayName":"Root"
  }
}

Dataset version
curl "http://localhost:8080/api/v1/datasets/2/versions/:latest?includeFiles=false&returnOwners=true"

The isPartOf object looks good:

"isPartOf": {
      "type": "DATAVERSE",
      "identifier": "root",
      "displayName": "Root"
},

Collections

First, I requested the root collection (curl "http://localhost:8080/api/v1/dataverses/root?returnOwners=true").

No owners retrieved, which makes sense.

Then, I created a sub collection root called "child", and a sub collection in "child" called "child1", and called the endpoint for "child1". (curl "http://localhost:8080/api/v1/dataverses/child1?returnOwners=true")

The isPartOf looks good:

"isPartOf": {
      "type": "DATAVERSE",
      "identifier": "child",
      "displayName": "child",
      "isPartOf": {
        "type": "DATAVERSE",
        "identifier": "root",
        "displayName": "Root"
      }
    },

@@ -421,6 +421,7 @@ public Response getVersion(@Context ContainerRequestContext crc,
@PathParam("versionId") String versionId,
@QueryParam("excludeFiles") Boolean excludeFiles,
@QueryParam("includeDeaccessioned") boolean includeDeaccessioned,
@QueryParam("returnOwners") boolean includeOwners,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename the parameter includeOwners to returnOwners to match the query param, as in other endpoints.

@@ -610,10 +610,12 @@ private Dataset parseDataset(String datasetJson) throws WrappedResponse {
@GET
@AuthRequired
@Path("{identifier}")
public Response viewDataverse(@Context ContainerRequestContext crc, @PathParam("identifier") String idtf) {
public Response getDataverse(@Context ContainerRequestContext crc, @PathParam("identifier") String idtf, @QueryParam("returnOwners") Boolean returnOwners) {
Boolean includeOwners = returnOwners == null ? false : returnOwners;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid this using boolean primitive, as I mentioned in the first review.

public static JsonObjectBuilder json(DatasetVersion dsv, List<String> anonymizedFieldTypeNamesList, boolean includeFiles, boolean includeOwners) {
/* return json(dsv, null, includeFiles, null);
}
public static JsonObjectBuilder json(DatasetVersion dsv, List<String> anonymizedFieldTypeNamesList, boolean includeFiles, Long numberOfFiles) {*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this commented code?

.body("data.dataFile.filesize", equalTo(8361))
.statusCode(OK.getStatusCode());

getFileDataResponse.then().assertThat().body("data.dataFile.isPartOf.identifier", equalTo(datasetId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an assertion to check the persistentIdentifier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed this one, sorry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. it's in now...

.. code-block:: bash

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/files/:persistentId/versions/:draft?persistentId=doi:10.5072/FK2/J8SJZB&returnOwners=true"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docs for getPrivateUrlDatasetVersion are missing, if I am not wrong.

@GPortas GPortas assigned sekmiller and unassigned GPortas Feb 27, 2024

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

Copy link

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:10286-add-owner-info
ghcr.io/gdcc/configbaker:10286-add-owner-info

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@sekmiller sekmiller assigned GPortas and unassigned sekmiller Feb 27, 2024
Copy link
Contributor

@GPortas GPortas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for applying the changes @sekmiller.

@GPortas GPortas removed their assignment Feb 27, 2024
@jp-tosca jp-tosca self-assigned this Feb 28, 2024
@jp-tosca jp-tosca merged commit dcd7c22 into develop Feb 28, 2024
12 checks passed
@pdurbin pdurbin added this to the 6.2 milestone Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pm.GREI-d-2.7.1 NIH, yr2, aim7, task1: R&D UI modules for creating datasets and supporting publishing workflows pm.GREI-d-2.7.2 NIH, yr2, aim7, task2: Implement UI modules for creating datasets and publishing workflows SPA These changes are required for the Dataverse SPA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend the API to retrieve owner info for files, datasets and collections
7 participants