Commit

bring updates across from GTN
elichad committed Mar 26, 2024
1 parent 6743520 commit a87ec99
Showing 1 changed file with 25 additions and 13 deletions.
38 changes: 25 additions & 13 deletions README.md
@@ -22,9 +22,9 @@ pip install .

### Creating an RO-Crate

In its simplest form, an RO-Crate is a directory tree with an `ro-crate-metadata.json` file at the top level that contains metadata about the other files and directories, represented by [data entities](https://www.researchobject.org/ro-crate/1.1/data-entities.html). These metadata consist both of properties of the data entities themselves and of other, non-digital entities called [contextual entities](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html) (representing, e.g., a person or an organization).
In its simplest form, an RO-Crate is a directory tree with an `ro-crate-metadata.json` file at the top level. This file contains metadata about the other files and directories, represented by [data entities](https://www.researchobject.org/ro-crate/1.1/data-entities.html). These metadata consist both of properties of the data entities themselves and of other, non-digital entities called [contextual entities](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html). A contextual entity can represent, for instance, a person, an organization or an event.

Suppose Alice and Bob worked on a research task together, which resulted in a manuscript written by both; additionally, Alice prepared a spreadsheet containing the experimental data, which Bob used to generate a diagram. For the purpose of this tutorial, you can just create dummy files for the documents:
Suppose Alice and Bob worked on a research task together, which resulted in a manuscript written by both; additionally, Alice prepared a spreadsheet containing the experimental data, which Bob used to generate a diagram. For the purpose of this tutorial, you can just create placeholder files for the documents:

```bash
mkdir exp
@@ -70,36 +70,46 @@ bob = crate.add(Person(crate, bob_id, properties={
}))
```

Next, we express authorship of the various files:
At this point, we have a representation of the various entities. Now we need to express the relationships between them. This is done by adding properties that reference other entities:

```python
paper["author"] = [alice, bob]
table["author"] = alice
diagram["author"] = bob
```
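
As a rough illustration of what these assignments amount to in the serialized metadata, the sketch below builds the corresponding JSON-LD fragments by hand, using only the standard library. It does not call ro-crate-py. The file name `paper.pdf`, the helper names `ref` and `resolve`, and the assignment of the two ORCID identifiers to Alice and Bob are assumptions made for the example. The point is that a property such as `author` holds references of the form `{"@id": ...}` rather than nested copies of the referenced entities.

```python
import json

# Hypothetical identifiers, chosen for illustration only.
ALICE_ID = "https://orcid.org/0000-0000-0000-0000"
BOB_ID = "https://orcid.org/0000-0000-0000-0001"


def ref(entity_id):
    """Return a JSON-LD reference to the entity with the given @id."""
    return {"@id": entity_id}


def resolve(graph, reference):
    """Look up the entity that a JSON-LD reference points to."""
    index = {entity["@id"]: entity for entity in graph}
    return index[reference["@id"]]


# Hand-written stand-ins for the JSON-LD of two Person entities and one File.
alice = {"@id": ALICE_ID, "@type": "Person", "name": "Alice"}
bob = {"@id": BOB_ID, "@type": "Person", "name": "Bob"}
paper = {
    "@id": "paper.pdf",  # assumed file name
    "@type": "File",
    "author": [ref(ALICE_ID), ref(BOB_ID)],
}
graph = [paper, alice, bob]

# Following the references recovers the full author entities.
author_names = [resolve(graph, r)["name"] for r in paper["author"]]

print(json.dumps(paper, indent=2))
print(author_names)
```

Because entities are linked by identifier, each person is described once in the graph, however many files reference them.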

You can also add whole directories together with their contents. In an RO-Crate, a directory is represented by the `Dataset` entity. Create a directory with some placeholder files:

```bash
mkdir exp/logs
touch exp/logs/log1.txt
touch exp/logs/log2.txt
```

Now add it to the crate:

```python
logs = crate.add_dataset("exp/logs")
```

Finally, we serialize the crate to disk:

```python
crate.write("exp_crate")
```

Now the `exp_crate` directory should contain copies of the three files and an `ro-crate-metadata.json` file with a JSON-LD serialization of the entities and relationships we created, according to the RO-Crate profile. Note that we have chosen a different destination path for the diagram, while the other two files have been placed at the top level with their names unchanged (the default).
Now the `exp_crate` directory should contain copies of all the files we added and an `ro-crate-metadata.json` file with a [JSON-LD](https://json-ld.org) representation of the entities and relationships we created. Note that we have chosen a different destination path for the diagram, while the other two files have been placed at the top level with their names unchanged (the default).

Some applications and services support RO-Crates stored as archives. To save the crate in zip format, use `write_zip`:
Exploring the `exp_crate` directory, we see that all files and directories contained in `exp/logs` have been added recursively to the crate. However, in the `ro-crate-metadata.json` file, only the top-level Dataset with `@id` `"exp/logs"` is listed. This is because we used `crate.add_dataset("exp/logs")` rather than adding every file individually. There is no requirement to represent every file and folder within the crate in the `ro-crate-metadata.json` file; in fact, if the crate contained many files, it would be impractical to do so.

```python
crate.write_zip("exp_crate.zip")
```
If you do want to add files and directories recursively to the metadata, use `crate.add_tree` instead of `crate.add_dataset` (but note that it only works on local directory trees).
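
To see which files in a written crate are payload only, that is, present on disk but not described in the metadata, you can compare the directory contents with the `@id` values in the `@graph`. The following is a minimal sketch using only the standard library. The helper name `undescribed_files` is hypothetical, and the tiny crate it inspects is written by hand so that the example stays self-contained; it is not real ro-crate-py output. The trailing slash on the Dataset identifier is also an assumption, following the RO-Crate convention for directory identifiers.

```python
import json
import tempfile
from pathlib import Path

METADATA_NAME = "ro-crate-metadata.json"


def undescribed_files(crate_dir):
    """Return relative paths of files on disk with no entity in the metadata."""
    crate_dir = Path(crate_dir)
    with open(crate_dir / METADATA_NAME) as f:
        graph = json.load(f)["@graph"]
    described = {entity["@id"] for entity in graph}
    on_disk = {
        p.relative_to(crate_dir).as_posix()
        for p in crate_dir.rglob("*")
        if p.is_file()
    }
    return sorted(on_disk - described)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hand-written stand-in for a crate containing one directory of logs.
    (root / "exp" / "logs").mkdir(parents=True)
    (root / "exp" / "logs" / "log1.txt").touch()
    (root / "exp" / "logs" / "log2.txt").touch()
    metadata = {
        "@graph": [
            {"@id": METADATA_NAME, "@type": "CreativeWork"},
            {"@id": "./", "@type": "Dataset"},
            # Only the directory itself is described, not its contents.
            {"@id": "exp/logs/", "@type": "Dataset"},
        ]
    }
    (root / METADATA_NAME).write_text(json.dumps(metadata))
    missing = undescribed_files(root)

print(missing)
```

In this hand-built example the two log files are reported, since only their parent directory has an entity in the graph.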

You can also add whole directories. A directory in RO-Crate is represented by the `Dataset` entity:
Some applications and services support RO-Crates stored as archives. To save the crate in zip format, use `write_zip`:

```python
logs = crate.add_dataset("exp/logs")
crate.write_zip("exp_crate.zip")
```
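
One way to sanity-check such an archive is to confirm that the metadata file sits at the top level of the zip, assuming the archive mirrors the crate directory layout. The sketch below uses only the standard library `zipfile` module. The function name `has_top_level_metadata` is hypothetical, and both archives are built in memory by hand rather than by `write_zip`, so that the example can run on its own.

```python
import io
import json
import zipfile

METADATA_NAME = "ro-crate-metadata.json"


def has_top_level_metadata(zip_source):
    """Return True if the archive has the metadata file at its top level."""
    with zipfile.ZipFile(zip_source) as zf:
        return METADATA_NAME in zf.namelist()


# Hand-built archive with the metadata file at the top level.
flat = io.BytesIO()
with zipfile.ZipFile(flat, "w") as zf:
    zf.writestr(METADATA_NAME, json.dumps({"@graph": []}))
    zf.writestr("exp/logs/log1.txt", "")
flat.seek(0)
flat_ok = has_top_level_metadata(flat)

# Hand-built archive where everything is wrapped in an extra directory.
nested = io.BytesIO()
with zipfile.ZipFile(nested, "w") as zf:
    zf.writestr("exp_crate/" + METADATA_NAME, json.dumps({"@graph": []}))
nested.seek(0)
nested_ok = has_top_level_metadata(nested)

print(flat_ok, nested_ok)
```

The same function accepts a path on disk, so it could be pointed at a file such as `exp_crate.zip` once that has been written.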

Note that the above adds all files and directories contained in `"exp/logs"` recursively to the crate, but only the top-level `"exp/logs"` dataset itself is listed in the metadata file (there is no requirement to represent every file and folder in the JSON-LD). To also add files and directories recursively to the metadata, use `add_tree` (but note that it only works on local directory trees).

#### Appending elements to property values

What ro-crate-py entities actually store is their JSON representation:
@@ -255,7 +265,9 @@ https://orcid.org/0000-0000-0000-0000 Person
https://orcid.org/0000-0000-0000-0001 Person
```

The first two entities shown in the output are the [root data entity](https://www.researchobject.org/ro-crate/1.1/root-data-entity.html) and the [metadata file descriptor](https://www.researchobject.org/ro-crate/1.1/metadata.html), respectively. These are special entities managed by the `ROCrate` object, and are always present. The other entities are the ones we added in the [section on RO-Crate creation](#creating-an-ro-crate). You can access data entities with `crate.data_entities` and contextual entities with `crate.contextual_entities`. For instance:
The first two entities shown in the output are the [root data entity](https://www.researchobject.org/ro-crate/1.1/root-data-entity.html) and the [metadata file descriptor](https://www.researchobject.org/ro-crate/1.1/metadata.html), respectively. The former represents the whole crate, while the latter represents the metadata file. These are special entities managed by the `ROCrate` object, and are always present. The other entities are the ones we added in the [section on RO-Crate creation](#creating-an-ro-crate).

As shown above, `get_entities` lets you iterate over all entities in the crate. You can also access only the data entities with `crate.data_entities` and only the contextual entities with `crate.contextual_entities`. For instance:

```python
for e in crate.data_entities: