diff --git a/README.md b/README.md
index 2fcbc79..be61970 100644
--- a/README.md
+++ b/README.md
@@ -22,9 +22,9 @@ pip install .
 
 ### Creating an RO-Crate
 
-In its simplest form, an RO-Crate is a directory tree with an `ro-crate-metadata.json` file at the top level that contains metadata about the other files and directories, represented by [data entities](https://www.researchobject.org/ro-crate/1.1/data-entities.html). These metadata consist both of properties of the data entities themselves and of other, non-digital entities called [contextual entities](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html) (representing, e.g., a person or an organization).
+In its simplest form, an RO-Crate is a directory tree with an `ro-crate-metadata.json` file at the top level. This file contains metadata about the other files and directories, represented by [data entities](https://www.researchobject.org/ro-crate/1.1/data-entities.html). These metadata consist both of properties of the data entities themselves and of other, non-digital entities called [contextual entities](https://www.researchobject.org/ro-crate/1.1/contextual-entities.html). A contextual entity can represent, for instance, a person, an organization or an event.
 
-Suppose Alice and Bob worked on a research task together, which resulted in a manuscript written by both; additionally, Alice prepared a spreadsheet containing the experimental data, which Bob used to generate a diagram. For the purpose of this tutorial, you can just create dummy files for the documents:
+Suppose Alice and Bob worked on a research task together, which resulted in a manuscript written by both; additionally, Alice prepared a spreadsheet containing the experimental data, which Bob used to generate a diagram. For the purpose of this tutorial, you can just create placeholder files for the documents:
 
 ```bash
 mkdir exp
@@ -70,7 +70,7 @@ bob = crate.add(Person(crate, bob_id, properties={
 }))
 ```
 
-Next, we express authorship of the various files:
+At this point, we have a representation of the various entities. Now we need to express the relationships between them. This is done by adding properties that reference other entities:
 
 ```python
 paper["author"] = [alice, bob]
@@ -78,28 +78,38 @@ table["author"] = alice
 diagram["author"] = bob
 ```
 
+You can also add whole directories together with their contents. In an RO-Crate, a directory is represented by the `Dataset` entity. Create a directory with some placeholder files:
+
+```bash
+mkdir exp/logs
+touch exp/logs/log1.txt
+touch exp/logs/log2.txt
+```
+
+Now add it to the crate:
+
+```python
+logs = crate.add_dataset("exp/logs")
+```
+
 Finally, we serialize the crate to disk:
 
 ```python
 crate.write("exp_crate")
 ```
 
-Now the `exp_crate` directory should contain copies of the three files and an `ro-crate-metadata.json` file with a JSON-LD serialization of the entities and relationships we created, according to the RO-Crate profile. Note that we have chosen a different destination path for the diagram, while the other two files have been placed at the top level with their names unchanged (the default).
+Now the `exp_crate` directory should contain copies of all the files we added and an `ro-crate-metadata.json` file with a [JSON-LD](https://json-ld.org) representation of the entities and relationships we created. Note that we have chosen a different destination path for the diagram, while the other two files have been placed at the top level with their names unchanged (the default).
 
-Some applications and services support RO-Crates stored as archives. To save the crate in zip format, use `write_zip`:
+Exploring the `exp_crate` directory, we see that all files and directories contained in `exp/logs` have been added recursively to the crate. However, in the `ro-crate-metadata.json` file, only the top level Dataset with `@id` `"exp/logs"` is listed. This is because we used `crate.add_dataset("exp/logs")` rather than adding every file individually. There is no requirement to represent every file and folder within the crate in the `ro-crate-metadata.json` file - in fact, if there were many files in the crate it would be impractical to do so.
 
-```python
-crate.write_zip("exp_crate.zip")
-```
+If you do want to add files and directories recursively to the metadata, use `crate.add_tree` instead of `crate.add_dataset` (but note that it only works on local directory trees).
 
-You can also add whole directories. A directory in RO-Crate is represented by the `Dataset` entity:
+Some applications and services support RO-Crates stored as archives. To save the crate in zip format, use `write_zip`:
 
 ```python
-logs = crate.add_dataset("exp/logs")
+crate.write_zip("exp_crate.zip")
 ```
 
-Note that the above adds all files and directories contained in `"exp/logs"` recursively to the crate, but only the top-level `"exp/logs"` dataset itself is listed in the metadata file (there is no requirement to represent every file and folder in the JSON-LD). To also add files and directory recursively to the metadata, use `add_tree` (but note that it only works on local directory trees).
-
 #### Appending elements to property values
 
 What ro-crate-py entities actually store is their JSON representation:
@@ -255,7 +265,9 @@ https://orcid.org/0000-0000-0000-0000 Person
 https://orcid.org/0000-0000-0000-0001 Person
 ```
 
-The first two entities shown in the output are the [root data entity](https://www.researchobject.org/ro-crate/1.1/root-data-entity.html) and the [metadata file descriptor](https://www.researchobject.org/ro-crate/1.1/metadata.html), respectively. These are special entities managed by the `ROCrate` object, and are always present. The other entities are the ones we added in the [section on RO-Crate creation](#creating-an-ro-crate). You can access data entities with `crate.data_entities` and contextual entities with `crate.contextual_entities`. For instance:
+The first two entities shown in the output are the [root data entity](https://www.researchobject.org/ro-crate/1.1/root-data-entity.html) and the [metadata file descriptor](https://www.researchobject.org/ro-crate/1.1/metadata.html), respectively. The former represents the whole crate, while the latter represents the metadata file. These are special entities managed by the `ROCrate` object, and are always present. The other entities are the ones we added in the [section on RO-Crate creation](#creating-an-ro-crate).
+
+As shown above, `get_entities` allows to iterate over all entities in the crate. You can also access only data entities with `crate.data_entities` and only contextual entities with `crate.contextual_entities`. For instance:
 
 ```python
 for e in crate.data_entities: