Added documentation for datasets #280
Merged
58 commits
f236698
Updated dataset, including the following changes:
jesper-friis 94fa59a
Added new TableDoc class providing a table interface for data documen…
jesper-friis 028054f
Import indir/outdir inside test functions
jesper-friis ef5239a
Fixed doctest issue
jesper-friis 331878a
Skip test_tabledoc if rdflib isn't available
jesper-friis 5fe9cf7
More pylint fixes...
jesper-friis 4aaeed8
Placed importskip before importing EMMO
jesper-friis 0f21fbb
typo
jesper-friis 9e34414
Merge branch 'master' into tabledoc
jesper-friis 4cc88cb
Fixed pylint errors
jesper-friis 92b213d
added csv file
jesper-friis ae20a0a
Added csv parser
jesper-friis 543e99e
Updated the test
jesper-friis b3e3d07
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] 700c514
Fixed failing tests
jesper-friis 0320905
Merge branch 'tabledoc-csv' of github.com:EMMC-ASBL/tripper into tabl…
jesper-friis 4d7d77a
Added encoding to keyword arguments
jesper-friis 8004867
Strip off blanks when parsing a table.
jesper-friis 731253c
Added extra test to ensure that all properties are parsed correctly
jesper-friis 60b0c6d
Added write_csv() method to TableDoc
jesper-friis d26d92f
Save serialised documentation to turtle file.
jesper-friis 66b9dd7
Apply suggestions from code review
jesper-friis 575f09d
Apply suggestions from code review
jesper-friis f45376d
Added a clarifying comment as a responce to review comment by @torhaugl.
jesper-friis fa5a5c0
Merge branch 'tabledoc' into tabledoc-csv
jesper-friis 1752db0
Fix test failure
jesper-friis 33600be
Merge branch 'master' into tabledoc
jesper-friis 33181b5
Merge branch 'tabledoc' into tabledoc-csv
jesper-friis 9b53f5e
Added `context` argument to get_jsonld_context()
jesper-friis 26ee518
Added `context` argument to get_prefixes()
jesper-friis 568abd7
Added `context? argument to get_shortnames()
jesper-friis 2988a32
Updated .gitignore files
jesper-friis 36736e7
Merge branch 'tabledoc-csv' into dataset-todos
jesper-friis 841a74d
Added documentation for the dataset sub-package
jesper-friis 39c9c1a
Added return annotation to utils.openfile()
jesper-friis 4302dde
Try to avoid pytest failure during collection phase.
jesper-friis 8f727c7
Remove --ignore=examples from pytest options in pyproject.toml
jesper-friis 065e893
Fix CI doctest bug
torhaugl dd92304
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] 4241295
Use relative import from __init__.py file
jesper-friis 38e0483
Updated documentation
jesper-friis 8756727
Added types (literal/iri) to datadoc-keywords.md and reordered contex…
jesper-friis ff4d077
Separated the data documentation introduction into an own page.
jesper-friis c7709ae
Added a section about customisation to the documentation
jesper-friis 1a1cbad
Update docs/dataset/customisation.md
jesper-friis 3f30c00
Update docs/dataset/customisation.md
jesper-friis c69dd23
Documented custum context
jesper-friis 1583942
Merge branch 'dataset-docs' of github.com:EMMC-ASBL/tripper into data…
jesper-friis 5686804
Added example with custom context
jesper-friis abbef4b
Correct example
jesper-friis cfb2419
Merge branch 'master' into tabledoc-csv
jesper-friis 85a51ae
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] cf78a86
Merge branch 'tabledoc-csv' into dataset-todos
jesper-friis 9561f1f
Merge branch 'dataset-todos' into dataset-docs
jesper-friis b61b00c
Merge branch 'master' into dataset-todos
jesper-friis 13d43cf
Merge branch 'dataset-todos' into dataset-docs
jesper-friis d2d9618
Merge branch 'master' into dataset-docs
jesper-friis e054557
Removed duplicated test
jesper-friis
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # triplestore_extend | ||
|
|
||
| ::: tripper.triplestore_extend |
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,215 @@ | ||
| Customisations | ||
| ============== | ||
|
|
||
|
|
||
| User-defined prefixes | ||
| --------------------- | ||
| A namespace prefix is a mapping from a *prefix* to a *namespace URL*. | ||
| For example | ||
|
|
||
| owl: http://www.w3.org/2002/07/owl# | ||
|
|
||
| Tripper already includes a default list of [predefined prefixes]. | ||
| Additional prefixes can be provided in two ways. | ||
|
|
||
| ### With the `prefixes` argument | ||
| Several functions in the API (like [save_dict()], [as_jsonld()] and [TableDoc.parse_csv()]) take a `prefixes` argument with which additional namespace prefixes can be provided. | ||
|
|
||
| This may be handy when used from the Python API. | ||
|
|
||
|
|
||
| ### With custom context | ||
| Additional prefixes can also be provided via a custom JSON-LD context as a `"prefix": "namespace URL"` mapping. | ||
|
|
||
| See [User-defined keywords] for how this is done. | ||
|
|
||
|
|
||
| User-defined keywords | ||
| --------------------- | ||
| Tripper already includes a long list of [predefined keywords] that are defined in the [default JSON-LD context]. | ||
| How to define new concepts in a JSON-LD context is described in the [JSON-LD 1.1](https://www.w3.org/TR/json-ld11/) specification, and definitions can be tested in the [JSON-LD Playground](https://json-ld.org/playground/). | ||
|
|
||
| A new custom keyword can be added by providing a mapping in a custom JSON-LD context from the keyword to the IRI of the corresponding concept in an ontology. | ||
|
|
||
| Let's assume that you already have a domain ontology with base IRI http://example.com/myonto# that defines the concepts for the keywords you want to use for the data documentation. | ||
|
|
||
| First, you can add the prefix for the base IRI of your domain ontology to a custom JSON-LD context | ||
|
|
||
| "myonto": "http://example.com/myonto#", | ||
|
|
||
| How the keywords should be specified in the context depends on whether they correspond to a data property or an object property in the ontology and whether a given datatype is expected. | ||
|
|
||
| ### Simple literal | ||
| Simple literal keywords correspond to data properties with no specific datatype (just a plain string). | ||
|
|
||
| Assume you want to add the keyword `batchNumber` to relate documented samples to the number assigned to the batch they are taken from. | ||
| It corresponds to the data property http://example.com/myonto#batchNumber in your domain ontology. | ||
| By adding the following mapping to your custom JSON-LD context, `batchNumber` becomes available as a keyword for your data documentation: | ||
|
|
||
| "batchNumber": "myonto:batchNumber", | ||
|
|
||
| ### Literal with specific datatype | ||
| If `batchNumber` must always be an integer, you can specify this by replacing the above mapping with the following: | ||
|
|
||
| "batchNumber": { | ||
| "@id": "myonto:batchNumber", | ||
| "@type": "xsd:integer" | ||
| }, | ||
|
|
||
| Here "@id" refers to the IRI that `batchNumber` is mapped to and "@type" to its datatype. In this case we use `xsd:integer`, which is defined in the W3C `xsd` vocabulary. | ||
|
|
||
| ### Object property | ||
| Object properties are relations between two individuals in the knowledge base. | ||
|
|
||
| If you want to say more about the batches, you may want to store them as individuals in the knowledge base. | ||
| In that case, you may want to add a keyword `fromBatch` which relates your sample to the batch it was taken from. | ||
| In your ontology you may define `fromBatch` as an object property with IRI: http://example.com/myonto#fromBatch. | ||
|
|
||
|
|
||
| "fromBatch": { | ||
| "@id": "myonto:fromBatch", | ||
| "@type": "@id" | ||
| }, | ||
|
|
||
| Here the special value "@id" for "@type" means that the value of `fromBatch` must be an IRI. | ||
|
|
||
|
|
||
| Providing a custom context | ||
| -------------------------- | ||
| A custom context can be provided for all the interfaces described in the section [Documenting a resource]. | ||
|
|
||
| ### Python dict | ||
| For both single-resource and multi-resource dicts, you can add a `"@context"` key to the dict whose value is | ||
| - a string containing a resolvable URL to the custom context, | ||
| - a dict with the custom context or | ||
| - a list of the aforementioned strings and dicts. | ||
|
|
||
| ### YAML file | ||
| Since the YAML representation is just a YAML serialisation of a multi-resource dict, custom context can be provided by adding a `"@context"` keyword. | ||
|
|
||
| For example, the following YAML file defines a custom context defining the `myonto` prefix as well as the `batchNumber` and `fromBatch` keywords. | ||
| An additional "kb" prefix (used for documented resources) is defined with the `prefixes` keyword. | ||
|
|
||
| ```yaml | ||
| --- | ||
|
|
||
| # Custom context | ||
| "@context": | ||
| myonto: http://example.com/myonto# | ||
|
|
||
| batchNumber: | ||
| "@id": myonto:batchNumber | ||
| "@type": xsd:integer | ||
|
|
||
| fromBatch: | ||
| "@id": myonto:fromBatch | ||
| "@type": "@id" | ||
|
|
||
|
|
||
| # Additional prefixes | ||
| prefixes: | ||
| kb: http://example.com/kb# | ||
|
|
||
|
|
||
| resources: | ||
| # Samples | ||
| - "@id": kb:sampleA | ||
| "@type": chameo:Sample | ||
| fromBatch: kb:batch1 | ||
|
|
||
| - "@id": kb:sampleB | ||
| "@type": chameo:Sample | ||
| fromBatch: kb:batch1 | ||
|
|
||
| - "@id": kb:sampleC | ||
| "@type": chameo:Sample | ||
| fromBatch: kb:batch2 | ||
|
|
||
| # Batches | ||
| - "@id": kb:batch1 | ||
| "@type": myonto:Batch | ||
| batchNumber: 1 | ||
|
|
||
| - "@id": kb:batch2 | ||
| "@type": myonto:Batch | ||
| batchNumber: 2 | ||
| ``` | ||
|
|
||
| You can save this documentation, including its custom context, to a triplestore with | ||
|
|
||
| ```python | ||
| >>> from tripper import Triplestore | ||
| >>> from tripper.dataset import save_datadoc | ||
| >>> | ||
| >>> ts = Triplestore("rdflib") | ||
| >>> save_datadoc( # doctest: +ELLIPSIS | ||
| ... ts, | ||
| ... "https://raw.githubusercontent.com/EMMC-ASBL/tripper/refs/heads/dataset-docs/tests/input/custom_context.yaml", | ||
| ... ) | ||
| AttrDict(...) | ||
|
|
||
| ``` | ||
|
|
||
| The content of the triplestore should now be | ||
|
|
||
| ```python | ||
| >>> print(ts.serialize()) | ||
| @prefix chameo: <https://w3id.org/emmo/domain/characterisation-methodology/chameo#> . | ||
| @prefix kb: <http://example.com/kb#> . | ||
| @prefix myonto: <http://example.com/myonto#> . | ||
| @prefix owl: <http://www.w3.org/2002/07/owl#> . | ||
| @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . | ||
| <BLANKLINE> | ||
| kb:sampleA a owl:NamedIndividual, | ||
| chameo:Sample ; | ||
| myonto:fromBatch kb:batch1 . | ||
| <BLANKLINE> | ||
| kb:sampleB a owl:NamedIndividual, | ||
| chameo:Sample ; | ||
| myonto:fromBatch kb:batch1 . | ||
| <BLANKLINE> | ||
| kb:sampleC a owl:NamedIndividual, | ||
| chameo:Sample ; | ||
| myonto:fromBatch kb:batch2 . | ||
| <BLANKLINE> | ||
| kb:batch2 a myonto:Batch, | ||
| owl:NamedIndividual ; | ||
| myonto:batchNumber 2 . | ||
| <BLANKLINE> | ||
| kb:batch1 a myonto:Batch, | ||
| owl:NamedIndividual ; | ||
| myonto:batchNumber 1 . | ||
| <BLANKLINE> | ||
| <BLANKLINE> | ||
|
|
||
| ``` | ||
|
|
||
|
|
||
| ### Table | ||
| TODO | ||
|
|
||
|
|
||
|
|
||
| User-defined resource types | ||
| --------------------------- | ||
| TODO | ||
|
|
||
| Extending the list of predefined [resource types] is not implemented yet. | ||
|
|
||
| Since JSON-LD is not designed for categorisation, new resource types should not be added in a custom JSON-LD context. | ||
| Instead, the list of available resource types should be stored and retrieved from the knowledge base. | ||
|
|
||
|
|
||
|
|
||
| [Documenting a resource]: ../documenting-a-resource | ||
| [With custom context]: #with-custom-context | ||
| [User-defined keywords]: #user-defined-keywords | ||
| [resource types]: ../introduction#resource-types | ||
| [predefined prefixes]: ../prefixes | ||
| [predefined keywords]: ../keywords | ||
| [save_dict()]: ../../api_reference/dataset/dataset/#tripper.dataset.dataset.save_dict | ||
| [as_jsonld()]: ../../api_reference/dataset/dataset/#tripper.dataset.dataset.as_jsonld | ||
| [save_datadoc()]: | ||
| ../../api_reference/dataset/dataset/#tripper.dataset.dataset.save_datadoc | ||
| [TableDoc.parse_csv()]: ../../api_reference/dataset/tabledoc/#tripper.dataset.tabledoc.TableDoc.parse_csv | ||
| [default JSON-LD context]: https://raw.githubusercontent.com/EMMC-ASBL/tripper/refs/heads/master/tripper/context/0.2/context.json | ||
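
The three kinds of keyword definitions described under "User-defined keywords" in the diff above (simple literal, literal with a specific datatype, and object property) can be illustrated with a small standard-library sketch. It is neither Tripper's implementation nor a complete JSON-LD processor. The helper names `expand_curie` and `resolve_keyword` are hypothetical, and the `xsd` entry is added to the context only so that the sketch is self-contained.

```python
# Simplified illustration only: not Tripper's implementation and not a
# complete JSON-LD processor.

# Custom context from the documentation, written as a Python dict.
# Assumption: the "xsd" prefix is added here to keep the sketch self-contained.
CONTEXT = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "myonto": "http://example.com/myonto#",
    "batchNumber": {"@id": "myonto:batchNumber", "@type": "xsd:integer"},
    "fromBatch": {"@id": "myonto:fromBatch", "@type": "@id"},
}


def expand_curie(context, value):
    """Expand "prefix:name" to a full IRI if the prefix is in the context."""
    prefix, sep, name = value.partition(":")
    namespace = context.get(prefix)
    # Only string-valued entries are namespace prefixes; values such as
    # "http://..." are already full IRIs and are returned unchanged.
    if sep and isinstance(namespace, str) and not name.startswith("//"):
        return namespace + name
    return value


def resolve_keyword(context, keyword):
    """Return (property IRI, type) for a keyword defined in the context.

    The type is None for a simple literal, "@id" for an object property
    and a datatype IRI for a literal with a specific datatype.
    """
    definition = context[keyword]
    if isinstance(definition, str):  # simple literal: "keyword": "prefix:name"
        return expand_curie(context, definition), None
    iri = expand_curie(context, definition["@id"])
    dtype = definition.get("@type")
    if dtype is not None and dtype != "@id":
        dtype = expand_curie(context, dtype)
    return iri, dtype
```

With this context, `resolve_keyword(CONTEXT, "batchNumber")` gives the property IRI `http://example.com/myonto#batchNumber` together with the datatype IRI for `xsd:integer`, while `resolve_keyword(CONTEXT, "fromBatch")` gives `http://example.com/myonto#fromBatch` together with `"@id"`, marking the value as an IRI rather than a literal.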