Merged
26 commits
8fbb5ab
Added allow_redefine argument
jesper-friis Nov 11, 2025
448a3d4
Reformatted
jesper-friis Nov 12, 2025
25f8a30
Added test for redefininf keywords and start of doc for ttl->keywords
francescalb Nov 12, 2025
edf51c3
Merge branch 'allow-keyword-redefinition' of github.com:EMMC-ASBL/tri…
francescalb Nov 12, 2025
dc36e04
strict to false in test
francescalb Nov 12, 2025
78e5e2e
Updated behaviour of redefining keywords
jesper-friis Nov 12, 2025
76e594e
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] Nov 12, 2025
b3baa68
Pre-commit changes
jesper-friis Nov 12, 2025
96cddfb
Merge branch 'allow-keyword-redefinition' of github.com:EMMC-ASBL/tri…
jesper-friis Nov 12, 2025
3d81884
Cleanup
jesper-friis Nov 13, 2025
c4111ff
cleanup
jesper-friis Nov 13, 2025
b344ebf
Merge branch 'allow-keyword-redefinition' of bifrost:~/prosjekter/EMM…
jesper-friis Nov 13, 2025
695c69c
Fixed implementation issue
jesper-friis Nov 14, 2025
fb78760
Adding missing test files
jesper-friis Nov 14, 2025
8fe8bb0
Added missing test file
jesper-friis Nov 14, 2025
39cf961
Apply suggestion from @jesper-friis
francescalb Nov 18, 2025
ea1baaf
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] Nov 18, 2025
f9c3f2c
Replaced some warnings for information to info-level logging.
jesper-friis Nov 18, 2025
64fd2df
Changed yet another warning to logger.info
jesper-friis Nov 18, 2025
f5e3071
Reintroduced RedefineKeywordWarning
jesper-friis Nov 18, 2025
4ed4e38
Added SkipRedefineKeywordWarning
jesper-friis Nov 18, 2025
db7ba09
Updated documentation
francescalb Nov 18, 2025
730650c
typos
francescalb Nov 18, 2025
14a3381
Update docs/datadoc/customisation.md
francescalb Nov 18, 2025
111b0c2
Merge branch 'master' into allow-keyword-redefinition
francescalb Nov 18, 2025
2acefb1
Finished example
francescalb Nov 18, 2025
536 changes: 0 additions & 536 deletions CHANGELOG.md

Large diffs are not rendered by default.

21 changes: 0 additions & 21 deletions LICENSE
Original file line number Diff line number Diff line change
@@ -1,21 +0,0 @@
MIT License

Copyright (c) 2022-2025 SINTEF

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
49 changes: 48 additions & 1 deletion docs/datadoc/customisation.md
@@ -81,9 +81,56 @@ In your ontology you may define `fromBatch` as an object property with IRI: http:
Here the special value "@id" for the "@type" means that the value of `fromBatch` must be an IRI.


Creating a context with keywords from an ontology
-------------------------------------------------
Creating a context with keywords manually can be tedious and prone to human mistakes.
It is therefore advisable to use a single source of truth, namely the ontology.


The context can be generated from a triplestore holding the ontology using the `Keywords` class:

```python
from tripper import Triplestore
from tripper.datadoc import get_keywords

ts = Triplestore('rdflib')

ts.parse(
'https://raw.githubusercontent.com/EMMC-ASBL/tripper/refs/heads/master/tests/ontologies/family.ttl',
format='turtle',
)

kw = get_keywords()  # create a Keywords instance populated with the default keywords (ddoc:datadoc)
# Before loading keywords from the triplestore, all namespaces must have a prefix.
# The family namespace has no prefix by default, so it must be added:
kw.add_prefix('fam', 'http://onto-ns.com/ontologies/examples/family#')


# We can now load the ontology into the keywords
kw.load_rdf(ts, redefine='skip')   # keywords that are already defined are skipped
# or
kw.load_rdf(ts, redefine='allow')  # keywords that are already defined are redefined

```


Note that there are a few considerations when generating a context from an ontology.
First of all, labels that coincide with predefined keywords must be handled with care.
The default behaviour is to raise an error if such a redefinition is attempted (`redefine="raise"`).
This choice has been made to ensure that redefining predefined keywords is a conscious decision.
In order to redefine an existing keyword, the `redefine` argument of the `load_rdf()` method must be set to `"allow"`.
A warning will then be emitted for each keyword that is redefined.
To generate keywords from an ontology without redefining existing keywords, set `redefine` to `"skip"`, in which case existing keywords are left unchanged and a warning is emitted for each new keyword that is skipped in favour of the existing definition.
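The dispatch between the three `redefine` modes can be pictured as a simple merge policy. The helper below is a plain-Python illustration only, not tripper's actual implementation; the function name `merge_keywords` is made up for this sketch. (Tripper itself signals these cases with its `RedefineKeywordWarning` and `SkipRedefineKeywordWarning` warning classes.)

```python
import warnings


def merge_keywords(existing, new, redefine="raise"):
    """Illustrative merge policy for keyword definitions.

    `existing` and `new` map keyword names to their definitions.
    This mirrors the documented behaviour of the `redefine`
    argument; it is NOT tripper's actual implementation.
    """
    if redefine not in ("raise", "skip", "allow"):
        raise ValueError(f"invalid redefine mode: {redefine}")
    merged = dict(existing)
    for name, definition in new.items():
        if name in merged:
            if redefine == "raise":
                raise KeyError(f"keyword already defined: {name}")
            if redefine == "skip":
                warnings.warn(f"skipping redefinition of {name}")
                continue
            warnings.warn(f"redefining {name}")  # redefine == "allow"
        merged[name] = definition
    return merged
```

With `redefine="skip"` the original definition survives; with `redefine="allow"` the new one wins, matching the behaviour exercised in `test_load_yaml()` below.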


Providing a custom context
--------------------------
Custom context can be provided for all the interfaces described in the section [Documenting a resource].
A custom context with defined keywords can be provided for all the interfaces described in the section [Documenting a resource].

### Python dict
Both for the single-resource and multi-resource dicts, you can add a `"@context"` key to the dict whose value is
40 changes: 18 additions & 22 deletions docs/datadoc/keywords.md

Large diffs are not rendered by default.

114 changes: 101 additions & 13 deletions tests/datadoc/test_keywords.py
@@ -59,6 +59,12 @@ def test_get_keywords():
assert kw5.data.theme == ["ddoc:datadoc", "ddoc:process"]
assert len(kw5.keywords) > len(kw1.keywords)

kw6 = get_keywords(
kw4, yamlfile=testdir / "input" / "custom_keywords.yaml"
)
assert kw4.data.theme == ["ddoc:datadoc", "ddoc:process"]
assert "batchNumber" in kw6


def test_dir():
"""Test `dir(keywords)`."""
@@ -78,6 +84,28 @@ def test_copy():
assert copy.theme == keywords.theme


def test_add():
"""Test add() method."""
from dataset_paths import indir # pylint: disable=import-error

from tripper.datadoc import get_keywords

kw = get_keywords(theme=None)
kw1 = kw.copy()
kw1.add("ddoc:datadoc")
assert kw1 == keywords

kw2 = kw.copy()
kw2.add(indir / "custom_keywords.yaml")
assert "distribution" in kw2
assert "batchNumber" in kw2

# Works, but requires that the tests are run from the root directory
# kw3 = kw.copy()
# kw3.add("./tests/input/custom_keywords.yaml")
# assert kw3 == kw2


def test_load_yaml():
"""Test load_yaml() method. Most of it is already tested via get_keywords().
Only a few additional tests are added here.
@@ -86,29 +114,51 @@

from tripper.datadoc.errors import ParseError

kw = keywords.copy()

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords0.yaml")
kw.load_yaml(indir / "invalid_keywords0.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords1.yaml")
kw.load_yaml(indir / "invalid_keywords1.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords2.yaml")
kw.load_yaml(indir / "invalid_keywords2.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords3.yaml")
kw.load_yaml(indir / "invalid_keywords3.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords4.yaml")
kw.load_yaml(indir / "invalid_keywords4.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords5.yaml")
kw.load_yaml(indir / "invalid_keywords5.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords6.yaml")
kw.load_yaml(indir / "invalid_keywords6.yaml")

with pytest.raises(ParseError):
keywords.load_yaml(indir / "invalid_keywords7.yaml")
kw.load_yaml(indir / "invalid_keywords7.yaml")

with pytest.raises(ParseError):
kw.load_yaml(indir / "invalid_keywords8.yaml")

with pytest.raises(ParseError):
kw.load_yaml(indir / "invalid_keywords9.yaml")

with pytest.raises(ParseError):
kw.load_yaml(indir / "invalid_keywords9.yaml", redefine="xxx")

# keywords are unchanged by failures
# assert kw == keywords

kw.load_yaml(indir / "invalid_keywords9.yaml", redefine="skip")
assert kw["title"].iri == "dcterms:title"

kw.load_yaml(indir / "invalid_keywords9.yaml", redefine="allow")
assert kw["title"].iri == "myonto:a"

kw.load_yaml(indir / "valid_keywords.yaml")


def test_save_yaml():
Expand Down Expand Up @@ -144,7 +194,7 @@ def test_load_table():
assert kw.keywords.ref == {
"iri": "ex:ref",
"type": "owl:AnnotationProperty",
"domain": ["dcat:Resource", "rdfs:Resource"], # is this intended?
"domain": ["dcat:Resource", "rdfs:Resource"],
"range": "rdfs:Literal",
"datatype": "rdf:langString",
"conformance": "optional",
@@ -174,14 +224,14 @@ def test_save_table():
with open(outdir / "keywords.csv", "rt", encoding="utf-8") as f:
header = f.readline().strip().split(",")
row1 = f.readline().strip().split(",")
assert len(header) == 11
assert len(header) == 9
facit = [
("@id", "dcterms:accessRights"),
("@type", "owl:ObjectProperty"),
("label", "accessRights"),
("domain", "dcat:Resource"),
("domain", ""),
("domain", ""),
# ("domain", ""),
# ("domain", ""),
("range", "dcterms:RightsStatement"),
("conformance", "ddoc:optional"),
(
@@ -389,9 +439,10 @@ def test_load2():
ts = Triplestore("rdflib")
ts.parse(ontodir / "family.ttl")

# Create an empty Keywords object and load the ontology
kw = get_keywords(theme=None)
assert kw.keywords == AttrDict()
kw.load_rdf(ts)
kw.load_rdf(ts, strict=False, redefine="allow")

assert set(kw.keywordnames()) == {
"hasAge",
@@ -421,6 +472,43 @@ def test_load2():
"name": "hasName",
}

ts = Triplestore("rdflib")
ts.parse(ontodir / "family.ttl")

# Create a new Keywords object with
# default keywords and load from the triplestore
kw2 = get_keywords()
kw2.load_rdf(ts, redefine="allow")

# Ensure that the specified keywords are in kw2
assert not {
"hasAge",
"hasWeight",
"hasSkill",
"hasChild",
"hasName",
}.difference(kw2.keywordnames())
assert not {
"Person",
"Parent",
"Child",
"Skill",
"Resource",
}.difference(kw2.classnames())
d = kw2["hasAge"]
assert d.iri == "fam:hasAge"
assert d.range == "rdfs:Literal"
assert d.datatype == "xsd:double"
assert d.unit == "year"
assert kw2["hasName"] == { # vcard:hasName is overwritten
"iri": "fam:hasName",
"type": "owl:AnnotationProperty",
"domain": ["dcat:Resource", "rdfs:Resource"],
"range": "rdfs:Literal",
"comment": "Name.",
"name": "hasName",
}


def test_get_prefixes():
"""Test get_prefixes() method."""
1 change: 1 addition & 0 deletions tests/input/invalid_keywords0.yaml
@@ -1,3 +1,4 @@
# Invalid theme
---
basedOn: ["ddoc:invalid", ]

1 change: 1 addition & 0 deletions tests/input/invalid_keywords1.yaml
@@ -1,3 +1,4 @@
# Invalid key in keyword description
---
basedOn: ["ddoc:datadoc", ]

3 changes: 2 additions & 1 deletion tests/input/invalid_keywords2.yaml
@@ -1,3 +1,4 @@
# Invalid conformance value
---
basedOn: ["ddoc:datadoc", ]

@@ -14,4 +15,4 @@ resources:
a:
iri: myonto:a
range: myonto:B
conformance: Some invalid key.
conformance: Some invalid conformance value.
1 change: 1 addition & 0 deletions tests/input/invalid_keywords3.yaml
@@ -1,3 +1,4 @@
# Invalid description
---
basedOn: ["ddoc:datadoc", ]

2 changes: 1 addition & 1 deletion tests/input/invalid_keywords4.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
# Missing keyword IRI
---
basedOn: "ddoc:datadoc"

@@ -12,5 +13,4 @@ resources:
subClassOf: dcat:Resource
keywords:
description:
iri: myonto:description
range: myonto:B
1 change: 1 addition & 0 deletions tests/input/invalid_keywords5.yaml
@@ -1,3 +1,4 @@
# Redefine existing prefix
---
basedOn: "ddoc:datadoc"

1 change: 1 addition & 0 deletions tests/input/invalid_keywords6.yaml
@@ -1,3 +1,4 @@
# Missing class IRI
---
basedOn: "ddoc:datadoc"

1 change: 1 addition & 0 deletions tests/input/invalid_keywords7.yaml
@@ -1,3 +1,4 @@
# Invalid class keyword
---
basedOn: "ddoc:datadoc"

17 changes: 17 additions & 0 deletions tests/input/invalid_keywords8.yaml
@@ -0,0 +1,17 @@
# Redefine existing concept
---
basedOn: "ddoc:datadoc"

prefixes:
kb: "http://example.com/kb#"
myonto: "http://example.com/myonto#"


resources:
A:
iri: myonto:A
subClassOf: dcat:Resource
keywords:
description:
iri: dcat:Dataset
range: myonto:B
17 changes: 17 additions & 0 deletions tests/input/invalid_keywords9.yaml
@@ -0,0 +1,17 @@
# Redefine existing keyword
---
basedOn: "ddoc:datadoc"

prefixes:
kb: "http://example.com/kb#"
myonto: "http://example.com/myonto#"


resources:
A:
iri: myonto:A
subClassOf: dcat:Resource
keywords:
title:
iri: myonto:a
range: myonto:B
17 changes: 17 additions & 0 deletions tests/input/valid_keywords.yaml
@@ -0,0 +1,17 @@
# Valid
---
basedOn: "ddoc:datadoc"

prefixes:
kb: "http://example.com/kb#"
myonto: "http://example.com/myonto#"


resources:
A:
iri: myonto:A
subClassOf: dcat:Resource
keywords:
newrelation:
iri: myonto:newrelation
range: myonto:B
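Each of the `invalid_keywords*.yaml` files above violates one rule of the keywords schema. A minimal sketch of a few of those checks (illustrative only; the helper name `validate_keyword` and the assumed set of conformance values are made up for this sketch, and tripper actually raises `ParseError` from `load_yaml()` rather than using a function like this):

```python
def validate_keyword(name, definition, existing=()):
    """Illustrative schema checks mirroring some invalid_keywords*.yaml cases.

    NOTE: hypothetical helper, not tripper's actual validation logic.
    """
    # invalid_keywords4.yaml: missing keyword IRI
    if "iri" not in definition:
        raise ValueError(f"{name}: missing keyword IRI")
    # invalid_keywords9.yaml: redefining an existing keyword
    if name in existing:
        raise ValueError(f"{name}: redefines an existing keyword")
    # invalid_keywords2.yaml: invalid conformance value
    # (assumed valid values; check tripper's docs for the real set)
    if definition.get("conformance", "optional") not in (
        "mandatory", "recommended", "optional"
    ):
        raise ValueError(f"{name}: invalid conformance value")
```

For example, the `newrelation` entry of `valid_keywords.yaml` passes these checks, while the `title` entry of `invalid_keywords9.yaml` would fail the redefinition check against the default keywords.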