Changes from all commits

22 commits
ea2c36b
Started solving issues with keywords
francescalb Apr 3, 2025
351e0bd
[pre-commit.ci] auto fixes from pre-commit hooks
pre-commit-ci[bot] Apr 3, 2025
9760246
Merge branch 'master' into keyword_bugs
francescalb Apr 3, 2025
e9ba8fc
Merge branch 'master' into keyword_bugs
jesper-friis Apr 3, 2025
dc1e4f6
Merge branch 'keyword_bugs' of github.com:EMMC-ASBL/tripper into keyw…
jesper-friis Apr 3, 2025
683c012
Allow to use the same keyword in several resource types
jesper-friis Apr 3, 2025
c0b744b
Defined missing prefix - needed by fuseki and GraphDB
jesper-friis Apr 3, 2025
54ea327
Added iana prefix
francescalb Apr 3, 2025
6d71895
Added --debug option to command-line tool
jesper-friis Apr 3, 2025
691b101
Updated clitool and added custom CSV sniffer
jesper-friis Apr 4, 2025
78c827e
Added test for new keyword in recursive_update()
jesper-friis Apr 4, 2025
416d391
Added test for csvsniff()
jesper-friis Apr 4, 2025
2f5f8b4
Added more tests for recursive_update()
jesper-friis Apr 4, 2025
77d57aa
Updated docstring
jesper-friis Apr 4, 2025
e67d07f
Updated test for recursive_update()
jesper-friis Apr 4, 2025
f9cb794
Merge branch 'master' into keyword_bugs
francescalb Apr 22, 2025
e8f655b
try safety directly
francescalb Apr 22, 2025
eab590e
Typo in ci-tests
francescalb Apr 22, 2025
8f9e18d
typo
francescalb Apr 22, 2025
71b0b9d
try adding .safety-config
francescalb Apr 22, 2025
ca055b7
removed safety-config.yml
francescalb Apr 22, 2025
eca957c
Removed safety info that is no longer relevant
francescalb Apr 22, 2025
27 changes: 11 additions & 16 deletions .github/workflows/ci_tests.yml
@@ -29,22 +29,7 @@ jobs:
--rcfile=pyproject.toml --disable=import-outside-toplevel,redefined-outer-name tests

# safety-specific settings
run_safety: true
# 48547: RDFLib vulnerability: https://pyup.io/vulnerabilities/PVE-2022-48547/48547/
# 44715-44717: NumPy vulnerabilities:
# https://pyup.io/vulnerabilities/CVE-2021-41495/44715/
# https://pyup.io/vulnerabilities/CVE-2021-41496/44716/
# https://pyup.io/vulnerabilities/CVE-2021-34141/44717/
# 70612: Jinja2 vulnerability. Only used as subdependency for mkdocs++ in tripper.
# https://data.safetycli.com/v/70612/97c/
# https://data.safetycli.com/v/72715/97c/ # update to mkdocs>=9.5.32
safety_options: |
--ignore=48547
--ignore=44715
--ignore=44716
--ignore=44717
--ignore=70612
--ignore=72715
run_safety: false

## Build package
run_build_package: true
@@ -63,6 +48,16 @@ jobs:
update_docs_landing_page: true
package_dirs: tripper

safety:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@main
- name: Run Safety CLI to check for vulnerabilities
uses: pyupio/safety-action@v1
with:
api-key: ${{ secrets.SAFETY_API_KEY }}
args: --detailed-output # To always see detailed output from this action

pytest:
runs-on: ubuntu-latest

10 changes: 9 additions & 1 deletion docs/datadoc/keywords.md
@@ -1,6 +1,6 @@
<!-- Do not edit! This file is generated with Tripper. Edit the keywords.yaml file instead. -->

# Keywords for default
# Keywords for domain: default
The tables below list the keywords for the domain default.

The meaning of the columns are as follows:
@@ -169,6 +169,7 @@ A collection of operations that provides access to one or more datasets or data
| [endpointURL] | [rdfs:Literal]<br>(xsd:anyURI) | mandatory | The root location or primary endpoint of the service (an IRI). | |
| [endpointDescription] | [rdfs:Resource] | recommended | A description of the services available via the end-points, including their operations, parameters etc. | |
| [servesDataset] | [dcat:Dataset] | recommended | This property refers to a collection of data that this data service can distribute. | |
| [parser] | [oteio:Parser] | | A parser that can parse the distribution. | |


## Properties on [DatasetSeries]
@@ -257,6 +258,10 @@ A standard or other specification to which a resource conforms.
A media type, e.g. the format of a computer file.


## Properties on [GenericResource]
A generic resource.




[Resource]: http://www.w3.org/ns/dcat#Resource
@@ -431,6 +436,8 @@ A media type, e.g. the format of a computer file.
[rdfs:Literal]: http://www.w3.org/2000/01/rdf-schema#Literal
[servesDataset]: http://www.w3.org/ns/dcat#servesDataset
[dcat:Dataset]: http://www.w3.org/ns/dcat#Dataset
[parser]: https://w3id.org/emmo/domain/oteio#parser
[oteio:Parser]: https://w3id.org/emmo/domain/oteio#Parser
[dcat:Dataset]: http://www.w3.org/ns/dcat#Dataset
[DatasetSeries]: http://www.w3.org/ns/dcat#DatasetSeries
[Geometry]: http://www.w3.org/ns/locn#Geometry
@@ -506,3 +513,4 @@ A media type, e.g. the format of a computer file.
[LegalResource]: http://data.europa.eu/eli/ontology#LegalResource
[Standard]: http://purl.org/dc/terms/Standard
[MediaType]: http://purl.org/dc/terms/MediaType
[GenericResource]: http://www.w3.org/2000/01/rdf-schema#Resource
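
Aside: the new [parser] keyword documented above maps to the oteio:parser property with range oteio:Parser. As a purely illustrative sketch (the `ex:` resources below are hypothetical and not part of this PR), the underlying triple can be written with tripper like this:

```python
# Illustrative only: hypothetical ex: resources showing the triple
# behind the new "parser" keyword (property oteio:parser, range
# oteio:Parser).
from tripper import Triplestore

ts = Triplestore(backend="rdflib")
OTEIO = ts.bind("oteio", "https://w3id.org/emmo/domain/oteio#")
EX = ts.bind("ex", "http://example.com/ex#")

# State that the (hypothetical) distribution has a parser.
ts.add_triples([(EX.mydistribution, OTEIO.parser, EX.mycsvparser)])
```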
1 change: 1 addition & 0 deletions docs/datadoc/prefixes.md
@@ -25,6 +25,7 @@ See [User-defined prefixes] for how to extend this list with additional namespac
| emmo | https://w3id.org/emmo# |
| oteio | https://w3id.org/emmo/domain/oteio# |
| chameo | https://w3id.org/emmo/domain/characterisation-methodology/chameo# |
| iana | https://www.iana.org/assignments/media-types/ |


[default JSON-LD context]: https://raw.githubusercontent.com/EMMC-ASBL/tripper/refs/heads/master/tripper/context/0.2/context.json
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -100,8 +100,8 @@ nav:
- Units and quantities: units/units.md
- Session: session.md
- ... | api_reference/**
- Known issues: known-issues.md
- For developers: developers.md
- Known issues: known-issues.md
- Changelog: CHANGELOG.md
- License: LICENSE.md

8 changes: 8 additions & 0 deletions tests/datadoc/test_datadoc_cli.py
@@ -18,6 +18,7 @@ def test_delete():
iri = "semdata:SEM_cement_batch2/77600-23-001/77600-23-001_5kV_400x_m001"

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"delete",
@@ -27,6 +28,7 @@

# Ensure that KB doesn't contain the removed dataset
findcmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"find",
@@ -44,6 +46,7 @@ def test_delete_regex():
iri_regexp = "https://he-matchmaker.eu/data/sem/.*"

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"delete",
@@ -53,6 +56,7 @@

# Ensure that KB doesn't contain the removed dataset
findcmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"find",
@@ -68,6 +72,7 @@ def test_add():
from dataset_paths import indir, outdir # pylint: disable=import-error

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"add",
@@ -103,6 +108,7 @@ def test_find():
from dataset_paths import indir # pylint: disable=import-error

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"find",
@@ -129,6 +135,7 @@ def test_find_json():
from dataset_paths import indir # pylint: disable=import-error

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"find",
@@ -155,6 +162,7 @@ def test_fetch():
)

cmd = [
"--debug",
"--triplestore=FusekiTest",
f"--config={indir/'session.yaml'}",
"fetch",
20 changes: 20 additions & 0 deletions tests/datadoc/test_tabledoc.py
@@ -218,3 +218,23 @@ def test_csv_duplicated_columns():
"distribution.downloadURL",
]
td2.write_csv(outdir / "tem.csv", prefixes=prefixes)


def test_csvsniff():
"""Test csvsniff()."""
pytest.importorskip("yaml")
from tripper.datadoc.tabledoc import csvsniff

lines = [
"A,B,C,D",
"a,'b,bb','c1;c2;c3;c4',d",
]
dialect = csvsniff("\r\n".join(lines))
assert dialect.delimiter == ","
assert dialect.lineterminator == "\r\n"
assert dialect.quotechar == "'"

dialect = csvsniff("\n".join(lines))
assert dialect.delimiter == ","
assert dialect.lineterminator == "\n"
assert dialect.quotechar == "'"
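
Aside: csvsniff() is added by this PR, but its implementation is not part of this diff. A rough standalone sketch of the technique (an assumption, not tripper's actual code) builds on Python's csv.Sniffer, which detects the delimiter and quote character but always reports "\r\n" as line terminator, so the terminator has to be inferred separately, which is what the asserts above check:

```python
# Hypothetical sketch, not the tripper implementation.
import csv


def sniff_dialect(sample: str):
    """Guess a CSV dialect from the text in `sample`."""
    dialect = csv.Sniffer().sniff(sample)
    # csv.Sniffer hardcodes "\r\n"; infer the real line terminator.
    dialect.lineterminator = "\r\n" if "\r\n" in sample else "\n"
    return dialect


sample = "A,B,C,D\na,'b,bb','c1;c2;c3;c4',d"
dialect = sniff_dialect(sample)
assert dialect.delimiter == ","
assert dialect.quotechar == "'"
assert dialect.lineterminator == "\n"
```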
31 changes: 31 additions & 0 deletions tests/test_utils.py
@@ -37,15 +37,46 @@ def test_recursive_update():
d = {"a": []}
recursive_update(d, other)
assert d == other
assert isinstance(d["a"][1], dict)

d = {"a": []}
recursive_update(d, other, cls=AttrDict)
assert d == other
assert isinstance(d["a"][1], AttrDict)

d = AttrDict()
recursive_update(d, other)
assert d == other
assert isinstance(d.a[1], AttrDict)

d = {"d": 1}
recursive_update(d, other)
assert d == {"a": [1, {"b": 2, "c": [3, 4]}, 5], "d": [1, 6]}

d = {"d": 1}
recursive_update(d, other, append=False)
assert d == {"a": [1, {"b": 2, "c": [3, 4]}, 5], "d": 6}

d = {"a": {"b": 2}}
recursive_update(d, {"a": [1, {"b": 2}]})
assert d == {"a": [1, {"b": 2}]}

d = {"a": {"b": 2}}
recursive_update(d, {"a": [1, {"b": 2}]}, append=False)
assert d == {"a": [1, {"b": 2}]}

d = {"a": [1, {"b": 2}]}
recursive_update(d, {"a": [1, {"b": 2}]})
assert d == {"a": [1, {"b": 2}]}

d = {"a": {"b": 2}}
recursive_update(d, {"a": [1, {"b": 3}]})
assert d == {"a": [1, {"b": [2, 3]}]}

d = {"a": {"b": 2}}
recursive_update(d, {"a": [1, {"b": 3}]}, append=False)
assert d == {"a": [1, {"b": 3}]}


def test_openfile():
"""Test openfile()."""
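Aside: the new tests above pin down how recursive_update() treats conflicting values via its append argument. A deliberately simplified sketch of just that flag (not tripper's recursive_update(), which also merges nested dicts and lists) for flat dicts:

```python
# Simplified illustration of the append semantics only; hypothetical
# helper, not tripper.utils.recursive_update().
def simple_update(d: dict, other: dict, append: bool = True) -> None:
    for key, value in other.items():
        if key in d and d[key] != value and append:
            # Accumulate conflicting values into a list.
            old = d[key] if isinstance(d[key], list) else [d[key]]
            d[key] = old + [value]
        else:
            # New key, equal value, or append=False: just (over)write.
            d[key] = value


d = {"d": 1}
simple_update(d, {"d": 6})
assert d == {"d": [1, 6]}

d = {"d": 1}
simple_update(d, {"d": 6}, append=False)
assert d == {"d": 6}
```
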
4 changes: 3 additions & 1 deletion tripper/context/0.3/context.json
@@ -22,6 +22,7 @@
"emmo": "https://w3id.org/emmo#",
"oteio": "https://w3id.org/emmo/domain/oteio#",
"chameo": "https://w3id.org/emmo/domain/characterisation-methodology/chameo#",
"iana": "https://www.iana.org/assignments/media-types/",
"accessRights": {
"@id": "dcterms:accessRights",
"@type": "@id"
@@ -353,6 +354,7 @@
"RightsStatement": "dcterms:RightsStatement",
"LegalResource": "eli:LegalResource",
"Standard": "dcterms:Standard",
"MediaType": "dcterms:MediaType"
"MediaType": "dcterms:MediaType",
"GenericResource": "rdfs:Resource"
}
}
10 changes: 10 additions & 0 deletions tripper/context/0.3/keywords.yaml
@@ -25,6 +25,7 @@ prefixes:
emmo: "https://w3id.org/emmo#"
oteio: "https://w3id.org/emmo/domain/oteio#"
chameo: "https://w3id.org/emmo/domain/characterisation-methodology/chameo#"
iana: "https://www.iana.org/assignments/media-types/"


resources:
@@ -577,6 +578,11 @@ resources:
conformance: recommended
description: This property refers to a collection of data that this data service can distribute.

parser:
iri: oteio:parser
range: oteio:Parser
description: A parser that can parse the distribution.


DatasetSeries:
iri: dcat:DatasetSeries
@@ -742,3 +748,7 @@ resources:
iri: dcterms:MediaType
description: A media type, e.g. the format of a computer file.
usageNote: Media type instances follow the [IANA](https://www.w3.org/TR/vocab-dcat-3/#bib-iana-media-types) vocabulary using the <http://www.iana.org/assignments/media-types/> namespace. For example, the IRI of the media type `text/turtle` is <http://www.iana.org/assignments/media-types/text/turtle>.

GenericResource:
iri: rdfs:Resource
description: A generic resource.
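
Aside: together with the iana prefix added in this PR, the MediaType usage note above means a media-type IRI is simply the IANA namespace followed by the media type. A minimal illustration (a hypothetical helper, not tripper's API):

```python
# Hypothetical helper, not part of tripper: expand an "iana:" CURIE
# with the prefix defined above.
PREFIXES = {"iana": "https://www.iana.org/assignments/media-types/"}


def expand_curie(curie: str) -> str:
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


assert expand_curie("iana:text/turtle") == (
    "https://www.iana.org/assignments/media-types/text/turtle"
)
```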
33 changes: 20 additions & 13 deletions tripper/datadoc/clitool.py
@@ -31,8 +31,8 @@
save_datadoc(ts, infile)
elif fmt in ("csv",):
kw = {}
if args.csv_options:
for token in args.csv_options:
if args.csv_option:
for token in args.csv_option:

Codecov warning: added line #L35 in tripper/datadoc/clitool.py was not covered by tests.
option, value = token.split("=", 1)
kw[option] = value
td = TableDoc.parse_csv(
@@ -131,6 +131,8 @@

def maincommand(argv=None):
"""Main command."""
# pylint: disable=too-many-statements

parser = argparse.ArgumentParser(
description=(
"Tool for data documentation.\n\n"
@@ -164,13 +166,13 @@
),
)
parser_add.add_argument(
"--csv-options",
action="extend",
nargs="+",
"--csv-option",
action="append",
metavar="OPTION=VALUE",
help=(
"Options describing the CSV dialect for --input-format=csv. "
"Common options are 'dialect', 'delimiter' and 'quotechar'."
"Common options are 'dialect', 'delimiter' and 'quotechar'. "
"This option may be provided multiple times."
),
)
parser_add.add_argument(
@@ -291,6 +293,9 @@
"-c",
help="Session configuration file.",
)
parser.add_argument(
"--debug", action="store_true", help="Show Python traceback on error."
)
parser.add_argument(
"--triplestore",
"-t",
@@ -367,17 +372,19 @@
ts.bind(prefix, ns)

# Call subcommand handler
return args.func(ts, args)
try:
return args.func(ts, args)
except Exception as exc: # pylint: disable=broad-exception-caught
if args.debug:
raise
print(f"{exc.__class__.__name__}: {exc}")
return exc

Codecov warning: added lines #L377-L381 in tripper/datadoc/clitool.py were not covered by tests.


def main(argv=None):
"""Main function."""
try:
maincommand(argv)
except Exception as exc: # pylint: disable=broad-exception-caught
print(exc)
return 1
return 0
retval = maincommand(argv)
return 1 if isinstance(retval, Exception) else 0

Codecov warning: added lines #L386-L387 in tripper/datadoc/clitool.py were not covered by tests.


if __name__ == "__main__":
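
Aside: the --csv-option flag changed above (from action="extend" with nargs="+" to action="append") follows the common argparse pattern where each occurrence of the flag contributes one OPTION=VALUE token that is later split into a keyword dict, as done in the csv branch near the top of this file. A small standalone illustration (hypothetical, not the clitool code itself):

```python
# Standalone illustration of the --csv-option argparse pattern;
# hypothetical example, not the tripper CLI.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--csv-option", action="append", metavar="OPTION=VALUE")
args = parser.parse_args(
    ["--csv-option", "delimiter=;", "--csv-option", "quotechar='"]
)

kw = {}
for token in args.csv_option or []:
    option, value = token.split("=", 1)
    kw[option] = value

assert kw == {"delimiter": ";", "quotechar": "'"}
```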