Commit 5fdf78b

Merge branch 'download-capabilities' into nextcloudclient
2 parents a56f01d + faf7f65 commit 5fdf78b

8 files changed, +444 -93 lines changed

Dockerfile

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+FROM python:3.10-slim
+
+WORKDIR /data
+
+COPY . .
+
+# Install dependencies
+RUN pip install .
+
+# Use ENTRYPOINT for the CLI
+ENTRYPOINT ["databusclient"]
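
Note on the Dockerfile: because the image sets `ENTRYPOINT ["databusclient"]` in exec form, any arguments placed after the image name on `docker run` are passed to the CLI. The following is a minimal Python sketch of assembling such a command, not part of this commit. The helper name `docker_command` is hypothetical; the image name and the `/data` mount point are taken from the README changes in this commit.

```python
import os
import shlex
from typing import List

# Image name as given in the README changes of this commit.
IMAGE = "dbpedia/databus-python-client"


def docker_command(cli_args: List[str], workdir: str) -> List[str]:
    """Build a `docker run` argv that mounts workdir at /data (the image's WORKDIR).

    Hypothetical helper for illustration; everything after the image name
    is appended to the ENTRYPOINT, i.e. handed to `databusclient`.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/data",
        IMAGE,
        *cli_args,
    ]


cmd = docker_command(
    ["download", "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01"],
    os.getcwd(),
)
# Print a shell-safe rendering of the command instead of executing it.
print(shlex.join(cmd))
```

Passing the result to `subprocess.run(cmd, check=True)` would execute it, assuming Docker is installed and the image can be pulled.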

README.md

Lines changed: 101 additions & 11 deletions
@@ -65,15 +65,18 @@ Options:

 Commands:
 deploy
-downoad
+download
 ```
+
+## Docker Image Usage
+
+A docker image is available at [dbpedia/databus-python-client](https://hub.docker.com/r/dbpedia/databus-python-client). See [download section](#usage-of-docker-image) for details.
+
 ### Deploy command
 ```
 databusclient deploy --help
 ```
 ```
-
-
 Usage: databusclient deploy [OPTIONS] DISTRIBUTIONS...

 Arguments:
@@ -82,23 +85,23 @@ Arguments:
 content variants of a distribution, fileExt and Compression can be set, if not they are inferred from the path [required]

 Options:
---versionid TEXT target databus version/dataset identifier of the form <h
+--version-id TEXT Target databus version/dataset identifier of the form <h
 ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
 RSION> [required]
---title TEXT dataset title [required]
---abstract TEXT dataset abstract max 200 chars [required]
---description TEXT dataset description [required]
---license TEXT license (see dalicc.net) [required]
---apikey TEXT apikey [required]
+--title TEXT Dataset title [required]
+--abstract TEXT Dataset abstract max 200 chars [required]
+--description TEXT Dataset description [required]
+--license TEXT License (see dalicc.net) [required]
+--apikey TEXT API key [required]
 --help Show this message and exit.
 ```
 Examples of using deploy command
 ```
-databusclient deploy --versionid https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
+databusclient deploy --version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
 ```

 ```
-databusclient deploy --versionid https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
+databusclient deploy --version-id https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
 ```

 A few more notes for CLI usage:
@@ -107,6 +110,93 @@ A few more notes for CLI usage:
 * For complete inferred: Just use the URL with `https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml`
 * If other parameters are used, you need to leave them empty like `https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml||yml|7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653:367116`

+### Download command
+```
+databusclient download --help
+```
+
+```
+Usage: databusclient download [OPTIONS] DATABUSURIS...
+
+Arguments:
+DATABUSURIS... databus uris to download from https://databus.dbpedia.org,
+or a query statement that returns databus uris from https://databus.dbpedia.org/sparql
+to be downloaded [required]
+
+Download datasets from databus, optionally using vault access if vault
+options are provided.
+
+Options:
+--localdir TEXT Local databus folder (if not given, databus folder
+structure is created in current working directory)
+--databus TEXT Databus URL (if not given, inferred from databusuri, e.g.
+https://databus.dbpedia.org/sparql)
+--token TEXT Path to Vault refresh token file
+--authurl TEXT Keycloak token endpoint URL [default:
+https://auth.dbpedia.org/realms/dbpedia/protocol/openid-
+connect/token]
+--clientid TEXT Client ID for token exchange [default: vault-token-
+exchange]
+--help Show this message and exit.
+```
+
+Examples of using download command
+
+**File**: download of a single file
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
+```
+
+**Version**: download of all files of a specific version
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
+```
+
+**Artifact**: download of all files with latest version of an artifact
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals
+```
+
+**Group**: download of all files with latest version of all artifacts of a group
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings
+```
+
+If no `--localdir` is provided, the current working directory is used as base directory. The downloaded files will be stored in the working directory in a folder structure according to the databus structure, i.e. `./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/`.
+
+**Collection**: download of all files within a collection
+```
+databusclient download https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12
+```
+
+**Query**: download of all files returned by a query (sparql endpoint must be provided with `--databus`)
+```
+databusclient download 'PREFIX dcat: <http://www.w3.org/ns/dcat#> SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10' --databus https://databus.dbpedia.org/sparql
+```
+
+#### Authentication with vault
+
+For downloading files from the vault, you need to provide a vault token. See [getting-the-access-refresh-token](https://github.com/dbpedia/databus-vault-access?tab=readme-ov-file#step-1-getting-the-access-refresh-token) for details. You can come back here once you have a `vault-token.dat` file. To use it, just provide the path to the file with `--token /path/to/vault-token.dat`.
+
+Example:
+```
+databusclient download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23 --token vault-token.dat
+```
+
+If vault authentication is required for downloading a file, the client will use the token. If no vault authentication is required, the token will not be used.
+
+#### Usage of docker image
+
+A docker image is available at [dbpedia/databus-python-client](https://hub.docker.com/r/dbpedia/databus-python-client). You can use it like this:
+
+```
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
+```
+If using vault authentication, make sure the token file is available in the container, e.g. by placing it in the current working directory.
+```
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23/fusion_props=all_subjectns=commons-wikimedia-org_vocab=all.ttl.gz --token vault-token.dat
+```
+
 ## Module Usage

 ### Step 1: Create lists of distributions for the dataset
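
Note on the README deploy examples: each distribution argument is a download URL, optionally followed by `|` and the content variants as `key=value` pairs joined with `_` (as in `...swagger.yml|type=swagger`). The following is a minimal Python sketch of building that argument. The helper `distribution_arg` is hypothetical and not part of this commit, and it covers only the URL and content-variant fields; the optional file extension, compression and `sha256sum:contentlength` fields are deliberately left out.

```python
from typing import Dict, Optional


def distribution_arg(url: str, content_variants: Optional[Dict[str, str]] = None) -> str:
    """Return a deploy DISTRIBUTIONS argument for one file.

    Hypothetical helper: with no content variants the bare URL is returned,
    which lets the client infer everything else from the path.
    """
    if not content_variants:
        return url
    # Content variants are key=value pairs separated by underscores.
    cv = "_".join(f"{key}={value}" for key, value in content_variants.items())
    return f"{url}|{cv}"


swagger = "https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml"
print(distribution_arg(swagger))
print(distribution_arg(swagger, {"type": "swagger"}))
```

Remember to quote the result on the shell, as the README examples do, because `|` is otherwise interpreted as a pipe.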

databusclient/cli.py

Lines changed: 50 additions & 32 deletions
@@ -1,43 +1,61 @@
 #!/usr/bin/env python3
-import typer
+import click
 from typing import List
 from databusclient import client

-app = typer.Typer()
+
+@click.group()
+def app():
+    """Databus Client CLI"""
+    pass


 @app.command()
-def deploy(
-    version_id: str = typer.Option(
-        ...,
-        help="target databus version/dataset identifier of the form "
-        "<https://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VERSION>",
-    ),
-    title: str = typer.Option(..., help="dataset title"),
-    abstract: str = typer.Option(..., help="dataset abstract max 200 chars"),
-    description: str = typer.Option(..., help="dataset description"),
-    license_uri: str = typer.Option(..., help="license (see dalicc.net)"),
-    apikey: str = typer.Option(..., help="apikey"),
-    distributions: List[str] = typer.Argument(
-        ...,
-        help="distributions in the form of List[URL|CV|fileext|compression|sha256sum:contentlength] where URL is the "
-        "download URL and CV the "
-        "key=value pairs (_ separated) content variants of a distribution. filext and compression are optional "
-        "and if left out inferred from the path. If the sha256sum:contentlength part is left out it will be "
-        "calcuted by downloading the file.",
-    ),
-):
-    typer.echo(version_id)
-    dataid = client.create_dataset(
-        version_id, title, abstract, description, license_uri, distributions
-    )
+@click.option(
+    "--version-id", "version_id",
+    required=True,
+    help="Target databus version/dataset identifier of the form "
+    "<https://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VERSION>",
+)
+@click.option("--title", required=True, help="Dataset title")
+@click.option("--abstract", required=True, help="Dataset abstract max 200 chars")
+@click.option("--description", required=True, help="Dataset description")
+@click.option("--license", "license_url", required=True, help="License (see dalicc.net)")
+@click.option("--apikey", required=True, help="API key")
+@click.argument(
+    "distributions",
+    nargs=-1,
+    required=True,
+)
+def deploy(version_id, title, abstract, description, license_url, apikey, distributions: List[str]):
+    """
+    Deploy a dataset version with the provided metadata and distributions.
+    """
+    click.echo(f"Deploying dataset version: {version_id}")
+    dataid = client.create_dataset(version_id, title, abstract, description, license_url, distributions)
     client.deploy(dataid=dataid, api_key=apikey)


 @app.command()
-def download(
-    localDir: str = typer.Option(..., help="local databus folder"),
-    databus: str = typer.Option(..., help="databus URL"),
-    databusuris: List[str] = typer.Argument(...,help="any kind of these: databus identifier, databus collection identifier, query file")
-):
-    client.download(localDir=localDir,endpoint=databus,databusURIs=databusuris)
+@click.argument("databusuris", nargs=-1, required=True)
+@click.option("--localdir", help="Local databus folder (if not given, databus folder structure is created in current working directory)")
+@click.option("--databus", help="Databus URL (if not given, inferred from databusuri, e.g. https://databus.dbpedia.org/sparql)")
+@click.option("--token", help="Path to Vault refresh token file")
+@click.option("--authurl", default="https://auth.dbpedia.org/realms/dbpedia/protocol/openid-connect/token", show_default=True, help="Keycloak token endpoint URL")
+@click.option("--clientid", default="vault-token-exchange", show_default=True, help="Client ID for token exchange")
+def download(databusuris: List[str], localdir, databus, token, authurl, clientid):
+    """
+    Download datasets from databus, optionally using vault access if vault options are provided.
+    """
+    client.download(
+        localDir=localdir,
+        endpoint=databus,
+        databusURIs=databusuris,
+        token=token,
+        auth_url=authurl,
+        client_id=clientid,
+    )
+
+
+if __name__ == "__main__":
+    app()
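
Note on `download`: the command accepts file, version, artifact, group and collection URIs as well as a SPARQL query string, and the README states that files land under `$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/`. The sketch below shows one way those argument kinds could be told apart by counting path segments, and how the local folder could be derived. It is an illustration under assumptions, not the logic of `client.download`, whose source is not part of this diff; `classify_target` and `version_dir` are hypothetical names.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse


def _segments(uri: str) -> List[str]:
    """Non-empty path segments of a databus URI."""
    return [part for part in urlparse(uri).path.split("/") if part]


def classify_target(target: str) -> str:
    """Guess the kind of a download argument (assumption-based heuristic)."""
    if not target.startswith(("http://", "https://")):
        # Anything that is not a URL is treated as a SPARQL query string.
        return "query"
    parts = _segments(target)
    if len(parts) == 3 and parts[1] == "collections":
        return "collection"
    # account/group[/artifact[/version[/file]]]
    kinds = {2: "group", 3: "artifact", 4: "version", 5: "file"}
    return kinds.get(len(parts), "unknown")


def version_dir(base: str, uri: str) -> PurePosixPath:
    """Folder base/$ACCOUNT/$GROUP/$ARTIFACT/$VERSION for a version or file URI."""
    account, group, artifact, version = _segments(uri)[:4]
    return PurePosixPath(base, account, group, artifact, version)


file_uri = (
    "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/"
    "2022.12.01/mappingbased-literals_lang=az.ttl.bz2"
)
print(classify_target(file_uri))
print(version_dir("data", file_uri))
```

A segment-count heuristic like this would misclassify any databus whose URIs carry an extra path prefix, so a real client would more plausibly ask the SPARQL endpoint what a URI denotes.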
