This repository contains everything related to tool discovery and software metadata in CLARIAH.
Dockerfile
: The Docker container for the CLARIAH Tool Discovery pipeline, including the harvester as well as the server and API powering the CLARIAH Tool Store.
source-registry/
: The tool source registry, which contains the source repository locations and service endpoints for all CLARIAH tools. It is open for contributions.
etc/, static/
: Supporting files for the deployment.
legacy/cmdi/
: Contains legacy CMDI metadata as gathered in WP3 task MD4T at Utrecht University.
The tool discovery service consists of a harvester that runs at regular intervals (every night) and a tool store. It is deployed at https://tools.clariah.nl (production, may not be available yet at this time!) and https://tools.dev.clariah.nl (development).
All harvested data is also available as individual files via https://tools.dev.clariah.nl/files/.
- Tool Discovery kanban board - Project planning
- CLARIAH Tool Discovery Presentation - Presented at CLARIAH Tech Day
For CLARIAH (local development):
docker build -t clariah-tool-discovery .
docker run -itd -p 8080:80 --env-file=local-dev.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery
We recommend also passing an extra --env GITHUB_TOKEN=.......... to docker run, or you will likely hit GitHub's API rate limit during harvesting. Similarly, you can pass a ZENODO_ACCESS_TOKEN.
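A minimal sketch of passing the tokens through from the host environment, assuming they are already exported in your shell. The helper name build_run_cmd is hypothetical and not part of this repository; it only prints the local-development command so you can inspect it before running. It relies on Docker copying the host's value into the container when --env NAME is given without a value.

```shell
# Hypothetical helper: prints the local-development docker run command,
# adding an --env flag only for each token that is set in the current shell.
build_run_cmd() {
  extra_env=""
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    extra_env="$extra_env --env GITHUB_TOKEN"
  fi
  if [ -n "${ZENODO_ACCESS_TOKEN:-}" ]; then
    extra_env="$extra_env --env ZENODO_ACCESS_TOKEN"
  fi
  echo "docker run -itd -p 8080:80 --env-file=local-dev.env$extra_env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery"
}

# Print the command for the current environment (nothing is started here).
build_run_cmd
```

Because the token values never appear on the command line, they stay out of your shell history.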
More generic:
docker build -t codemeta-server-tool --build-arg nginx_pass=some_password .
docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped codemeta-server-tool
To use local YAML files for source harvesting (rather than a remote git repo), add -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ to docker run and set LOCAL_SOURCE_REGISTRY=true in my-env.env.
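The two steps for harvesting from a local registry could look roughly like this, assuming the generic image name and env file from the commands above. The docker run line is only printed, not executed, so you can adjust ports and names first.

```shell
# Append the switch to the env file (the file is created if it does not exist yet).
echo 'LOCAL_SOURCE_REGISTRY=true' >> my-env.env

# Print the matching docker run command with the extra bind mount for the
# local source-registry/ directory. Run it yourself once the image is built.
echo "docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv -v codemeta_volume:/tool-store-data -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ --restart=unless-stopped codemeta-server-tool"
```

Run this from the root of the repository so that $PWD/source-registry/ points at the registry directory.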
Event-based collection, i.e. allowing clients to push codemeta files, can be enabled by setting --env-arg UPLOADER=true. You can then POST your codemeta.json file with curl -u <nginx-user> -X POST -H "Content-Type: application/json" -d @codemeta.json <url>/rest/
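As an illustration of pushing a file to the uploader, here is a minimal sketch. The file name example-codemeta.json, its contents, and the helper post_codemeta are all hypothetical; a real codemeta.json will carry more properties. Note that curl's -d option reads the request body from a file only when the file name is prefixed with @.

```shell
# Hypothetical minimal CodeMeta description, written to a separate file
# so that an existing codemeta.json is not overwritten.
cat > example-codemeta.json <<'EOF'
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "example-tool",
  "codeRepository": "https://example.org/example-tool.git"
}
EOF

# Hypothetical helper: POST a codemeta file to the uploader endpoint.
#   $1 = base URL of your deployment
#   $2 = nginx user (curl prompts for the password)
#   $3 = path to the codemeta file
post_codemeta() {
  curl -u "$2" -X POST -H "Content-Type: application/json" \
    -d @"$3" "$1/rest/"
}
```

A call would then look like post_codemeta "<url>" "<nginx-user>" example-codemeta.json, with the placeholders replaced by the values for your own deployment.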
For a private git repository, add -e GIT_USER='youruser' -e GIT_PASSWORD='yourtoken' to docker run.
To clean up, remove the volume codemeta_volume.
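A sketch of the clean-up, assuming the container name cm-srv used in the commands above. Docker refuses to remove a volume that is still attached to a container, so the container is stopped and removed first. The function name and the DOCKER override variable are assumptions of this sketch; setting DOCKER=echo gives a dry run.

```shell
# Hypothetical helper: stop and remove the container, then delete its data volume.
# Set DOCKER=echo to print the docker arguments instead of executing them.
cleanup_tool_store() {
  docker_cmd="${DOCKER:-docker}"
  "$docker_cmd" stop cm-srv
  "$docker_cmd" rm cm-srv
  "$docker_cmd" volume rm codemeta_volume
}
```

Removing the volume deletes all harvested data stored under /tool-store-data, so the next start begins with an empty tool store.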
If you want to query the Tool Store from other software, please read this document for instructions on how to use our SPARQL endpoint.