The CLARIAH tool discovery repository holds the tool source registry and the configuration for the software metadata harvesting pipeline.

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

CLARIAH Tool Discovery

This repository contains everything related to tool discovery and software metadata in CLARIAH.

  • Dockerfile: Builds the Docker image for the CLARIAH Tool Discovery pipeline, which includes the harvester as well as the server and API powering the CLARIAH Tool Store.
  • source-registry/: The tool source registry, containing the source repository locations and service endpoints for all CLARIAH tools. This is open for contributions.
  • etc/, static/: supporting files for the deployment at
  • legacy/cmdi/: Contains legacy CMDI metadata as gathered in WP3 task MD4T at Utrecht University

Service

The tool discovery service consists of a harvester that runs at regular intervals (every night) and a tool store. It is deployed at https://tools.clariah.nl (production; may not be available yet) and https://tools.dev.clariah.nl (development).

All harvested data is also available as individual files via https://tools.dev.clariah.nl/files/

Usage

For CLARIAH (local development):

docker build -t clariah-tool-discovery .
docker run -itd -p 8080:80 --env-file=local-dev.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery 

We recommend also passing an extra --env GITHUB_TOKEN=.......... ; otherwise you will likely hit GitHub's API rate limit during harvesting. Similarly, you can pass a ZENODO_ACCESS_TOKEN.
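To keep the tokens off the command line, one option is to export them once and forward them by name: `docker run --env NAME` with no `=value` copies the variable from the calling environment. The sketch below assumes a POSIX shell; the helper name `token_flags` is made up for illustration and is not part of this repository.

```shell
# Hypothetical helper (not part of this repository): print one
# `--env NAME` flag for every harvesting token that is set in the
# calling shell. With no `=value`, docker copies the value from the
# environment, so the variables must be exported.
token_flags() {
    for name in GITHUB_TOKEN ZENODO_ACCESS_TOKEN; do
        # Indirect lookup of the variable called $name (POSIX sh compatible)
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s %s ' --env "$name"
        fi
    done
}

# Usage, based on the local development command above:
#   export GITHUB_TOKEN=...
#   docker run -itd -p 8080:80 --env-file=local-dev.env $(token_flags) \
#       --name=cm-srv -v codemeta_volume:/tool-store-data \
#       --restart=unless-stopped clariah-tool-discovery
```

The command substitution is left unquoted on purpose so that the shell splits the output into separate arguments.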

More generic:

docker build -t codemeta-server-tool --build-arg nginx_pass=some_password .
docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped codemeta-server-tool 

To harvest from local YAML source files (rather than a remote git repo), add -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ to docker run and set LOCAL_SOURCE_REGISTRY=true in my-env.env.
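These two steps can be sketched as a small script. The helper name `enable_local_registry` is hypothetical; the mount path and variable name are the ones given above. The sketch assumes that an existing env file ends with a newline.

```shell
# Hypothetical helper: make sure an env file sets LOCAL_SOURCE_REGISTRY.
# A line is appended only if the file does not set the variable yet;
# an existing value is left untouched.
enable_local_registry() {
    env_file="$1"
    touch "$env_file"
    if ! grep -q '^LOCAL_SOURCE_REGISTRY=' "$env_file"; then
        echo 'LOCAL_SOURCE_REGISTRY=true' >> "$env_file"
    fi
}

# Usage:
#   enable_local_registry my-env.env
#   docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv \
#       -v codemeta_volume:/tool-store-data \
#       -v "$PWD/source-registry/:/usr/src/source-registry/source-registry/" \
#       --restart=unless-stopped codemeta-server-tool
```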

Event-based collection, i.e. allowing clients to push codemeta files, can be enabled by setting --env UPLOADER=true. You can then POST your codemeta.json file with curl -u <nginx-user> -X POST -H "Content-Type: application/json" -d @codemeta.json <url>/rest/
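As a sketch, the upload can be wrapped in a small function that fails early when the file is missing. The function name `push_codemeta` and its argument order are assumptions for illustration. `-d @file` makes curl send the contents of the file, `--fail` makes it return a non-zero status on an HTTP error, and `-u user` without a password makes it prompt for one.

```shell
# Hypothetical helper: POST a codemeta file to the uploader endpoint.
#   $1  base URL of the deployment
#   $2  nginx user name (curl prompts for the password)
#   $3  file to upload, defaults to codemeta.json
push_codemeta() {
    base_url="$1"
    user="$2"
    file="${3:-codemeta.json}"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    # ${base_url%/} drops one trailing slash so the path is not doubled
    curl --fail -u "$user" -X POST \
        -H "Content-Type: application/json" \
        -d @"$file" "${base_url%/}/rest/"
}

# Usage:
#   push_codemeta <url> <nginx-user> codemeta.json
```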

For a private git repo, add -e GIT_USER='youruser' -e GIT_PASSWORD='yourtoken' to docker run.

To clean up, remove the volume codemeta_volume.
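A possible cleanup sequence, assuming the container and volume names used in the commands above: Docker refuses to remove a volume that is still referenced by a container, even a stopped one, so the container is removed first. The helper name `cleanup_tool_store` is made up for this sketch. Note that removing the volume deletes all harvested data stored in it.

```shell
# Hypothetical helper: remove the container, then its data volume.
#   $1  container name, defaults to cm-srv
#   $2  volume name, defaults to codemeta_volume
cleanup_tool_store() {
    container="${1:-cm-srv}"
    volume="${2:-codemeta_volume}"
    # Ignore a failure here: the container may already be gone
    docker rm -f "$container" || true
    # The function's exit status is that of the volume removal
    docker volume rm "$volume"
}

# Usage:
#   cleanup_tool_store
```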

Integration: API usage instructions

If you want to query the Tool Store from other software, please read this document for instructions on how to use our SPARQL endpoint.