Proof-of-concept for an open source software catalog based on the publiccode.yml standard.
flowchart TB
subgraph OSS_CATALOG[OSS Catalog]
direction LR
db[(PostgreSQL)]
api[Developers Italia API]
crawler[Publiccode Crawler]
web[OSS Catalog Web App]
api --> db
crawler -- "Read publishers<br>Create softwares<br>(REST)" --> api
web -- "CRUD publishers<br>Read softwares<br>(REST)" --> api
web -. "Trigger crawling after publisher update?" .-> crawler
end
user((User))
admin((Admin))
github[GitHub, GitLab, others]
user -- "Read softwares<br>(HTTP)" --> web
crawler -- "Scan repositories<br>Parse publiccode.yml" --> github
admin -- "CRUD publishers<br>(HTTP)" --> web
What you will do:
- Clone this repo
- Generate PASETO key & create GitHub API token
- Start API service (and DB)
- Start catalog web application
- Add publisher(s) via catalog web application
- Run crawler
- Observe the collected software in the catalog web application
- Build and run the new Astro-based client
Clone this repository:
git clone git@github.com:puzzle/oss-catalog.git
cd oss-catalog/
git submodule init
git submodule update
Generate PASETO key:
./paseto/generate-paseto-key.sh
Create a GitHub API token with the public_repo permission under https://github.com/settings/tokens and add it to the .env file:
echo "GITHUB_TOKEN=<your access token>" >> .env
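Before starting the services, it can help to confirm that the token actually landed in the .env file. The following is a minimal sketch; `has_env_var` is a hypothetical helper that is not part of this repository, and it only assumes that the file holds plain `NAME=value` lines like the one written by the `echo` command above.

```shell
# has_env_var FILE NAME
# Returns 0 if FILE exists and contains a non-empty "NAME=value" line.
# Hypothetical helper for illustration; not part of this repository.
has_env_var() {
  local file="$1"
  local name="$2"
  [ -f "$file" ] && grep -Eq "^${name}=.+" "$file"
}
```

For example, `has_env_var .env GITHUB_TOKEN || echo "GITHUB_TOKEN missing from .env" >&2` prints a warning when the entry is absent or empty.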
Start API with database:
./start-api
Generate PASETO token (valid for 24h):
source .env
cd paseto/go
PASETO_TOKEN="$(go run paseto-generate.go $PASETO_KEY)"
List publishers (no authentication needed):
curl http://localhost:3000/v1/publishers
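Because this endpoint needs no authentication, it can also serve as a simple readiness check after starting the API. The sketch below assumes the API may need a few seconds before it answers; `retry` is a hypothetical helper, not something shipped with this repository.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs are used up.
# Hypothetical helper for illustration; not part of this repository.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumes the API from ./start-api listens on localhost:3000):
# retry 30 1 curl -fsS -o /dev/null http://localhost:3000/v1/publishers
```

The `-f` flag makes curl exit non-zero on HTTP error responses, so the loop keeps waiting until the endpoint returns a successful status.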
Create a publisher:
curl -X POST -H "Authorization: Bearer $PASETO_TOKEN" -H "Content-Type: application/json" -d '{"codeHosting": [{"url": "https://github.com/swiss/", "group": true}], "description": "Swiss Government"}' http://localhost:3000/v1/publishers
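When adding several publishers, building the request body by hand gets error-prone. The sketch below wraps the payload from the command above in a hypothetical `publisher_payload` helper. It assumes that neither argument contains characters that need JSON escaping (quotes or backslashes), and it always sets `"group": true` as in the example above.

```shell
# publisher_payload URL DESCRIPTION
# Prints the JSON body for POST /v1/publishers with a single codeHosting entry.
# Hypothetical helper for illustration; does NOT escape quotes or backslashes.
publisher_payload() {
  printf '{"codeHosting": [{"url": "%s", "group": true}], "description": "%s"}\n' "$1" "$2"
}

# Example (assumes PASETO_TOKEN is set as described above):
# curl -X POST -H "Authorization: Bearer $PASETO_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d "$(publisher_payload "https://github.com/swiss/" "Swiss Government")" \
#   http://localhost:3000/v1/publishers
```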
repo-scanner/repos.txt contains a list of repositories to be added to the API.
Run the script:
cd repo-scanner/
nvm use
PASETO_TOKEN=$PASETO_TOKEN API_ENDPOINT=http://localhost:3000 npm start
Run the crawler. It crawls all repositories registered in the API, checks each one for a publiccode.yml, and adds those that have one to the database.
./start-crawler
Start the catalog client application:
./start-client
Or start outside of Docker in development mode:
cd client/
nvm use
npm install
npm run dev
Then visit http://localhost:4321
- Grab PASETO key from production, e.g. from your vault.
- Set PASETO_KEY environment variable
PASETO_KEY="<your paseto key>"
- Generate the PASETO token
cd paseto/go
PASETO_TOKEN="$(go run paseto-generate.go $PASETO_KEY)"
- Run the repository script (it's safe to push repos multiple times!)
cd repo-scanner/
nvm use
PASETO_TOKEN=$PASETO_TOKEN API_ENDPOINT=<your api endpoint> npm start
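Since these steps run against a production API, a small guard that fails early on missing settings may be worthwhile. This is only a sketch: `require_vars` is a hypothetical helper, and it should only be called with trusted variable names because it uses `eval` to look them up.

```shell
# require_vars NAME...
# Returns 1 and reports on stderr if any named variable is unset or empty.
# Hypothetical helper for illustration; pass trusted variable names only.
require_vars() {
  local name
  local value
  local missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running `require_vars PASETO_KEY || exit 1` before generating the token stops the script when the key was not exported.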
- Deleting a publisher or software takes a while to take effect in the database, as if the API performs the deletion asynchronously.
- publiccode.yml Standard
- publiccode.yml crawler for the software catalog of Developers Italia
- Fetches registered publishers from the Developers Italia API, crawls all their repositories, and feeds the publiccode.yml results into the Developers Italia API.
- publiccode.yml parser for Go by Developers Italia – Used by the publiccode.yml crawler
- Developers Italia API – Stores the results of the publiccode.yml crawler in a PostgreSQL db, runs at https://api.developers.italia.it/v1/software
- publiccode.yml Editor by Developers Italia – Web UI to conveniently edit publiccode.yml files
- Developers Italia website – Italy's OSS catalog (Jekyll site)
- Downloads crawled software from the Developers Italia API: https://github.com/italia/developers.italia.it/blob/main/scripts/get-software.js
- More publiccode.yml components by Developers Italia