A Node.js application for collecting GitHub statistics into a SQLite or PostgreSQL database.
This project was built with simplicity and ease of use in mind. We simply wanted GitHub data in a relational database that we could then visualise using Metabase (https://www.metabase.com/).
Install from npm as a global command
> npm i @zalando/roadblock -g
Run the roadblock command in an empty folder
> roadblock
This will generate a basic roadblock.json file which you can then modify:
{
  "github": {
    "token": "xxx",
    "url": "https://internal.gith.ub/api/v3"
  },
  "tasks": ["pre/*", "org/*", "repo/releases"],
  "orgs": ["My-Org", "Second-org"]
}
github.token: a GitHub token is required to access most data. Ensure that the token has read access to repo, repo:status, public_repo, read:org, read:user and read:discussion.
github.url: only required if you want to collect data from GitHub Enterprise.
tasks: specifies what data you want to collect. By default it is set to *, which runs all available tasks. Use wildcards like * or org/*, name specific tasks like repo/issues, or exclude specific tasks with !repo/profiles.
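The wildcard and exclusion rules for tasks can be illustrated with a small sketch. This is not roadblock's actual code: `selectTasks` and the `available` list are hypothetical names, and the rule that a list made up only of exclusions starts from every task is an assumption.

```javascript
// Hypothetical helper, not part of roadblock.
// Expands patterns such as "*", "org/*", "repo/issues" and "!repo/profiles"
// against a list of known task names.
function selectTasks(patterns, available) {
  // Turn a pattern with "*" wildcards into an anchored regular expression.
  const toRegExp = (pattern) => {
    const escaped = pattern
      .split("*")
      .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
      .join(".*");
    return new RegExp(`^${escaped}$`);
  };

  const includes = patterns.filter((p) => !p.startsWith("!")).map(toRegExp);
  const excludes = patterns
    .filter((p) => p.startsWith("!"))
    .map((p) => toRegExp(p.slice(1)));

  // Assumption: if only exclusions are given, start from every task.
  const matchesInclude = (name) =>
    includes.length === 0 || includes.some((re) => re.test(name));

  return available.filter(
    (name) => matchesInclude(name) && !excludes.some((re) => re.test(name))
  );
}

// A subset of the task names listed in this README.
const available = [
  "pre/organisations", "pre/calendar",
  "org/members", "org/repository",
  "repo/issues", "repo/profiles", "repo/releases",
];

console.log(selectTasks(["pre/*", "org/*", "repo/releases"], available));
console.log(selectTasks(["*", "!repo/profiles"], available));
```

The first call keeps every pre and org task plus repo/releases. The second keeps everything except repo/profiles.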
orgs: by default roadblock will attempt to collect from all orgs that the token has access to. To filter these or to query additional orgs, set them here. Use either *, orgname or !orgname.
Configuration values can also be passed from the command line to avoid storing tokens in clear text:
> roadblock github.token=YOURTOKEN
> roadblock orgs=["zalando","custom"]
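As a rough illustration of how key=value arguments like these could be merged into the configuration, here is a minimal sketch. `applyOverrides` is a hypothetical function and not roadblock's real parser. Treating values that parse as JSON as structured data is also an assumption.

```javascript
// Hypothetical sketch: merges "key.path=value" arguments into a config object.
function applyOverrides(config, args) {
  // Work on a copy so the original config is left untouched.
  const result = JSON.parse(JSON.stringify(config));

  for (const arg of args) {
    const eq = arg.indexOf("=");
    if (eq === -1) continue; // ignore anything that is not key=value

    const path = arg.slice(0, eq).split(".");
    const raw = arg.slice(eq + 1);

    // Assumption: values that parse as JSON (arrays, numbers) are used as such,
    // everything else is kept as a plain string.
    let value;
    try {
      value = JSON.parse(raw);
    } catch {
      value = raw;
    }

    // Walk (and create, if needed) the nested objects named by the dotted path.
    let target = result;
    for (const key of path.slice(0, -1)) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      target = target[key];
    }
    target[path[path.length - 1]] = value;
  }
  return result;
}

const base = { github: { token: "xxx" }, tasks: ["*"], orgs: [] };
const merged = applyOverrides(base, [
  "github.token=YOURTOKEN",
  'orgs=["zalando","custom"]',
]);
console.log(merged);
```

Most shells strip unquoted double quotes, so an array value may need to be wrapped in single quotes on the command line for the quotes and brackets to reach the program intact.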
The script takes between 10 and 20 minutes to run and stores the collected data in a SQLite database. You can also configure a PostgreSQL instance if needed.
The SQLite database and JSON summaries are stored in the folder where roadblock is invoked.
The task system in roadblock divides the different data collection tasks into 4 separate phases.
Tasks to collect initial data points, default is to collect configured organisations
- pre/organisations - Collects available organisations from the user and configured orgs
- pre/calendar - Creates a calendar table with years and months, helpful when querying data
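A calendar table is useful because joining activity data against it yields a row for every month, including months with no activity. The sketch below shows one way such rows could be generated. `buildCalendar` and the `year` and `month` field names are assumptions, not the schema roadblock actually creates.

```javascript
// Hypothetical sketch: one row per month between two years (inclusive).
function buildCalendar(startYear, endYear) {
  const rows = [];
  for (let year = startYear; year <= endYear; year++) {
    for (let month = 1; month <= 12; month++) {
      rows.push({ year, month });
    }
  }
  return rows;
}

const calendar = buildCalendar(2018, 2019);
console.log(calendar.length); // 24 rows: 2 years x 12 months
console.log(calendar[0]);     // { year: 2018, month: 1 }
```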
Tasks run for each separate organisation. Each task is passed an organisation object to process.
- org/members - collect all members of the organisations
- org/repository - collect all public, non-fork repositories on the organisation
- org/vulnerabilities - collect all security alerts from all repositories on the organisation
Tasks run for each collected repository.
- repo/collaborators - Collect all collaborators on a repository
- repo/commits - Collect all repository commits
- repo/contributions - Collect all contributions (summarised changes)
- repo/issues - Collect all issues on the repositories
- repo/profiles - Repository health / community profile
- repo/pullrequests - Repository pull requests
- repo/releases - All releases on repository
- repo/topics - Repository topics
Tasks to run after all org and repo data collection is completed
- post/export - Export organisation and repository stats to json files
- post/upstream - Collect upstream contribution stats from external repositories.
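The four phases described above (pre, org, repo, post) can be sketched as a simple runner. This is only an illustration of the ordering. `runPhases`, the task objects and the `context` shape are hypothetical, and the task bodies are stand-ins that make no network calls.

```javascript
// Hypothetical sketch of the four phases; not roadblock's actual task runner.
async function runPhases(tasks, context) {
  const byPhase = (phase) =>
    tasks.filter((task) => task.name.startsWith(`${phase}/`));
  const log = [];

  // Phase 1: pre tasks run once and collect the initial data points.
  for (const task of byPhase("pre")) {
    await task.run(context);
    log.push(task.name);
  }
  // Phase 2: org tasks run once per collected organisation.
  for (const org of context.orgs) {
    for (const task of byPhase("org")) {
      await task.run(context, org);
      log.push(`${task.name}:${org.name}`);
    }
  }
  // Phase 3: repo tasks run once per collected repository.
  for (const repo of context.repos) {
    for (const task of byPhase("repo")) {
      await task.run(context, repo);
      log.push(`${task.name}:${repo.name}`);
    }
  }
  // Phase 4: post tasks run once after all collection is complete.
  for (const task of byPhase("post")) {
    await task.run(context);
    log.push(task.name);
  }
  return log;
}

// Stand-in tasks that only record organisations and repositories in memory.
const demoTasks = [
  { name: "pre/organisations", run: async (ctx) => { ctx.orgs.push({ name: "My-Org" }); } },
  { name: "org/repository", run: async (ctx, org) => { ctx.repos.push({ name: `${org.name}/demo` }); } },
  { name: "repo/issues", run: async () => {} },
  { name: "post/export", run: async () => {} },
];

const demoRun = runPhases(demoTasks, { orgs: [], repos: [] });
demoRun.then((log) => console.log(log));
```

In this sketch each phase only sees what earlier phases have collected, which is why the repo tasks can iterate over repositories added during the org phase.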
Clone this project to your local machine:
> git clone https://github.com/zalando-incubator/roadblock.git
> cd roadblock
Run npm install and start collecting data:
> npm install
What software you need to install:
- Node.js
- Metabase (optional) - to visualise the collected data
- Sequelize - Node.js ORM
- GhRequestor - GitHub client for fetching large amounts of data
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
This project is licensed under the MIT License - see the LICENSE.md file for details