Gitbi is a lightweight BI web app that uses a git repository as the source of saved queries, visualizations, and dashboards. Everything is stored as text files that can be easily accessed and edited outside of the app (in addition to using the web interface).
Features:
- You can write queries using either SQL or PRQL
- Interactive visualizations with ECharts
- Mailing reports and alerts
- Currently supported DBs: clickhouse, duckdb (query csv files), postgresql, sqlite
Test it now with sample dbs and config:
docker run -p 8000:8000 pieca/gitbi:latest
Or view screenshots.
See full deployment example: ppatrzyk/gitbi-example.
Gitbi requires a git repository and a configuration file to run.
The repository needs to have the following structure:
- directories in repo root refer to databases
- files in each directory are queries/visualizations to be run against respective database
- files with `.sql` or `.prql` extension are queries (see the example query after this list)
- (optional) files with `.json` extension are saved visualizations
- (optional) special directory `_dashboards` that contains dashboard specifications (`.json` format)
- (optional) README.md file content will be displayed on the Gitbi main page
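For instance, a query file such as db1/query1.sql could contain nothing more than a plain SQL statement (the table and column names below are hypothetical placeholders):

-- db1/query1.sql (hypothetical example)
SELECT country, count(*) AS customer_count
FROM customers
GROUP BY country
ORDER BY customer_count DESC;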
Gitbi uses a YAML configuration file to manage all settings. The application provides a template file app/config.yaml.template that you can use as a starting point. You need to:
- Copy the template file to create your own configuration:
  cp app/config.yaml.template app/config.yaml
- Edit the `config.yaml` file with your actual settings.
You can also specify a different configuration file location using the GITBI_CONFIG_PATH environment variable.
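For example, assuming you keep the file at /etc/gitbi/config.yaml (an arbitrary path chosen for illustration):

# point Gitbi at a configuration file outside the default location
GITBI_CONFIG_PATH=/etc/gitbi/config.yaml ./start_app.sh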
Example configuration file:
# Repository configuration
repo:
  dir: /path/to/your/repo  # Repository directory path

# Database configuration
databases:
  db1:
    type: postgres
    connection_string: "postgresql://user:password@localhost:5432/db1"
  db2:
    type: clickhouse
    connection_string: "clickhouse://user:password@localhost:9000/db2"
  db3:
    type: sqlite
    connection_string: "/path/to/db3.sqlite"
  db4:
    type: duckdb
    connection_string: "/path/to/db4.duckdb"

# Email configuration (optional)
email:
  smtp_user: your_smtp_user
  smtp_pass: your_smtp_pass
  smtp_url: smtp.example.com:587
  smtp_email: your_email@example.com

# Authentication configuration (optional)
auth:
  users:
    - user1:password1
    - user2:password2

The following database types are supported:
| Type | Connection string format |
|---|---|
| clickhouse | clickhouse://[login]:[password]@[host]:[port]/[database] |
| duckdb | path to db file (or :memory:) |
| postgres | postgresql://[userspec@][hostspec][/dbname][?paramspec] |
| sqlite | path to db file (or :memory:) |
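Because duckdb can query CSV files directly, a query saved for a duckdb database can read a CSV without any import step. A minimal sketch (the file path is a placeholder; read_csv_auto is duckdb's built-in CSV reader):

-- read a CSV file directly with duckdb
SELECT *
FROM read_csv_auto('/path/to/data.csv')
LIMIT 10;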
Assume you have a repository with the following structure:
repo
├── _dashboards
│ └── my_dashboard.json
├── db1
│ ├── query1.sql
│ ├── query2.sql
│ └── query2.sql.json
├── db2
│ ├── query3.sql
│ ├── query3.sql.json
│ ├── query4.sql
│ └── query5.sql
└── README.md
There are 2 databases named db1 and db2. db1 has 2 queries, one of which also has a visualization; db2 has 3 queries, one of which has a visualization. There is also one dashboard called my_dashboard.json.
For configuration you'd need to set up the following in your config.yaml:
repo:
  dir: /path/to/your/repo

databases:
  db1:
    type: postgres
    connection_string: "postgresql://user:password@localhost:5432/db1"
  db2:
    type: clickhouse
    connection_string: "clickhouse://user:password@localhost:9000/db2"

A visualization is a JSON file with the following format:
{
  "type": "scatter|line|bar|heatmap",
  "xaxis": <column_name>,
  "yaxis": <column_name>,
  "zaxis": <column_name>,
  "group": <column_name>
}
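For example, db1/query2.sql.json from the repository above could look like this; the column names are hypothetical and must match columns returned by query2.sql, and zaxis is omitted on the assumption that it is only needed for heatmap-type charts:

{
  "type": "line",
  "xaxis": "month",
  "yaxis": "revenue",
  "group": "region"
}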
A dashboard is a JSON file with the following format; the list can have any number of entries:
[
  [
    "<db_name>",
    "<query_file_name>"
  ],
  [
    "<db_name>",
    "<query_file_name>"
  ],
  ...
]
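Using the example repository above, _dashboards/my_dashboard.json could combine queries from both databases, for instance (assuming query file names are given with their extension, as in the format above):

[
  ["db1", "query1.sql"],
  ["db2", "query3.sql"]
]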
For every query, there is a report endpoint that provides query results in html and text formats, as well as data-only csv and json. This endpoint accepts two optional query parameters:
- mail: if this is not empty, Gitbi will send an email with the result to the specified address
- alert: if this is not empty, Gitbi will send results via email only if some rows are returned. Write your alert queries so that they usually return nothing; when they do return rows, you get notified. This parameter makes sense only together with mail.
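For example, an alert query might check for recent failures and return rows only when there is something to report (the jobs table, its columns, and the sqlite-style datetime function are all hypothetical):

-- returns rows only when a job failed within the last hour
SELECT id, name, finished_at
FROM jobs
WHERE status = 'failed'
  AND finished_at > datetime('now', '-1 hour');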
Scheduling is possible via any external service, most commonly CRON. Example:
# Report sent at 6am
0 6 * * * curl -s -u <user>:<password> "<report_url>?mail=<your_email>"
# Alert checked every minute
* * * * * curl -s -u <user>:<password> "<report_url>?alert=true&mail=<your_email>"
If you don't want to set up email credentials in Gitbi, you can still use CRON to send reports with other tools. Examples:
# HTML report via sendmail
* * * * * echo -e "Subject: Gitbi report\nContent-Type: text/html\n\n$(curl -s -u <user>:<password> <report_url>)" | /usr/sbin/sendmail -f <sender_email> <recipient_email>
# HTML report via mailgun api
* * * * * curl -X POST --user "api:<mailgun_api_key>" --data-urlencode from=<sender_email> --data-urlencode to=<recipient_email> --data-urlencode subject="Gitbi report" --data-urlencode html="$(curl -s -u <user>:<password> <report_url>)" https://api.eu.mailgun.net/v3/SENDER_DOMAIN/messages
You can copy report_url from every query page.
The easiest way to run Gitbi is to set up a repository on the same server where the app is running, and then push changes into it over ssh from your local clone. This requires setting proper permissions for everything to work smoothly. Example setup:
# initialize as shared repo
# the command below allows any user in group <GROUP> to push into repo, for other options see https://git-scm.com/docs/git-init
git init --shared=group <REPO_NAME>
chgrp -R <GROUP> <REPO_NAME>
chmod g+rwxs <REPO_NAME>
# enable pushing to checked out branch
git -C <REPO_NAME> config receive.denyCurrentBranch updateInstead
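On your workstation you can then clone the repository over ssh and push changes back; because the checked-out branch is updated in place, Gitbi sees them immediately (host and path below are placeholders):

# clone the server repository locally
git clone ssh://<user>@<server>/path/to/<REPO_NAME>
cd <REPO_NAME>
# add or edit queries, visualizations and dashboards, then publish them
git add .
git commit -m "add new query"
git push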
# install dependencies
pip3 install -r requirements.txt
# run with sample repo
./start_app.sh
# run with testing repo
GITBI_REPO_DIR="./tests/gitbi-testing" GITBI_SQLITE_CONN="$(realpath ./tests/gitbi-testing/db.sqlite)" GITBI_SQLITE_TYPE=sqlite ./start_app.sh
# build image
docker build -t pieca/gitbi:<version> .
Check also these related tools:
- generate static html reports using Python: merkury
- create custom dashboards using SQL and markdown: evidence
- analyze single sqlite db: datasette
- run SQL queries from your browser: sqlpad
- full-blown BI solution: metabase
Backend:
Frontend: