This project has been superseded by Promscale. Like this project, Promscale allows easy storage of Prometheus metrics in TimescaleDB + Postgres, but also offers: automatic partitioning, native compression (typically 95% storage savings), native data retention policies, full SQL and PromQL, and more.
You can find the new project at https://github.com/timescale/promscale and more details can be found in the design document.
This project will continue only in maintenance mode.
Adapter to connect Prometheus' remote write endpoint to a PostgreSQL server, preferably running TimescaleDB. Caches labels for each timeseries to reduce writes, linking them to samples by metric fingerprint.
This adapter was inspired by the Timescale PostgreSQL adapter and maintains a compatible schema, so queries may be used with either, but this adapter does not require the `pg_prometheus` extension, making it compatible with Aurora PostgreSQL, Azure Database for PostgreSQL, and other managed PostgreSQL services.
While it is possible to use this adapter and most of the schema without TimescaleDB, it will become difficult to prune older data, compression will not be available, and queries will be slower. If you can use TimescaleDB, please do.
- batteries included
  - example queries for alerting, reports, and schema metadata
  - Grafana dashboards for Kubernetes workloads, hardware metrics, and schema metadata
  - Jupyter notebooks for long-term reporting
  - PostgreSQL server image with schema and grant setup scripts
  - Prometheus rules for derived metrics
- compatible schema
  - query compatible with Timescale's official `pg_prometheus` schema
  - does not require the `pg_prometheus` extension
  - does not require superuser or extension privileges
- efficient storage
  - hashed & cached label IDs
  - samples in compressed hypertable
  - uses Go SQL and bulk copy for samples
- create the schema:
  - deploy `kubernetes/server.yml`
  - or run `docker run --rm -p 5432:5432 ssube/prometheus-sql-adapter:master-postgres-11 -c 'shared_preload_libraries=timescaledb'`
  - or run `./scripts/schema-create.sh [license-level] [retain-live] [retain-total]` against an existing database
- configure adapters:
  - create a role for each set of adapters to write
  - run `./scripts/schema-grant.sh [role-name] adapter`
  - deploy `kubernetes/adapter.yml`
- configure Grafana:
  - create a role for each Grafana instance to read
  - run `./scripts/schema-grant.sh [role-name] grafana`
  - add a Postgres data source
  - import dashboards from `grafana/`
- configure humans:
  - create a role for each developer to read
  - run `./scripts/schema-grant.sh [role-name] human`
  - show off your sweet new graphs
The schema scripts are idempotent and safe to run repeatedly, including `schema-create.sh`.
Non-breaking upgrades can be performed by running the schema scripts again, in the same order.
This schema is compatible with the Timescale `pg_prometheus` adapter, but does not require the `pg_prometheus` extension or `SUPERUSER` permissions.
Captured labels and samples are split into two tables: labels are stored once per timeseries and identified by their FNV-1a hash, which the samples reference.
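As a rough sketch of that split (the column and table names here are illustrative, not the actual DDL, which lives in the schema scripts):

```sql
-- Labels stored once per timeseries, keyed by the FNV-1a fingerprint (lid).
-- Names are illustrative; see the schema scripts for the real DDL.
CREATE TABLE metric_labels (
  lid    uuid PRIMARY KEY,  -- 64-bit FNV-1a fingerprint, widened into a UUID
  name   text NOT NULL,     -- metric name (__name__)
  labels jsonb NOT NULL     -- full label set for the timeseries
);

-- Samples reference the label set only by lid, keeping rows narrow.
CREATE TABLE metric_samples (
  time  timestamptz NOT NULL,
  lid   uuid NOT NULL,
  value double precision
);

-- With TimescaleDB, the samples table becomes a (compressible) hypertable.
SELECT create_hypertable('metric_samples', 'time');
```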
Labels are stored once for each timeseries, using the metric's hashed fingerprint as the label ID (`lid`). This fingerprint is provided by the Prometheus SDK and uses the 64-bit FNV-1a hash, which is then stored as a UUID column. The remote write adapters each maintain an LRU cache of label IDs, and can be configured not to rewrite labels.
Using the metric's fingerprint provides a short, deterministic identifier for each label set, or timeseries. The adapters do not need to coordinate and can safely write in parallel, using an `ON CONFLICT` clause to skip or update existing label sets. While a numeric counter might be shorter than the current hash-as-UUID, it would require coordination between the adapters or within the database. The hashed `lid` avoids lookups when writing to an existing timeseries.
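Assuming an illustrative labels table keyed by `lid`, the parallel-safe write might look like:

```sql
-- Illustrative upsert: because the fingerprint is deterministic, any adapter
-- can write the same label set concurrently and conflicts can simply be
-- skipped (or updated, when rewriting labels is enabled).
INSERT INTO metric_labels (lid, name, labels)
VALUES ($1, $2, $3)
ON CONFLICT (lid) DO NOTHING;
-- or: ON CONFLICT (lid) DO UPDATE SET labels = EXCLUDED.labels
```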
Maximum time ranges may be enforced by the `metrics` view to limit the amount of raw data that can be fetched at once, but deduplication and aggregation typically need more context to determine the correct operators, and must happen later. The maximum range allowed by `metrics` is used for the compression policy as well, compacting data when it can no longer be queried directly. Finally, the view makes this schema compatible with the original `pg_prometheus` schema and the v0.1 schema (which featured a single `metrics` table with both value and labels).
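A hedged sketch of such a view, again with illustrative names and an assumed six-hour maximum range:

```sql
-- Illustrative compatibility view: joins narrow samples back to their labels
-- and caps the queryable range of raw data.
CREATE VIEW metrics AS
SELECT s.time, l.name, s.value, l.labels, s.lid
FROM metric_samples s
JOIN metric_labels l USING (lid)
WHERE s.time > now() - INTERVAL '6 hours';  -- illustrative maximum range
```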
Views within the schema are split into three groups: aggregated samples, materialized catalogs of the schema, and compatibility with the `pg_prometheus` schema.
Aggregate views are prefixed with `agg_` and use TimescaleDB's continuous aggregates to occasionally refresh a materialized view and aggregate samples into larger time buckets.
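One such aggregate could be sketched as follows; the view name, bucket width, and aggregated columns are assumptions, not the project's actual definitions:

```sql
-- Illustrative continuous aggregate: TimescaleDB refreshes the materialization
-- in the background, rolling raw samples up into five-minute buckets.
CREATE MATERIALIZED VIEW agg_samples_5m
WITH (timescaledb.continuous) AS
SELECT time_bucket('5 minutes', time) AS bucket,
       lid,
       avg(value) AS avg_value,
       max(value) AS max_value
FROM metric_samples
GROUP BY bucket, lid;
```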
Catalog views are prefixed with `cat_` and materialize expensive views of the metric labels, enriching them with collected metadata.
Compatibility views ensure the schema is fully compatible with the `pg_prometheus` extension's schema, despite slightly different underlying storage.