Promscale - Add Overview & Getting Started (github#478)
* File setup

* Index metadata

* Add features

* Placeholder for data types

* Update the about page with new positioning and features

* initial draft

* Index

* Metadata

* Move dupe install page

* Install index

* prom-migrator

* Moving content

* fix promscale section rendering issue in index.js

* Docker

* Merge branch 'latest' into promscale-overview-lana

* Add missing /procedures

* Helm

* Add missing prometheus helm chart procedure

* Source

* Move tutorial content to top level

* Update index

* comma

* benefits

* How it works -> about

* Update index

* Update index

* run queries - part 1

* Fixed FIXME's & added content to tobs.md

* Run queries

* Move grafana content to new viz chapter

* tobs

* Changes per feedback

* Changes per feedback

* Apply suggestions from code review

Co-authored-by: James Guthrie <JamesGuthrie@users.noreply.github.com>

* Update index content to match our new positioning

* iterate over review comments

* add tobs from 605 PR

* Keep only the overall benefits section

* add tobs from 605 PR

* Move tobs concept content up a level

* Move tobs content

* Reflow tobs

* Update index

* Edits to Helm section

* fix nit in helm

* Apply suggestions from code review

Co-authored-by: James Guthrie <JamesGuthrie@users.noreply.github.com>

* Apply suggestions from code review

* Remove outdated introduction to Promscale. What Promscale is is already covered in the about page

* Update install page structure and improve instructions

* Iterating keenly by reading everything

* tyop

* index fixes

* install index

* Move config to own folder

* config

* docker

* Prometheus

* k8s

* prom-migrator

* tobs

* fix the local build issues

* Fix important note style

* fix tobs outline links

* Minor updates to Promscale initial page

* Several additional updates to complete the review

* Update promscale/page-index/page-index.js

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/page-index/page-index.js

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/query-data.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/index.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/opentelemetry.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Update promscale/send-data/prometheus.md

Co-authored-by: Lana Brindley <github@lanabrindley.com>

* Small consistency change

Co-authored-by: Ramon Guiu <ramon@timescale.com>
Co-authored-by: Vineeth Pothulapati <vineethpothulapati@outlook.com>
Co-authored-by: James Guthrie <JamesGuthrie@users.noreply.github.com>
4 people authored Dec 21, 2021
1 parent 64026dc commit 2bd2294
Showing 25 changed files with 1,511 additions and 1,078 deletions.
224 changes: 224 additions & 0 deletions promscale/about-promscale.md
@@ -0,0 +1,224 @@
# About Promscale
Promscale is an open source observability backend for metrics and traces
powered by SQL.

It's built on the robust and high-performance foundation of PostgreSQL and
TimescaleDB. It has native support for Prometheus metrics and OpenTelemetry
traces, and supports many other formats, such as StatsD, Jaeger, and Zipkin,
through the OpenTelemetry Collector. It is
[100% PromQL compliant][promlabs-test]. Its full SQL capabilities enable
developers to correlate metrics, traces, and business data to derive valuable
insights that are not possible when data is siloed in different systems. It
integrates with Grafana and Jaeger for visualizing metrics and traces.

Because it is built on top of PostgreSQL and TimescaleDB, it inherits rock-solid
reliability, native compression of up to 90%, continuous aggregates, and the
operational maturity of a system that runs on millions of instances worldwide.

For the Promscale source code, see our [GitHub repository][gh-promscale].

If you have any questions, join the `#promscale` channel on the
[TimescaleDB Community Slack][slack].

## Architecture
Promscale includes two components:

**Promscale Connector**: a stateless service that provides the ingest interfaces
for observability data, processes that data and stores it in TimescaleDB. It
also provides an interface to query the data with PromQL. The Promscale
Connector automatically sets up the data structures in TimescaleDB to store the
data and handles changes in those data structures if required for
upgrading to newer versions of Promscale.

**TimescaleDB**: the Postgres-based database where all the observability data is
stored. It offers a full SQL interface for querying the data as well as advanced
capabilities like analytical functions, columnar compression and continuous
aggregates. TimescaleDB offers a lot of flexibility to also store business and
other types of data that you can then use to correlate with observability data.

<img class="main-content__illustration" src="https://s3.amazonaws.com/assets.timescale.com/docs/images/promscale-arch.png" alt="Promscale architecture diagram"/>

The Promscale Connector ingests Prometheus metrics, metadata, and OpenMetrics
exemplars using the Prometheus `remote_write` interface, and ingests
OpenTelemetry traces using the OpenTelemetry protocol (OTLP). To ingest metrics
and traces in other formats, you can use the OpenTelemetry Collector to process
them and send them to Promscale over the Prometheus `remote_write` interface or
the OpenTelemetry protocol. For example, you can use the OpenTelemetry Collector
to ingest Jaeger traces and StatsD metrics into Promscale.

For Prometheus metrics, the Promscale Connector exposes Prometheus API endpoints
for running PromQL queries and reading metadata. This allows you to connect
tools that support the Prometheus API, such as Grafana, directly to Promscale
for querying. It's also possible to send queries to Prometheus and have
Prometheus read data from Promscale using the Promscale Connector on the
`remote_read` interface.

For OpenTelemetry traces, there is a Jaeger storage plugin that implements the
interface for querying and retrieving traces. This allows you to visualize
traces stored in Promscale in Jaeger as well as Grafana by configuring a Jaeger
data source. In this case, Grafana queries Jaeger, which then queries Promscale.

You can also query metrics and traces in Promscale using SQL, which allows you
to use many different visualization tools that integrate with PostgreSQL. For
example, Grafana supports querying data in Promscale using SQL out of the box
through the PostgreSQL data source.

## Promscale PostgreSQL extension
Promscale works with the
[Promscale PostgreSQL extension][promscale-extension], which contains support
functions that improve the performance of Promscale. Promscale can run without
the extension installed, but installing it gives you better performance.

## Promscale schema for metric data
To achieve high ingest and query performance with efficient storage, Promscale
writes data in a format optimized for storage and querying in TimescaleDB.
Promscale translates data from the
[Prometheus data model][Prometheus native format] into a relational schema that
is optimized for TimescaleDB.

The basic schema uses a normalized design where time-series data is stored in
compressed hypertables. These tables have a foreign key to series tables that
are stored as regular PostgreSQL tables, and each series consists of a unique
set of labels.

For more information about compression, see the
[compression section][tsdb-compression]. For more information about hypertables,
see the [hypertables section][tsdb-hypertables].

### Metrics storage schema
Each metric is stored in a separate hypertable. In particular, the schema
decouples individual metrics, allowing for the collection of metrics with vastly
different cardinalities and retention periods. At the same time, Promscale
exposes simple, user-friendly views so that you do not have to understand this
optimized schema.

The latest chunk is kept uncompressed to serve as a high-speed query cache.
Older chunks are stored as compressed chunks. Promscale configures compression
with the `segment_by` column set to `series_id` and the `order_by` column set to
`time DESC`. These settings control how data is split into blocks of compressed
data. Each block can be accessed and decompressed independently.

These settings mean that a block of compressed data is always associated with a
single `series_id` and that the data is sorted by time before being split into
blocks. Each block is therefore associated with a fairly narrow time range. As a
result, access by `series_id` and time range is optimized even in compressed
form.
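
As a rough illustration of what these settings look like in TimescaleDB terms,
the following sketch applies equivalent compression options to a hypothetical
`cpu_usage` hypertable. Promscale configures compression for you, so this is not
something you need to run. The table name is an assumption, and the option names
are TimescaleDB's native compression settings rather than anything specific to
Promscale.

```sql
-- Illustration only: Promscale manages compression settings itself.
-- Assumes a hypertable named cpu_usage with series_id and time columns.
ALTER TABLE cpu_usage SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'series_id',
    timescaledb.compress_orderby = 'time DESC'
);
```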

For example, the hypertable for a metric named `cpu_usage` uses the following
schema:
```sql
CREATE TABLE cpu_usage (
    time TIMESTAMPTZ,
    value DOUBLE PRECISION,
    series_id BIGINT
);
-- Covering index: including value lets lookups by series and time
-- be served from the index where possible.
CREATE INDEX ON cpu_usage (series_id, time) INCLUDE (value);
```

```
  Column   |       Type       | Modifiers
-----------+------------------+-----------
 time      | TIMESTAMPTZ      |
 value     | DOUBLE PRECISION |
 series_id | BIGINT           |
```

In this example, `series_id` is a foreign key to the `series` table described in the next section.
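
To make the foreign key relationship concrete, here is a minimal sketch of a
query that joins the example `cpu_usage` table to the `_prom_catalog.series`
table. It assumes the table and column names shown in this section. In a real
deployment you would normally query the Promscale views instead, so treat this
as an illustration of the schema rather than a recommended query.

```sql
-- Average value per series over the last hour.
-- Assumes the example cpu_usage table and the _prom_catalog.series table
-- as defined in this section.
SELECT s.id AS series_id,
       s.labels,
       avg(c.value) AS avg_value
FROM cpu_usage AS c
JOIN _prom_catalog.series AS s ON s.id = c.series_id
WHERE c.time > now() - INTERVAL '1 hour'
GROUP BY s.id, s.labels
ORDER BY avg_value DESC;
```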

### Series storage schema
Conceptually, each row in the series table stores a set of key-value pairs. In
Prometheus, a series like this is represented as a one-level JSON string, such
as `{ "key1":"value1", "key2":"value2" }`. But the strings representing keys and
values are often long and repetitive. So, to save space, Promscale stores a
series as an array of integer foreign keys to a normalized labels table.

The definitions of these two tables are:
```sql
CREATE TABLE _prom_catalog.series (
    id SERIAL,
    metric_id INT,
    labels INT[],
    UNIQUE (labels) INCLUDE (id)
);
-- GIN index to speed up searches for series that contain a given label ID.
CREATE INDEX series_labels_id ON _prom_catalog.series USING GIN (labels);

CREATE TABLE _prom_catalog.label (
    id SERIAL,
    key TEXT,
    value TEXT,
    PRIMARY KEY (id) INCLUDE (key, value),
    UNIQUE (key, value) INCLUDE (id)
);
```
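
The following sketch shows how the integer array could be resolved back into
key-value pairs, and how the GIN index supports finding series by label. It
relies only on the two table definitions in this section and standard PostgreSQL
array functions. The series ID `42` and the label `job="node"` are made-up
values for illustration.

```sql
-- Expand the labels array of one series into its key-value pairs.
SELECT s.id AS series_id, l.key, l.value
FROM _prom_catalog.series AS s
CROSS JOIN LATERAL unnest(s.labels) AS label_id
JOIN _prom_catalog.label AS l ON l.id = label_id
WHERE s.id = 42          -- hypothetical series ID
ORDER BY l.key;

-- Find all series that carry a given label. The array containment
-- operator (@>) is the kind of lookup a GIN index on labels supports.
SELECT s.id
FROM _prom_catalog.series AS s
WHERE s.labels @> ARRAY[(
    SELECT id
    FROM _prom_catalog.label
    WHERE key = 'job' AND value = 'node'   -- hypothetical label
)];
```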

### Promscale views
You interact with Prometheus data in Promscale through views, which Promscale
creates automatically for metrics and labels.

Each metric and label has its own view. You can see a list of all metrics by
querying the view named `metric`. Similarly, you can see a list of all labels by
querying the view named `label`. These views are found in the `prom_info`
schema.

Querying the `metric` view returns all metrics collected by Prometheus:
```SQL
SELECT *
FROM prom_info.metric;
```

Here is one row of sample output from this query:
```
id | 16
metric_name | process_cpu_seconds_total
table_name | process_cpu_seconds_total
retention_period | 90 days
chunk_interval | 08:01:06.824386
label_keys | {__name__,instance,job}
size | 824 kB
compression_ratio | 71.60883280757097791800
total_chunks | 11
compressed_chunks | 10
```

Each row in the `metric` view contains the metric `id`, as well as information
about the metric, such as its name, table name, retention period, compression
status, and chunk interval.

Promscale maintains isolation between metrics. This allows you to set retention
periods, downsampling, and compression settings on a per-metric basis, giving
you more control over your metrics data.
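
Because each metric is tracked separately, you can use the `metric` view to
decide which metrics need different settings. The sketch below lists the metrics
with the lowest compression ratio. It uses only the columns shown in the sample
output above, and assumes that the chunk counts are numeric.

```sql
-- List the ten metrics with the lowest compression ratio,
-- along with their retention settings and chunk counts.
SELECT metric_name,
       retention_period,
       chunk_interval,
       size,
       compression_ratio,
       total_chunks - compressed_chunks AS uncompressed_chunks
FROM prom_info.metric
ORDER BY compression_ratio ASC
LIMIT 10;
```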

Querying the `label` view returns all labels associated with metrics collected
by Prometheus:
```SQL
SELECT *
FROM prom_info.label;
```

Here is one row of sample output from this query:
```
key | collector
value_column_name | collector
id_column_name | collector_id
values | {arp,bcache,bonding,btrfs,conntrack,cpu,cpufreq,diskstats,edac,entropy,filefd,filesystem,hwmon,infiniband,ipvs,loadavg,mdadm,meminfo,netclass,netdev,netstat,nfs,nfsd,powersupplyclass,pressure,rapl,schedstat,sockstat,softnet,stat,textfile,thermal_zone,time,timex,udp_queues,uname,vmstat,xfs,zfs}
num_values | 39
```

Each label row contains information about a particular label, such as the label
key, the label's value column name, the label's ID column name, the list of all
values taken by the label, and the total number of values for that label.
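
One possible use of the `label` view is spotting labels with many distinct
values, which tend to drive series cardinality. This sketch uses only the
columns shown in the sample output above. The limit of 10 is arbitrary.

```sql
-- Show the labels with the most distinct values.
SELECT key, num_values
FROM prom_info.label
ORDER BY num_values DESC
LIMIT 10;
```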

For examples of querying a specific metric view, see
[Query data in Promscale][query-data].


[gh-promscale]: https://github.com/timescale/promscale
[slack]: https://slack.timescale.com
[promscale-extension]: https://github.com/timescale/promscale_extension#promscale-extension
[Prometheus native format]: https://prometheus.io/docs/instrumenting/exposition_formats/
[query-data]: promscale/:currentVersion:/query-data
[promlabs-test]: https://promlabs.com/promql-compliance-test-results/2021-10-14/promscale
[tsdb-compression]: timescaledb/:currentVersion:/how-to-guides/compression/
[tsdb-hypertables]: timescaledb/:currentVersion:/how-to-guides/hypertables/
69 changes: 32 additions & 37 deletions promscale/index.md
@@ -1,38 +1,33 @@
# Promscale
Promscale allows you to extract more meaningful insights from your metrics data.
It is an open source long-term store for Prometheus data designed for analytics.
Promscale is built on top of TimescaleDB, and is a horizontally scalable and
operationally mature platform for Prometheus data that uses PromQL and SQL to
allow you to ask any question, create any dashboard, and achieve greater
visibility into your systems.

Promscale has consistently been one of the only long-term stores for Prometheus
data that continues to maintain top performance. It received a 100% compliance
test score each time, with no cross-cutting concerns, from PromLab's [PromQL
Compliance Test Suite][promlabs].

For more information about Promscale, see our [blog post][promscale-blog], or
check out the [demo][promscale-demo]. If you have any questions, you can join
the Promscale channel on the [TimescaleDB Community Slack][slack].

## Promscale architecture
Prometheus writes data to the Promscale Connector using its `remote_write`
interface. The Connector writes data to TimescaleDB. PromQL queries can be
directed to the Connector, or to the Prometheus instance, which then reads data
from the Connector using the `remote_read` interface. The Connector then fetches
data from TimescaleDB. SQL queries can be directed to TimescaleDB directly.

<img class="main-content__illustration" src="https://s3.amazonaws.com/assets.timescale.com/docs/images/promscale-arch.png" alt="Promscale architecture diagram"/>

For a detailed description of this architecture, see our
[design document][design-doc].

For more documentation, see our [developer documentation][promscale-gh-docs].


[promscale-blog]: https://blog.timescale.com/blog/promscale-analytical-platform-long-term-store-for-prometheus-combined-sql-promql-postgresql/
[promscale-demo]: https://youtu.be/FWZju1De5lc
[slack]: https://slack.timescale.com/
[promlabs]: https://promlabs.com/promql-compliance-test-results/2020-12-01/promscale
[design-doc]: https://docs.google.com/document/d/1e3mAN3eHUpQ2JHDvnmkmn_9rFyqyYisIgdtgd3D1MHA/edit?usp=sharing
[promscale-gh-docs]: https://github.com/timescale/promscale/tree/master/docs
Promscale is the open source observability backend for metrics and traces
powered by SQL.

It is built on top of PostgreSQL and TimescaleDB and has native support for
Prometheus metrics (including 100% PromQL compliance) and OpenTelemetry traces.
Its full SQL capabilities enable developers to correlate metrics, traces, and
business data to derive valuable insights that are not possible when data is
siloed in different systems.

* [Learn about Promscale][about-promscale] to understand how it works before
you begin using it.
* [Learn about Promscale benefits][promscale-benefits] to understand how it
can be useful in your environment.
* [Learn about Promscale installation][install-promscale] to understand how
to install using source, Docker, and Kubernetes.
* [Learn about tobs][about-tobs] to understand how to install a complete
observability stack on Kubernetes.
* [Send metrics and traces][send-data] to Promscale.
* Use Promscale to [run queries][query-data].
* Use Promscale with [visualization tools][visualize-data].

For more about Promscale, see our [developer documentation][promscale-gh-docs].


[about-promscale]: promscale/:currentVersion:/about-promscale
[install-promscale]: promscale/:currentVersion:/installation
[promscale-benefits]: promscale/:currentVersion:/promscale-benefits/
[query-data]: promscale/:currentVersion:/query-data/
[visualize-data]: promscale/:currentVersion:/visualize-data/
[promscale-gh-docs]: https://github.com/timescale/promscale/
[about-tobs]: promscale/:currentVersion:/tobs/
[send-data]: promscale/:currentVersion:/send-data/
85 changes: 0 additions & 85 deletions promscale/install-promscale.md

This file was deleted.
