Add charting tutorial #21

Draft: wants to merge 4 commits into base: main
4 changes: 2 additions & 2 deletions src/content/docs/tutorials/map-data-to-ocsf.mdx
@@ -2,8 +2,8 @@
title: Map Data to OCSF
---

-In this tutorial you'll learn how to map events to [Open Cybersecurity Schema
-Framework (OCSF)](https://schema.ocsf.io). We walk you through an example of
+In this tutorial you'll learn how to **map events to [Open Cybersecurity Schema
+Framework (OCSF)](https://schema.ocsf.io)**. We walk you through an example of
events from a network monitor and show how you can use Tenzir pipelines to
easily transform them so that they become OCSF-compliant events.

376 changes: 376 additions & 0 deletions src/content/docs/tutorials/plot-data-with-charts.mdoc
@@ -0,0 +1,376 @@
---
title: Plot Data with Charts
---

In this tutorial, you will learn how to **use pipelines to plot data as charts**.

The Tenzir Query Language (TQL) excels at slicing and dicing data. But turning
tabular results into actionable insights often calls for visualization. This is
where charts come into play.

![Charts](./plot-data-with-charts/charts.svg)

## Available chart types

Tenzir supports four types of charts, each with a dedicated operator:

1. **Pie**: [`chart_pie`](../../reference/operators/chart_pie)
2. **Bar**: [`chart_bar`](../../reference/operators/chart_bar)
3. **Line**: [`chart_line`](../../reference/operators/chart_line)
4. **Area**: [`chart_area`](../../reference/operators/chart_area)

## How to plot data

Plotting data in the Explorer involves three steps:

1. [Run a pipeline](../../guides/usage/basics/run-pipelines) to prepare the
data.
2. Add a `chart_*` operator to render the plot.
3. View the chart below the Editor.

![Pipeline to Chart](./plot-data-with-charts/pipeline-to-chart.svg)
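
As a minimal sketch of the first two steps, the pipeline below prepares a few hand-written events and then hands them to a chart operator. The `service` and `events` fields are hypothetical placeholders, not part of any real data source.

```tql
// Step 1: prepare the data. The `service` and `events` fields are
// hypothetical placeholders for whatever your pipeline produces.
from {service: "dns", events: 42},
     {service: "http", events: 17},
     {service: "ssh", events: 5}
// Step 2: append a chart operator to render the result as a bar chart.
chart_bar x=service, y=events
```

Removing the last line returns the plain events again, which is a handy way to inspect the data before charting it.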

After generating a chart, you can **download it** or **add it to a dashboard**
to refresh it automatically.

### Download a chart

To download a chart:

{% steps %}

1. Click the download button in the top-right corner.

![Download Chart](./plot-data-with-charts/download-chart.png)

2. Choose **PNG** or **SVG** to save the chart as an image.

{% /steps %}

⬇️ You have now successfully saved the chart to your computer. Enjoy.

### Add a chart to a dashboard

To make a chart permanent:

{% steps %}

1. Click the **Dashboard** button.

![Add Chart to Dashboard](./plot-data-with-charts/add-chart-to-dashboard-1.png)

2. Enter a title for the chart, then click **Add to Dashboard**.

![Add Chart Title](./plot-data-with-charts/add-chart-to-dashboard-2.png)

3. View the chart in your dashboard.

![Chart in Dashboard](./plot-data-with-charts/add-chart-to-dashboard-3.png)

{% /steps %}

🎉 Congratulations! Your chart is now saved and will automatically reload when
you open the dashboard.

## Master essential charting techniques

Now that you know how to create charts, let us explore some common techniques to
enhance your charting skills.

### Plot counters as bar chart

A good use case for bar charts is visualization of counters of categorical
values, because comparing bar heights is an effective way to gain a relative
understanding of the data at hand.

{% steps %}

1. **Shape your data**: Suppose you want to create a bar chart showing the
outcomes of coin flips. First, generate a few observations:

```tql
from {}
repeat 20
set outcome = "heads" if random().round() == 1 else "tails"
summarize outcome, n=count()
```

Sample output:

```tql
{outcome: "tails", n: 9}
{outcome: "heads", n: 11}
```

2. **Plot the data**: Add the [`chart_bar`](../../reference/operators/chart_bar)
operator to visualize the counts.

Map the outcome and count fields to the x-axis and y-axis:

```tql
from {outcome: "tails", n: 9},
{outcome: "heads", n: 11}
chart_bar x=outcome, y=n
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

{% /steps %}
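
The two steps above can also be combined into a single pipeline, so that the chart is computed directly from freshly generated observations. This sketch only reuses operators and arguments already shown in this section; because the coin flips are random, the bar heights differ from run to run.

```tql
// Generate 20 coin flips and count the outcomes.
from {}
repeat 20
set outcome = "heads" if random().round() == 1 else "tails"
summarize outcome, n=count()
// Render one bar per outcome.
chart_bar x=outcome, y=n
```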

#### Group and stack bars

Sometimes, your data has a third dimension. You can **group** multiple series
into a single plot.

Example with a `time` dimension:

```tql
from {outcome: "tails", n: 9, time: "Morning"},
{outcome: "heads", n: 11, time: "Morning"},
{outcome: "tails", n: 14, time: "Afternoon"},
{outcome: "heads", n: 15, time: "Afternoon"},
{outcome: "tails", n: 4, time: "Evening"},
{outcome: "heads", n: 12, time: "Evening"}
chart_bar x=outcome, y=n, group=time
```

{% aside type="danger" title="TODO" %}
Insert grouped bar chart here.
{% /aside %}

To **stack** the grouped bars, add `position="stacked"`:

```tql
from {outcome: "tails", n: 9, time: "Morning"},
{outcome: "heads", n: 11, time: "Morning"},
{outcome: "tails", n: 14, time: "Afternoon"},
{outcome: "heads", n: 15, time: "Afternoon"},
{outcome: "tails", n: 4, time: "Evening"},
{outcome: "heads", n: 12, time: "Evening"}
chart_bar x=outcome, y=n, group=time, position="stacked"
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}
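
The grouped examples above start from counts that are already aggregated. If you begin with raw observations instead, one way to get the same shape is to group by both dimensions in `summarize` before charting. The sketch below uses a handful of made-up observations for illustration.

```tql
// Raw observations: one event per coin flip (made-up sample data).
from {outcome: "tails", time: "Morning"},
     {outcome: "heads", time: "Morning"},
     {outcome: "heads", time: "Morning"},
     {outcome: "tails", time: "Evening"}
// Count the flips per (outcome, time) pair.
summarize outcome, time, n=count()
// Plot one bar per outcome and time of day.
chart_bar x=outcome, y=n, group=time
```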

#### Scale the y-axis logarithmically

If your data spans several orders of magnitude, **log scaling** can make smaller
values visible.

Example without log scaling:

```tql
from {outcome: "A", n: 3},
{outcome: "B", n: 5},
{outcome: "C", n: 10},
{outcome: "D", n: 21},
{outcome: "E", n: 10000}
chart_bar x=outcome, y=n
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

The large value (`E`) dominates the chart, hiding the smaller categories.

Enable log scaling via `y_log=true` to reveal them:

```tql
from {outcome: "A", n: 3},
{outcome: "B", n: 5},
{outcome: "C", n: 10},
{outcome: "D", n: 21},
{outcome: "E", n: 10000}
chart_bar x=outcome, y=n, y_log=true
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

Now, you can clearly see all the values! 👀

{% aside type="caution" title="Interpreting Log-Scaled Plots" %}
Log scaling removes linearity. Comparing bar heights no longer reflects a simple
numeric ratio. Stacked values are not additive anymore.
{% /aside %}
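
To make the caution concrete, here is a worked example with categories `C` (10) and `E` (10000) from the data above. It assumes a base-10 axis whose baseline sits at 1; the renderer's actual baseline may differ, so treat the numbers as illustrative.

```latex
\begin{align*}
\frac{n_E}{n_C} &= \frac{10000}{10} = 1000
  && \text{(actual ratio)} \\
\frac{\log_{10} n_E}{\log_{10} n_C} &= \frac{4}{1} = 4
  && \text{(ratio of bar heights)} \\
\log_{10} a + \log_{10} b &= \log_{10}(a \cdot b) \neq \log_{10}(a + b)
  && \text{(in general, heights do not add up)}
\end{align*}
```

Under these assumptions, bar `E` appears only four times as tall as bar `C` even though its value is a thousand times larger.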

### Plot compositions as pie chart

Pie charts are well-understood and frequently occur in management dashboards.
Let's plot some synthetic data with the
[`chart_pie`](../../reference/operators/chart_pie) operator:

```tql
from {category: "A", percentage: 40},
{category: "B", percentage: 25},
{category: "C", percentage: 20},
{category: "D", percentage: 10},
{category: "E", percentage: 5}
chart_pie label=category, value=percentage
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

{% aside type="note" title="Bar Charts > Pie Charts" %}
**Use bar charts when you can and pie charts when you must**. Why? Pie charts
are often considered inferior to bar charts because they rely on human
perception of angles, which is less accurate than judging lengths, making
comparisons between categories harder. Bar charts allow for quick, precise
comparisons, handle many categories cleanly, and often do not require labels,
while pie charts become cluttered and confusing with numerous slices. Bar charts
also allow easy sorting, are more space-efficient, and tend to stay cleaner
without unnecessary visual distractions like 3D effects. Overall, bar charts
communicate data more clearly, accurately, and efficiently than pie charts.
{% /aside %}
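
Following that advice, the same synthetic data can be rendered as a bar chart by swapping the operator and mapping the fields to the axes. The `sort` step is optional; whether the chart preserves the input order on the x-axis is an assumption here.

```tql
from {category: "A", percentage: 40},
     {category: "B", percentage: 25},
     {category: "C", percentage: 20},
     {category: "D", percentage: 10},
     {category: "E", percentage: 5}
// Largest share first (assumes the chart keeps the input order).
sort -percentage
chart_bar x=category, y=percentage
```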

### Plot metrics as line chart

Line charts come in handy when visualizing data trends over a continuous scale,
such as time series data.

{% steps %}

1. **Shape your data**: For our line chart demo, we'll use some internal node
metrics provided by the [`metrics`](../../reference/operators/metrics) operator.
Let's look at the RAM usage of the node:

```tql
metrics "process"
drop swap_space, open_fds
head 3
```

```tql
{timestamp: 2025-04-27T18:16:17.692Z, current_memory_usage: 2363461632, peak_memory_usage: 4021136}
{timestamp: 2025-04-27T18:16:18.693Z, current_memory_usage: 2366595072, peak_memory_usage: 4021136}
{timestamp: 2025-04-27T18:16:19.694Z, current_memory_usage: 2385154048, peak_memory_usage: 4021136}
```

2. **Plot the data**: Add the
[`chart_line`](../../reference/operators/chart_line) operator to visualize
the time series. We are going to plot the memory usage within the last day:

```tql
metrics "process"
where timestamp > now() - 1d
chart_line x=timestamp, y=current_memory_usage
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

3. **Aggregate to reduce the resolution**: Plotting metrics at 1-second
granularity over a full day puts up to 86,400 samples into a single plot,
which makes the line chart noisy and hard to read. Let's reduce the noise by
aggregating the samples into 15-minute buckets:

```tql
metrics "process"
where timestamp > now() - 1d
set timestamp = timestamp.round(15min)
summarize timestamp, mem=mean(current_memory_usage)
chart_line x=timestamp, y=mem
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

This looks a lot smoother! You can go one step further and move the
aggregation into the chart operator call:

```tql
metrics "process"
where timestamp > now() - 1d
chart_line x=timestamp, y=mean(current_memory_usage), resolution=15min
```

Not only does this make the pipeline more succinct, it also takes care of
some subtle issues:

1. A
2. B
3. C

{% /steps %}
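
The effect of the 15-minute buckets can be quantified with a little arithmetic. This assumes exactly one sample per second over a full day, matching the sample output above; real data may contain gaps.

```latex
\begin{align*}
\text{samples per day} &= 24 \cdot 60 \cdot 60 = 86400 \\
\text{samples per bucket} &= 15 \cdot 60 = 900 \\
\text{buckets per day} &= \frac{86400}{900} = 96
\end{align*}
```

Each point on the aggregated line therefore summarizes about 900 raw samples, and the chart draws roughly 96 points instead of 86,400.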

#### Compare multiple series

Our metrics data not only includes the current memory usage but also peak usage.
Comparing these two in the same chart helps us understand potentially
dangerous spikes. Let's add that second series to the y-axis by upgrading from a
single value to a record that represents the series.

```tql
metrics "process"
where timestamp > now() - 1d
chart_line \
x=timestamp,
y={current: mean(current_memory_usage), peak: max(peak_memory_usage * 1Ki)},
resolution=15min
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

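The two fields use different units, so we cannot compare them directly. Judging
by the sample output above, `current_memory_usage` is reported in bytes, whereas
`peak_memory_usage` is a factor of 1,024 smaller (kibibytes). Hence we multiply
the peak usage by `1Ki` so that both series share the same unit in a single plot.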
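
As a quick sanity check of the conversion, the sketch below applies the `1Ki` factor to the first sample from the earlier output. The field name `peak_bytes` is made up for this example, and the reading of the raw values as bytes and kibibytes is an assumption based on their magnitudes.

```tql
// First sample from the metrics output shown earlier.
from {current_memory_usage: 2363461632, peak_memory_usage: 4021136}
// 4021136 * 1024 = 4117643264, the same order of magnitude as
// current_memory_usage. `peak_bytes` is a hypothetical field name.
set peak_bytes = peak_memory_usage * 1Ki
```

After the conversion the peak (about 3.8 GiB) sits above the current usage (about 2.2 GiB), which is what you would expect from a peak value.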

### Plot distributions as area chart

Area charts are fantastic for visualizing quantities that accumulate over a
continuous variable, such as time or value ranges. They are similar to line
charts but emphasize the volume underneath the line.

In the section about line charts above, you can replace every call to
[`chart_line`](../../reference/operators/chart_line) with
[`chart_area`](../../reference/operators/chart_area) and still get a working
plot.

```tql
from {time: 1, a: 10, b: 20},
{time: 2, a: 8, b: 25},
{time: 3, a: 14, b: 30},
{time: 4, a: 10, b: 25},
{time: 5, a: 18, b: 40}
chart_area x=time, y={a: a, b: b}
```

The area under the curve gives you a strong visual impression of the total
event volume over time.
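
For instance, swapping the operator in the aggregated memory pipeline from the line chart section would look like this. It is a sketch that assumes `chart_area` accepts the same `x`, `y`, and `resolution` arguments as `chart_line`, as the swap described above implies.

```tql
metrics "process"
where timestamp > now() - 1d
// Same arguments as the chart_line call, only the operator name changes.
chart_area x=timestamp, y=mean(current_memory_usage), resolution=15min
```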

#### Stack multiple series

Like bar charts, area charts can display **stacked series**. This means that the
values of the series add up, helping you *compare contributions* from different
groups while still highlighting the overall cumulative shape.

Pass `position="stacked"` to see the difference:

```tql
from {time: 1, a: 10, b: 20},
{time: 2, a: 8, b: 25},
{time: 3, a: 14, b: 30},
{time: 4, a: 10, b: 25},
{time: 5, a: 18, b: 40}
chart_area x=time, y={a: a, b: b}, position="stacked"
```

{% aside type="danger" title="TODO" %}
Insert chart here.
{% /aside %}

Notice the difference in the y-axis interpretation:

- Without stacking, the areas *overlap* each other.
- With stacking, the areas become *disjoint* and *cumulatively add up* to the
total height.
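
To see which total height to expect in the stacked chart, you can compute the sum of both series per point yourself. The `total` field is a helper introduced only for this illustration and is not needed for plotting.

```tql
from {time: 1, a: 10, b: 20},
     {time: 2, a: 8, b: 25},
     {time: 3, a: 14, b: 30},
     {time: 4, a: 10, b: 25},
     {time: 5, a: 18, b: 40}
// Stacked height per point: 30, 33, 44, 35, 58.
set total = a + b
```

In the stacked chart, the upper edge of the topmost area should follow these totals, whereas in the unstacked chart the highest point is simply the larger of `a` and `b`.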
2 changes: 2 additions & 0 deletions src/content/docs/tutorials/plot-data-with-charts/charts.svg
1 change: 1 addition & 0 deletions src/sidebar.ts
@@ -104,6 +104,7 @@ export const guides = [

export const tutorials = [
'tutorials/map-data-to-ocsf',
'tutorials/plot-data-with-charts',
'tutorials/write-a-package',
];
