This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Columnar encoding for the OpenTelemetry protocol #171

Merged: 142 commits, Jun 29, 2023

Commits
95da4ae
Intro + Motivation sections
lquerel Mar 14, 2021
aec7656
Move 0000-multivariate-timeseries.md in metrics folder
lquerel Mar 14, 2021
a3fbdb7
Add conclusion to the motivation section
lquerel Mar 14, 2021
6b550f4
Update 0000-multivariate-timeseries.md
lquerel Mar 14, 2021
3496ca1
Updated the "Explanation" section
lquerel Mar 21, 2021
56b3357
Added a diagram to present the data model
lquerel Mar 21, 2021
d760154
Added diagram
lquerel Mar 21, 2021
589dc8b
Update 0000-multivariate-timeseries.md
lquerel Mar 21, 2021
a516962
Examples of multivariate time-series
lquerel Mar 21, 2021
ecf45d2
Update 0000-multivariate-timeseries.md
lquerel Mar 21, 2021
58e9a53
Merge branch 'open-telemetry:main' into main
lquerel May 18, 2021
c809b1b
Update OTEP
lquerel May 18, 2021
150d987
Update OTEP
lquerel May 18, 2021
43340ea
Update OTEP
lquerel May 18, 2021
1448c91
Update OTEP
lquerel May 18, 2021
0db08ec
Update OTEP
lquerel May 18, 2021
f3739cf
Update OTEP
lquerel May 18, 2021
2218179
Merge branch 'open-telemetry:main' into main
lquerel Aug 10, 2021
ec813fe
Create OTEP 0156
lquerel Aug 10, 2021
8c34a91
Add columnar encoding benefits
lquerel Aug 10, 2021
e3c38db
Complete explanation section
lquerel Aug 10, 2021
68b4ed6
Create event.proto section
lquerel Aug 10, 2021
e25b3c6
Create internal details
lquerel Aug 10, 2021
07714b1
Update 0156-columnar-encoding.md
lquerel Aug 10, 2021
e82db1a
Add images for OTEP-0156
lquerel Aug 10, 2021
d9a621a
Add corner cases
lquerel Aug 10, 2021
8c7333e
Update first draft for review
lquerel Aug 10, 2021
91f9561
Remove initial proposal focusing only on the multivariate time-series…
lquerel Aug 10, 2021
c58a4ad
Merge remote-tracking branch 'origin/main'
lquerel Aug 10, 2021
52a85f7
Change Open Telemetry to OpenTelemetry
lquerel Aug 11, 2021
0b3f02d
Rephrases few sentences in the motivation section
lquerel Aug 11, 2021
e8c6ef1
Removed trailing-spaces to comply with markdown linter
lquerel Aug 11, 2021
dc74dd3
Fixed typo
lquerel Aug 11, 2021
fffce33
Fixed more markdown issues
lquerel Aug 11, 2021
72768c4
Add mapping OTEL metrics, logs, traces to Apache Arrow Schema
lquerel Aug 11, 2021
8695591
More explanation on the Arrow mapping and the memory layout.
lquerel Aug 11, 2021
9139969
Fixed markdown issues.
lquerel Aug 11, 2021
6499c63
Micro update to trigger the CLA checker again.
lquerel Aug 12, 2021
aa9c513
Update motivation section based on feedback from @jmacd
lquerel Aug 12, 2021
f0c6b90
Add some additional clarifications to @tigrannajaryan's feedback
lquerel Aug 13, 2021
fce05af
Span id and trace id are now nullable fields (as suggested by @jsuere…
lquerel Sep 18, 2021
33ea57d
Replaced label by attribute (as suggested by @jmacd and @jsuereth)
lquerel Sep 18, 2021
3d636d5
Fix markdown issues
lquerel Sep 21, 2021
19653e1
Add field declaration syntax based on @jmacd comment (https://github.…
lquerel Sep 21, 2021
81eb951
Merge branch 'main' into main
carlosalberto Nov 8, 2021
0590cb8
Update OTEP 0156 - Full protocol/mapping spec + Benchmark and more.
lquerel Jun 10, 2022
8449b67
Merge branch 'open-telemetry:main' into main
lquerel Jun 10, 2022
0be8ae3
Merge branch 'main' of https://github.com/lquerel/oteps
lquerel Jun 10, 2022
ab93d23
Fix markdown lint issue
lquerel Jun 11, 2022
f48c952
Fix markdown lint issue
lquerel Jun 11, 2022
0e6b9ff
Fix markdown lint issue
lquerel Jun 11, 2022
577b4bd
Merge branch 'open-telemetry:main' into main
lquerel Dec 27, 2022
5b62001
Rename EventStream into ArrowStream
lquerel Dec 27, 2022
ec0a623
Merge remote-tracking branch 'origin/main'
lquerel Dec 27, 2022
c0d134d
Update protobuf specification and corresponding documentation.
lquerel Dec 27, 2022
4f311f8
Fix markdown lint issues
lquerel Dec 27, 2022
4baab68
Update attribute representation section.
lquerel Dec 27, 2022
6b3d916
Update Metrics Payload section.
lquerel Dec 28, 2022
3cc3458
Update Logs and Traces Payload sections.
lquerel Dec 28, 2022
f24cb9c
Update Implementation Recommendations, Trade-offs and Mitigations, Pr…
lquerel Dec 28, 2022
efa0aad
Add links to the reference implementation.
lquerel Dec 28, 2022
b8cbf55
Update links to the reference implementation.
lquerel Dec 28, 2022
b3b00c0
Remove appendix D.
lquerel Dec 28, 2022
b4a95df
Add comment on data sharing.
lquerel Dec 28, 2022
8271a65
Fix markdown issues.
lquerel Dec 28, 2022
50557bc
Fix markdown issues.
lquerel Dec 28, 2022
e0dae99
Update Arrow Schemas based on Matt Topol feedback (Go Arrow committer).
lquerel Jan 6, 2023
321bbc5
Arrow Sparse vs Dense union.
lquerel Jan 7, 2023
50d4cd3
Use dictionary to represent small protobuf enum.
lquerel Jan 7, 2023
5f1e2d8
Fix navigation issue
lquerel Jan 7, 2023
9e38c47
Add a justification behind the compression field in the protobuf mess…
lquerel Jan 12, 2023
44b2743
Fix markdown-lint issue
lquerel Jan 12, 2023
198f114
Add link to ZSTD dictionary optimization.
lquerel Jan 12, 2023
778cf0a
Fix markdown-lint issue again...
lquerel Jan 12, 2023
e520b03
Add paragraph "unary RPC vs stream RPC"
lquerel Jan 12, 2023
3952c52
Improve Zero-copy argumentation
lquerel Jan 12, 2023
21cd665
Improve the argument around `otlp_arrow_payloads` which is a repeated…
lquerel Jan 13, 2023
ef2c09f
Improve RecordBatch description.
lquerel Jan 13, 2023
ed78a51
Improve paragraph "Dense vs Sparse union"
lquerel Jan 13, 2023
1524d84
Apply Joshua MacDonald's updates (thanks @jmacd)
lquerel Jan 13, 2023
f0dc753
Fix markdown lint issues.
lquerel Jan 13, 2023
8e65c10
Remove delivery_type and dictionaries attributes.
lquerel Jan 14, 2023
3f73d39
Specify compression algo in Validation section
lquerel Jan 17, 2023
c78c78e
Remove compression field from OtlpArrowPayload
lquerel Jan 18, 2023
20b4e30
Fix broken link (img)
lquerel Jan 18, 2023
c5c747c
Explain gains bandwidth+speed
lquerel Jan 21, 2023
87b5352
Fix markdown issue
lquerel Jan 21, 2023
d648aba
Fix markdown issue
lquerel Jan 21, 2023
942c98e
Fix markdown issue
lquerel Jan 21, 2023
d5e6a89
Fix terminology
lquerel Jan 23, 2023
4a10df3
Update charts with last ref. impl.
lquerel Feb 18, 2023
f0dd638
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
1f0e9fc
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
4454d02
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
efeb9e9
Fix markdown lints
lquerel Feb 18, 2023
eade6b5
Merge branch 'main' into main
lquerel Apr 26, 2023
e6f7dc6
rename OTLP Arrow to OTel Arrow
lquerel May 30, 2023
1fde108
update proto file
lquerel May 30, 2023
8a2b713
update multivariate paragraph
lquerel May 30, 2023
686467c
update compression ratio summary
lquerel May 30, 2023
e08d43d
update grpc service/protocol section
lquerel May 30, 2023
110990b
update protobuf
lquerel May 30, 2023
b9e8452
Add ER diagrams for metrics, logs, and traces
lquerel May 30, 2023
6786b8c
Update `Logs Payload` section
lquerel May 30, 2023
df28655
Update `Spans Arrow Mapping` section
lquerel May 30, 2023
e73155d
Update `Metrics Arrow Mapping` section
lquerel May 30, 2023
84dac53
Update `Mapping OTel Entities to Arrow Records` section
lquerel May 30, 2023
1b78ba6
Update 3-columns chart benchmark.
lquerel May 31, 2023
8f82586
Update 3-columns stacked bar benchmark.
lquerel Jun 1, 2023
72fbeba
Update collector diagrams.
lquerel Jun 1, 2023
fb17f4b
Simplify benchmark section.
lquerel Jun 1, 2023
2aa2047
update benchmark section.
lquerel Jun 1, 2023
21bd7cb
fix markdown issues
lquerel Jun 1, 2023
621475f
fix markdown issues
lquerel Jun 1, 2023
4bed578
Merge branch 'main' into main
lquerel Jun 1, 2023
09749cc
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
549061c
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
64f4a1d
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
ea6fd96
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
ec77368
Update OTEP based on Tigran's comments.
lquerel Jun 12, 2023
6c2e9b5
Merge branch 'main' of https://github.com/lquerel/oteps
lquerel Jun 12, 2023
ce274de
Fix markdown lint issue
lquerel Jun 12, 2023
df06cea
Update OTEP based on Tigran's comments
lquerel Jun 12, 2023
f83a378
Change batch_id type from string to int64
lquerel Jun 20, 2023
9bdeab2
Rename sub_stream_id to schema_id (see https://github.com/open-teleme…
lquerel Jun 20, 2023
d5e8966
Create a new section Future Possibilities (see https://github.com/ope…
lquerel Jun 20, 2023
fae225a
Fix typo
lquerel Jun 20, 2023
1892d31
Minor change (request -> service)
lquerel Jun 20, 2023
5512264
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
560ca64
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
9f68a84
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
34fd4df
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
25bf657
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
7fbd269
Update based on @atoulme comments
lquerel Jun 26, 2023
0c3742d
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
0ffddbf
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
e5c64e9
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
0f1faf1
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
f26c725
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
ad06693
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
a924a99
Add delta-delta encoding link
lquerel Jun 26, 2023
5bd1d00
Merge branch 'main' into main
reyang Jun 29, 2023
Update Metrics Arrow Mapping section
lquerel committed May 30, 2023
commit e73155df136f2584882f3b8e0af398d608c2b22e
235 changes: 24 additions & 211 deletions text/0156-columnar-encoding.md
@@ -27,7 +27,7 @@ expose the new gRPC endpoint and to provide OTel Arrow support via the previous
* [Mapping OTel Entities to Arrow Records](#mapping-otel-entities-to-arrow-records)
* [Logs Arrow Mapping](#logs-arrow-mapping)
* [Spans Arrow Mapping](#spans-arrow-mapping)
* [Metrics Payload](#metrics-payload)
* [Metrics Arrow Mapping](#metrics-arrow-mapping)
* [Implementation Recommendations](#implementation-recommendations)
* [Protocol Extension and Fallback Mechanism](#protocol-extension-and-fallback-mechanism)
* [Batch ID Generation](#batch-id-generation)
Expand Down Expand Up @@ -601,228 +601,41 @@ Similarly, each of the Arrow records is sorted by specific columns to optimize t
The `end_time_unix_nano` is represented as a duration (`end_time_unix_nano` - `start_time_unix_nano`) to reduce the
number of bits required to represent the timestamp.
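
The duration trick can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the reference implementation; the sketch only shows that storing `end - start` needs far fewer significant bits than storing the absolute end timestamp.

```python
def encode_span_times(start_ns, end_ns):
    """Return (start_time_unix_nano, duration_ns) for one span (hypothetical helper)."""
    if end_ns < start_ns:
        raise ValueError("end_time_unix_nano precedes start_time_unix_nano")
    return start_ns, end_ns - start_ns


def decode_span_times(start_ns, duration_ns):
    """Recover the absolute (start, end) pair from the encoded form."""
    return start_ns, start_ns + duration_ns


# Example values chosen for illustration only: a span lasting 250 microseconds.
start = 1_685_404_800_000_000_000
end = start + 250_000

encoded = encode_span_times(start, end)
bits_absolute = end.bit_length()         # significant bits of the absolute end time
bits_duration = encoded[1].bit_length()  # significant bits of the duration
```

Small durations leave most high-order bytes at zero, which is what lets a general-purpose compressor shrink the column further.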

#### Metrics Payload
#### Metrics Arrow Mapping

The mapping for metrics, while being the most complex, fundamentally follows the same logic as applied to logs and
spans. The primary 'METRICS' entity encapsulates a flattened representation of `ResourceMetrics`, `ScopeMetrics`, and
`Metrics`. All common columns among the different metric types are consolidated in this main entity (i.e., `metric_type`,
`name`, `description`, `unit`, `aggregation_temporality`, and `is_monotonic`). In addition, a dedicated entity
represents the data points of each metric type, with columns specific to that type. For instance, the
`SUMMARY_DATA_POINTS` entity includes the columns `id`, `parent_id`, `start_time_unix_nano`,
`time_unix_nano`, `count`, `sum`, and `flags`. Each of these "data points" entities is linked to:
- A set of data point attributes (a one-to-many relationship).
- A set of data point exemplars (also a one-to-many relationship).

Exemplar entities, in turn, are connected to their dedicated set of attributes.
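
The one-to-many links described above can be sketched as a join between two in-memory tables. This is an illustration under assumptions: the rows are plain dictionaries rather than Arrow records, and the attribute columns `parent_id`, `key`, and `value` are assumed names for how an attribute row points back to the `id` of its data point.

```python
from collections import defaultdict

# Hypothetical rows standing in for the SUMMARY_DATA_POINTS record.
summary_data_points = [
    {"id": 0, "parent_id": 0, "count": 10, "sum": 12.5},
    {"id": 1, "parent_id": 0, "count": 4, "sum": 3.0},
]

# Hypothetical rows standing in for the related attributes record.
# Assumption: `parent_id` here refers to the `id` of a data point.
summary_dp_attrs = [
    {"parent_id": 0, "key": "host", "value": "a"},
    {"parent_id": 0, "key": "region", "value": "eu"},
    {"parent_id": 1, "key": "host", "value": "b"},
]


def attach_attributes(points, attrs):
    """Group attribute rows by parent_id and attach them to their data point."""
    by_parent = defaultdict(dict)
    for row in attrs:
        by_parent[row["parent_id"]][row["key"]] = row["value"]
    return [dict(p, attributes=dict(by_parent.get(p["id"], {}))) for p in points]


joined = attach_attributes(summary_data_points, summary_dp_attrs)
```

Keeping attributes in a separate record means each record has a flat, stable schema, at the cost of a join when the hierarchical form is rebuilt.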

Technically speaking, the `quantile` entity isn't encoded as an independent entity but rather as a list of structs
within the `SUMMARY_DATA_POINTS` entity.

![Metrics Arrow Schema](img/0156_metrics_schema.png)

We start by defining the Arrow Schema of the `exemplar` concept because it is used for several types of metrics.

```yaml
# Exemplar Arrow Schema (declaration used in other schemas)
exemplars: &exemplars # arrow type = list of struct
  - attributes: *attributes # YAML alias to the attributes schema defined previously
    time_unix_nano: timestamp # arrow type = timestamp (time unit nanoseconds)
    value: # arrow type = sparse union
      i64: int64
      f64: float64
    span_id: 8_bytes_binary_dictionary | 8_bytes_binary # arrow fixed size binary array
    trace_id: 16_bytes_binary_dictionary | 16_bytes_binary # arrow fixed size binary array
```
Gauge and Sum are identified by the `metric_type` column in the `METRICS` entity and they share the same Arrow record
for the data points, i.e. `NUMBER_DATA_POINTS`.

`span_id` and `trace_id` are represented as fixed size binary dictionaries by default but can evolve to non-dictionary
form when their cardinality exceeds a certain threshold (usually 2^16).
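
The dictionary-with-fallback behaviour can be sketched as follows. This is a simplified model under assumptions, not the reference implementation: `encode_column` and its return shape are hypothetical, and a real encoder would build Arrow arrays instead of Python lists.

```python
DICT_MAX_CARDINALITY = 2 ** 16  # threshold quoted above; treated here as a tunable default


def encode_column(values, max_cardinality=DICT_MAX_CARDINALITY):
    """Dictionary-encode `values`; fall back to the plain form on overflow."""
    index_of = {}
    indices = []
    for v in values:
        idx = index_of.get(v)
        if idx is None:
            if len(index_of) >= max_cardinality:
                # Too many distinct values: give up on the dictionary.
                return {"encoding": "plain", "values": list(values)}
            idx = len(index_of)
            index_of[v] = idx
        indices.append(idx)
    return {"encoding": "dictionary", "dictionary": list(index_of), "indices": indices}


# Low cardinality: the dictionary form is kept.
low = encode_column([b"\x01" * 8, b"\x02" * 8, b"\x01" * 8])

# Cardinality above a (deliberately tiny) threshold: plain form is used.
high = encode_column([i.to_bytes(8, "big") for i in range(5)], max_cardinality=4)
```

Trace and span identifiers are often close to unique within a batch, so the fallback path matters more for these columns than for names or units.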

As usual, each of these Arrow records is sorted by specific columns to optimize the compression ratio. With this mapping,
a batch of metrics containing a large number of data points that share the same attributes and timestamps will be highly
compressible (multivariate time-series scenario).
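
The effect of sorting on compressibility can be demonstrated with a rough sketch. It rests on several assumptions: `zlib` stands in for the compressor actually used, columns are serialized as newline-joined text rather than Arrow buffers, and the column names and data are invented for the example.

```python
import random
import zlib

COLUMNS = ("host", "time_unix_nano", "value")


def column_bytes(rows, column):
    """Serialize one column as newline-separated text (stand-in for an Arrow buffer)."""
    return "\n".join(str(r[column]) for r in rows).encode()


def compressed_size(rows):
    """Total compressed size when each column is compressed independently."""
    return sum(len(zlib.compress(column_bytes(rows, c))) for c in COLUMNS)


hosts = [f"host-{i}" for i in range(8)]
timestamps = [1_685_404_800_000_000_000 + i * 10_000_000_000 for i in range(125)]
rows = [{"host": h, "time_unix_nano": t, "value": 0.5} for h in hosts for t in timestamps]

# Simulate data points arriving in arbitrary order (seeded for reproducibility).
shuffled = rows[:]
random.Random(0).shuffle(shuffled)

# Sort by the columns that repeat the most, then by time.
sorted_rows = sorted(shuffled, key=lambda r: (r["host"], r["time_unix_nano"]))

size_shuffled = compressed_size(shuffled)
size_sorted = compressed_size(sorted_rows)
```

Sorting places identical attribute values next to each other, so each column turns into long runs that the compressor encodes very cheaply.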

> Note: all OTLP timestamps are represented as Arrow timestamps with a nanosecond time unit. This representation will
> simplify the integration with the rest of the Arrow ecosystem (numerous time/date functions are supported in
> DataFusion for example).

The Arrow Schema for the univariate metrics is the following:

```yaml
resource_metrics:
  - resource:
      attributes: *attributes
      dropped_attributes_count: uint32
    schema_url: string_dictionary | string
    scope_metrics:
      - scope:
          name: string_dictionary | string
          version: string_dictionary | string
          attributes: *attributes
          dropped_attributes_count: uint32
        schema_url: string_dictionary | string
        # This section represents the standard OTLP metrics as defined in OTel v1
        # specifications.
        #
        # Named univariate metrics as their representation allow to represent each
        # metric as independent measurement with their own specific timestamps and
        # attributes.
        #
        # Shared attributes and timestamps are optional and only used for optimization
        # purposes.
        univariate_metrics: # arrow type = list
          - name: string_dictionary | string # required, arrow type = struct
            description: string_dictionary | string
            unit: string_dictionary | string
            shared_attributes: *attributes # attributes inherited by data points if not defined locally
            shared_start_time_unix_nano: timestamp # start time inherited by data points if not defined locally
            shared_time_unix_nano: timestamp # required if not defined in data points
            data: # arrow type = sparse union
              gauge: # arrow type = struct
                data_points:
                  - attributes: *attributes
                    start_time_unix_nano: timestamp # arrow type = timestamp (time unit nanoseconds)
                    time_unix_nano: timestamp # required if not defined as a shared field in the metric
                    value: # arrow type = sparse union
                      i64: int64
                      f64: float64
                    exemplars: *exemplars
                    flags: uint32 # each flag defined in this enum is a bit-mask
              sum: # arrow type = struct
                data_points:
                  - attributes: *attributes
                    start_time_unix_nano: timestamp
                    time_unix_nano: timestamp # required
                    value: # arrow type = sparse union
                      i64: int64
                      f64: float64
                    exemplars: *exemplars
                    flags: uint32 # each flag defined in this enum is a bit-mask
                aggregation_temporality: uint8_dictionary # OTLP enum with 3 variants
                is_monotonic: bool
              summary: # arrow type = struct
                data_points:
                  - attributes: *attributes
                    start_time_unix_nano: timestamp
                    time_unix_nano: timestamp # required
                    count: uint64
                    sum: float64
                    quantile: # arrow type = list of struct
                      - quantile: float64
                        value: float64
                    flags: uint32 # each flag defined in this enum is a bit-mask
              histogram: # arrow type = struct
                data_points:
                  - attributes: *attributes
                    start_time_unix_nano: timestamp
                    time_unix_nano: timestamp
                    count: uint64
                    sum: float64
                    bucket_counts: []uint64
                    explicit_bounds: []float64
                    min: float64
                    max: float64
                    exemplars: *exemplars
                    flags: uint32 # each flag defined in this enum is a bit-mask
                aggregation_temporality: int32
              exp_histogram: # arrow type = struct
                data_points:
                  - attributes: *attributes
                    start_time_unix_nano: timestamp
                    time_unix_nano: timestamp
                    count: uint64
                    sum: float64
                    scale: int32
                    zero_count: uint64
                    positive:
                      offset: int32
                      bucket_counts: []uint64
                    negative:
                      offset: int32
                      bucket_counts: []uint64
                    min: float64
                    max: float64
                    exemplars: *exemplars
                    flags: uint32 # each flag defined in this enum is a bit-mask
                aggregation_temporality: uint8_dictionary # OTLP enum with 3 variants
```

`Gauge`, `Sum`, `Histogram`, `Exponential Histogram`, and `Summary` are represented as an Arrow sparse union of structs.
Additional variants can be added in the future.
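
For readers unfamiliar with the layout, a sparse union can be modelled in a few lines. This sketch uses plain Python lists in place of Arrow buffers, and the helper names are hypothetical. It shows the defining property of the sparse form: a type-id array selects the active variant per slot, and every child array has the same length as the union.

```python
VARIANTS = ("gauge", "sum", "summary", "histogram", "exp_histogram")


def to_sparse_union(items):
    """Build (type_ids, children) from a list of (variant_name, payload) pairs."""
    type_ids = []
    children = {name: [] for name in VARIANTS}
    for variant, payload in items:
        type_ids.append(VARIANTS.index(variant))
        for name in VARIANTS:
            # Every child gets a slot; only the active variant holds a value.
            children[name].append(payload if name == variant else None)
    return type_ids, children


def get(union, i):
    """Read slot `i`: look up the active variant, then index its child array."""
    type_ids, children = union
    name = VARIANTS[type_ids[i]]
    return name, children[name][i]


union = to_sparse_union([
    ("gauge", {"value": 1}),
    ("sum", {"value": 2, "is_monotonic": True}),
])
```

A dense union would instead keep each child compact and add an offsets array, trading the padding slots for an extra indirection.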

> Note: `aggregation_temporality` is represented as an Arrow dictionary with a dictionary index of type int8. This OTLP
> enum has current 3 variants, and we don't expect to have in the future more than 2^8 variants.

The Arrow Schema for the native multivariate metrics is the following:

```yaml
resource_metrics:
  - resource:
      attributes: *attributes
      dropped_attributes_count: uint32
    schema_url: string | string_dictionary
    scope_metrics:
      - scope:
          name: string | string_dictionary
          version: string | string_dictionary
          attributes: *attributes
          dropped_attributes_count: uint32
        schema_url: string | string_dictionary
        # Native support of multivariate metrics (not yet implemented)
        #
        # Multivariate metrics are related metrics sharing the same context, i.e. the same
        # attributes and timestamps.
        #
        # Each metrics is defined by a name, a set of data points, and optionally a description
        # and a unit.
        multivariate_metrics:
          attributes: *attributes # All multivariate metrics shared the same attributes
          start_time_unix_nano: timestamp # All multivariate metrics shared the same timestamps
          time_unix_nano: timestamp # required
          metrics: # arrow type = list of sparse union
            - gauge: # arrow type = struct
                name: string | string_dictionary # required
                description: string | string_dictionary
                unit: string | string_dictionary
                value: # arrow type = dense union
                  i64: int64
                  f64: float64
                exemplars: *exemplars
                flags: uint32
              sum: # arrow type = struct
                name: string | string_dictionary # required
                description: string | string_dictionary
                unit: string | string_dictionary
                value: # arrow type = dense union
                  i64: int64
                  f64: float64
                exemplars: *exemplars
                flags: uint32
                aggregation_temporality: uint8_dictionary # OTLP enum with 3 variants
                is_monotonic: bool
              summary: # arrow type = struct
                name: string | string_dictionary # required
                description: string | string_dictionary
                unit: string | string_dictionary
                count: uint64
                sum: float64
                quantile:
                  - quantile: float64
                    value: float64
                flags: uint32
              histogram: # arrow type = struct
                name: string | string_dictionary # required
                description: string | string_dictionary
                unit: string | string_dictionary
                count: uint64
                sum: float64
                bucket_counts: []uint64
                explicit_bounds: []float64
                exemplars: *exemplars
                flags: uint32
                min: float64
                max: float64
                aggregation_temporality: int32
              exp_histogram: # arrow type = struct
                name: string | string_dictionary # required
                description: string | string_dictionary
                unit: string | string_dictionary
                count: uint64
                sum: float64
                scale: int32
                zero_count: uint64
                positive:
                  offset: int32
                  bucket_counts: []uint64
                negative:
                  offset: int32
                  bucket_counts: []uint64
                exemplars: *exemplars
                flags: uint32
                min: float64
                max: float64
                aggregation_temporality: uint8_dictionary # OTLP enum with 3 variants
```
> enum has currently 3 variants, and we don't expect to have in the future more than 2^8 variants.

## Implementation Recommendations
