This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Columnar encoding for the OpenTelemetry protocol #171

Merged Jun 29, 2023 (142 commits)
Changes from 1 commit
Commits
95da4ae
Intro + Motivation sections
lquerel Mar 14, 2021
aec7656
Move 0000-multivariate-timeseries.md in metrics folder
lquerel Mar 14, 2021
a3fbdb7
Add conclusion to the motivation section
lquerel Mar 14, 2021
6b550f4
Update 0000-multivariate-timeseries.md
lquerel Mar 14, 2021
3496ca1
Updated the "Explanation" section
lquerel Mar 21, 2021
56b3357
Added a diagram to present the data model
lquerel Mar 21, 2021
d760154
Added diagram
lquerel Mar 21, 2021
589dc8b
Update 0000-multivariate-timeseries.md
lquerel Mar 21, 2021
a516962
Examples of multivariate time-series
lquerel Mar 21, 2021
ecf45d2
Update 0000-multivariate-timeseries.md
lquerel Mar 21, 2021
58e9a53
Merge branch 'open-telemetry:main' into main
lquerel May 18, 2021
c809b1b
Update OTEP
lquerel May 18, 2021
150d987
Update OTEP
lquerel May 18, 2021
43340ea
Update OTEP
lquerel May 18, 2021
1448c91
Update OTEP
lquerel May 18, 2021
0db08ec
Update OTEP
lquerel May 18, 2021
f3739cf
Update OTEP
lquerel May 18, 2021
2218179
Merge branch 'open-telemetry:main' into main
lquerel Aug 10, 2021
ec813fe
Create OTEP 0156
lquerel Aug 10, 2021
8c34a91
Add columnar encoding benefits
lquerel Aug 10, 2021
e3c38db
Complete explanation section
lquerel Aug 10, 2021
68b4ed6
Create event.proto section
lquerel Aug 10, 2021
e25b3c6
Create internal details
lquerel Aug 10, 2021
07714b1
Update 0156-columnar-encoding.md
lquerel Aug 10, 2021
e82db1a
Add images for OTEP-0156
lquerel Aug 10, 2021
d9a621a
Add corner cases
lquerel Aug 10, 2021
8c7333e
Update first draft for review
lquerel Aug 10, 2021
91f9561
Remove initial proposal focusing only on the multivariate time-series…
lquerel Aug 10, 2021
c58a4ad
Merge remote-tracking branch 'origin/main'
lquerel Aug 10, 2021
52a85f7
Change Open Telemetry to OpenTelemetry
lquerel Aug 11, 2021
0b3f02d
Rephrases few sentences in the motivation section
lquerel Aug 11, 2021
e8c6ef1
Removed trailing-spaces to comply with markdown linter
lquerel Aug 11, 2021
dc74dd3
Fixed typo
lquerel Aug 11, 2021
fffce33
Fixed more markdown issues
lquerel Aug 11, 2021
72768c4
Add mapping OTEL metrics, logs, traces to Apache Arrow Schema
lquerel Aug 11, 2021
8695591
More explanation on the Arrow mapping and the memory layout.
lquerel Aug 11, 2021
9139969
Fixed markdown issues.
lquerel Aug 11, 2021
6499c63
Micro update to trigger the CLA checker again.
lquerel Aug 12, 2021
aa9c513
Update motivation section based on feedback from @jmacd
lquerel Aug 12, 2021
f0c6b90
Add some additional clarifications to @tigrannajaryan's feedback
lquerel Aug 13, 2021
fce05af
Span id and trace id are now nullable fields (as suggested by @jsuere…
lquerel Sep 18, 2021
33ea57d
Replaced label by attribute (as suggested by @jmacd and @jsuereth)
lquerel Sep 18, 2021
3d636d5
Fix markdown issues
lquerel Sep 21, 2021
19653e1
Add field declaration syntax based on @jmacd comment (https://github.…
lquerel Sep 21, 2021
81eb951
Merge branch 'main' into main
carlosalberto Nov 8, 2021
0590cb8
Update OTEP 0156 - Full protocol/mapping spec + Benchmark and more.
lquerel Jun 10, 2022
8449b67
Merge branch 'open-telemetry:main' into main
lquerel Jun 10, 2022
0be8ae3
Merge branch 'main' of https://github.com/lquerel/oteps
lquerel Jun 10, 2022
ab93d23
Fix markdown lint issue
lquerel Jun 11, 2022
f48c952
Fix markdown lint issue
lquerel Jun 11, 2022
0e6b9ff
Fix markdown lint issue
lquerel Jun 11, 2022
577b4bd
Merge branch 'open-telemetry:main' into main
lquerel Dec 27, 2022
5b62001
Rename EventStream into ArrowStream
lquerel Dec 27, 2022
ec0a623
Merge remote-tracking branch 'origin/main'
lquerel Dec 27, 2022
c0d134d
Update protobuf specification and corresponding documentation.
lquerel Dec 27, 2022
4f311f8
Fix markdown lint issues
lquerel Dec 27, 2022
4baab68
Update attribute representation section.
lquerel Dec 27, 2022
6b3d916
Update Metrics Payload section.
lquerel Dec 28, 2022
3cc3458
Update Logs and Traces Payload sections.
lquerel Dec 28, 2022
f24cb9c
Update Implementation Recommendations, Trade-offs and Mitigations, Pr…
lquerel Dec 28, 2022
efa0aad
Add links to the reference implementation.
lquerel Dec 28, 2022
b8cbf55
Update links to the reference implementation.
lquerel Dec 28, 2022
b3b00c0
Remove appendix D.
lquerel Dec 28, 2022
b4a95df
Add comment on data sharing.
lquerel Dec 28, 2022
8271a65
Fix markdown issues.
lquerel Dec 28, 2022
50557bc
Fix markdown issues.
lquerel Dec 28, 2022
e0dae99
Update Arrow Schemas based on Matt Topol feedback (Go Arrow committer).
lquerel Jan 6, 2023
321bbc5
Arrow Sparse vs Dense union.
lquerel Jan 7, 2023
50d4cd3
Use dictionary to represent small protobuf enum.
lquerel Jan 7, 2023
5f1e2d8
Fix navigation issue
lquerel Jan 7, 2023
9e38c47
Add a justification behind the compression field in the protobuf mess…
lquerel Jan 12, 2023
44b2743
Fix markdown-lint issue
lquerel Jan 12, 2023
198f114
Add link to ZSTD dictionary optimization.
lquerel Jan 12, 2023
778cf0a
Fix markdown-lint issue again...
lquerel Jan 12, 2023
e520b03
Add paragraph "unary RPC vs stream RPC"
lquerel Jan 12, 2023
3952c52
Improve Zero-copy argumentation
lquerel Jan 12, 2023
21cd665
Improve the argument around `otlp_arrow_payloads` which is a repeated…
lquerel Jan 13, 2023
ef2c09f
Improve RecordBatch description.
lquerel Jan 13, 2023
ed78a51
Improve paragraph "Dense vs Sparse union"
lquerel Jan 13, 2023
1524d84
Apply Joshua MacDonald's updates (thanks @jmacd)
lquerel Jan 13, 2023
f0dc753
Fix markdown lint issues.
lquerel Jan 13, 2023
8e65c10
Remove delivery_type and dictionaries attributes.
lquerel Jan 14, 2023
3f73d39
Specify compression algo in Validation section
lquerel Jan 17, 2023
c78c78e
Remove compression field from OtlpArrowPayload
lquerel Jan 18, 2023
20b4e30
Fix broken link (img)
lquerel Jan 18, 2023
c5c747c
Explain gains bandwidth+speed
lquerel Jan 21, 2023
87b5352
Fix markdown issue
lquerel Jan 21, 2023
d648aba
Fix markdown issue
lquerel Jan 21, 2023
942c98e
Fix markdown issue
lquerel Jan 21, 2023
d5e6a89
Fix terminology
lquerel Jan 23, 2023
4a10df3
Update charts with last ref. impl.
lquerel Feb 18, 2023
f0dd638
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
1f0e9fc
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
4454d02
Fix charts with last ref. impl. + update description
lquerel Feb 18, 2023
efeb9e9
Fix markdown lints
lquerel Feb 18, 2023
eade6b5
Merge branch 'main' into main
lquerel Apr 26, 2023
e6f7dc6
rename OTLP Arrow to OTel Arrow
lquerel May 30, 2023
1fde108
update proto file
lquerel May 30, 2023
8a2b713
update multivariate paragraph
lquerel May 30, 2023
686467c
update compression ratio summary
lquerel May 30, 2023
e08d43d
update grpc service/protocol section
lquerel May 30, 2023
110990b
update protobuf
lquerel May 30, 2023
b9e8452
Add ER diagrams for metrics, logs, and traces
lquerel May 30, 2023
6786b8c
Update `Logs Payload` section
lquerel May 30, 2023
df28655
Update `Spans Arrow Mapping` section
lquerel May 30, 2023
e73155d
Update `Metrics Arrow Mapping` section
lquerel May 30, 2023
84dac53
Update `Mapping OTel Entities to Arrow Records` section
lquerel May 30, 2023
1b78ba6
Update 3-columns chart benchmark.
lquerel May 31, 2023
8f82586
Update 3-columns stacked bar benchmark.
lquerel Jun 1, 2023
72fbeba
Update collector diagrams.
lquerel Jun 1, 2023
fb17f4b
Simplify benchmark section.
lquerel Jun 1, 2023
2aa2047
update benchmark section.
lquerel Jun 1, 2023
21bd7cb
fix markdown issues
lquerel Jun 1, 2023
621475f
fix markdown issues
lquerel Jun 1, 2023
4bed578
Merge branch 'main' into main
lquerel Jun 1, 2023
09749cc
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
549061c
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
64f4a1d
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
ea6fd96
Update text/0156-columnar-encoding.md
lquerel Jun 12, 2023
ec77368
Update OTEP based on Tigran's comments.
lquerel Jun 12, 2023
6c2e9b5
Merge branch 'main' of https://github.com/lquerel/oteps
lquerel Jun 12, 2023
ce274de
Fix markdown lint issue
lquerel Jun 12, 2023
df06cea
Update OTEP based on Tigran's comments
lquerel Jun 12, 2023
f83a378
Change batch_id type from string to int64
lquerel Jun 20, 2023
9bdeab2
Rename sub_stream_id to schema_id (see https://github.com/open-teleme…
lquerel Jun 20, 2023
d5e8966
Create a new section Future Possibilities (see https://github.com/ope…
lquerel Jun 20, 2023
fae225a
Fix typo
lquerel Jun 20, 2023
1892d31
Minor change (request -> service)
lquerel Jun 20, 2023
5512264
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
560ca64
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
9f68a84
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
34fd4df
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
25bf657
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
7fbd269
Update based on @atoulme comments
lquerel Jun 26, 2023
0c3742d
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
0ffddbf
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
e5c64e9
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
0f1faf1
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
f26c725
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
ad06693
Update text/0156-columnar-encoding.md
lquerel Jun 26, 2023
a924a99
Add delta-delta encoding link
lquerel Jun 26, 2023
5bd1d00
Merge branch 'main' into main
reyang Jun 29, 2023
Update Logs and Traces Payload sections.
lquerel committed Dec 28, 2022
commit 3cc34584b57ff602df4b5b3c7942aaa461e515cd
141 changes: 76 additions & 65 deletions text/0156-columnar-encoding.md
@@ -765,74 +765,85 @@ resource_metrics:

Although simpler, a logs 'OtlpArrowPayload' takes a similar approach.

| Column name | Column type | Required | Description |
|------------------------------|-------------|----------|-----------------------------------------------------------------------------------------------------------------|
| `resource` | `struct` | No | Structure regrouping the fields of the resource |
| __`attributes` | `struct` | No | Structure regrouping the fields of the resource attributes |
| ____`[key]` | `dynamic` | No | A set of attributes, see [attribute section](#labelattribute-representation) for more details |
| __`dropped_attributes_count` | `uint32` | No | The number of resource dropped attributes. |
| __`schema_url` | `string` | No | Schema url. |
| `instrumentation_library` | `struct` | No | Structure regrouping the fields of the instrumentation library |
| __`name` | `string` | No | Name of the instrumentation library |
| __`version` | `string` | No | Version of the instrumentation library |
| `time_unix_nano` | `uint64` | Yes | The time when the event occurred. |
| `severity_number` | `uint8` | Yes | The severity number. |
| `severity_text` | `string` | No | The severity test. |
| `name` | `string` | No | Short event identifier that does not contain varying parts. |
| `body` | `dynamic` | No | The body of the log record (see below for more details). |
| `attributes` | `struct` | No | Structure regrouping the fields of the attributes |
| __`[key]` | `dynamic` | No | A set of attributes, see [attribute section](#labelattribute-representation) for more details |
| `dropped_attributes_count` | `uint64` | No | The number of dropped attributes. |
| `flags` | `uint32` | No | Flags, a bit field. 8 least significant bits are the trace flags as defined in W3C Trace Context specification. |
| `trace_id` | `binary` | No | Identifier of the trace. |
| `span_id` | `binary` | No | Identifier of the span. |

The type of the column `body` depends on the OTLP type and follows the same transformation rules used in the [attributes](#labelattribute-representation).
```yaml
resource_logs:
  - resource:
      attributes: *attributes
      dropped_attributes_count: uint32
    schema_url: string | string_dictionary
    scope_logs:
      - scope:
          name: string | string_dictionary
          version: string | string_dictionary
          attributes: *attributes
          dropped_attributes_count: uint32
        schema_url: string | string_dictionary
        logs:
          - time_unix_nano: uint64
            observed_time_unix_nano: uint64
            trace_id: 16_bytes_binary | 16_bytes_binary_dictionary
            span_id: 8_bytes_binary | 8_bytes_binary_dictionary
            severity_number: int32
            severity_text: string | string_dictionary
            body: # arrow type: sparse union
              str: string | string_dictionary
              i64: int64
              f64: float64
              bool: bool
              binary: binary | binary_dictionary
              cbor: binary_dictionary | binary # cbor encoded complex body value
            attributes: *attributes
            dropped_attributes_count: uint32
            flags: uint32
```

The type of the column `body` depends on the OTLP type and follows the same transformation rules used in the [attributes](#attribute-representation).
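
As a rough illustration of how a producer might pick the `body` union variant, here is a minimal Python sketch. The function name `body_variant` and the choice of Python types are assumptions made for this example; they are not part of the OTEP or of the reference implementation. Only the variant names (`str`, `i64`, `f64`, `bool`, `binary`, `cbor`) come from the schema above, and the sketch does not perform any CBOR encoding itself.

```python
from typing import Any


def body_variant(value: Any) -> str:
    """Return the name of the sparse-union variant for a log body value.

    Hypothetical helper: it only selects the variant name and does not
    build any Arrow array or CBOR payload.
    """
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, str):
        return "str"
    if isinstance(value, int):
        return "i64"
    if isinstance(value, float):
        return "f64"
    if isinstance(value, (bytes, bytearray)):
        return "binary"
    if isinstance(value, (list, dict)):
        # Complex values (arrays, key/value maps) travel CBOR-encoded.
        return "cbor"
    raise TypeError(f"unsupported body type: {type(value).__name__}")
```

For instance, `body_variant("disk full")` selects `str`, while a nested value such as `{"code": 42}` selects `cbor`. How an empty body should be represented is left open here; this sketch simply rejects it.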

#### Spans Payload

The set of possible columns for a span payload is summarized in the following table.

| Column name | Column type | Required | Description |
|------------------------------|------------------|----------|-----------------------------------------------------------------------------------------------|
| `resource` | `struct` | No | Structure regrouping the fields of the resource |
| __`attributes` | `struct` | No | Structure regrouping the fields of the resource attributes |
| ____`[key]` | `dynamic` | No | A set of attributes, see [attribute section](#labelattribute-representation) for more details |
| __`dropped_attributes_count` | `uint32` | No | The number of resource dropped attributes. |
| __`schema_url` | `string` | No | Schema url. |
| `instrumentation_library` | `struct` | No | Structure regrouping the fields of the instrumentation library |
| __`name` | `string` | No | Name of the instrumentation library |
| __`version` | `string` | No | Version of the instrumentation library |
| `trace_id` | `binary` | Yes | Identifier of the trace. |
| `span_id` | `binary` | Yes | Identifier of the span. |
| `trace_state` | `string` | No | trace state. |
| `parent_span_id` | `string` | No | Parent span id. |
| `name` | `string` | No | A description of the span's operation. |
| `kind` | `uint8` | Yes | Distinguishes between spans generated in a particular context. |
| `start_time_unix_nano` | `uint64` | Yes | The start time of the span. |
| `end_time_unix_nano` | `uint64` | Yes | The end time of the span. |
| `attributes` | `struct` | No | Structure regrouping the fields of the attributes |
| __`[key]` | `dynamic` | No | A set of attributes, see [attribute section](#labelattribute-representation) for more details |
| `dropped_attributes_count` | `uint64` | No | The number of dropped attributes. |
| `events` | `list of struct` | No | List of events |
| __`time_unix_nano` | `uint64` | Yes | The time the event occurred |
| __`name` | `string` | Yes | The name of the event. |
| __`attributes` | `struct` | No | Structure regrouping the fields of the event attributes |
| ____`[key]` | `dynamic` | No | A set of attributes, see [label section](#labelattribute-representation) for more details. |
| __`dropped_attributes_count` | `uint64` | No | The number of dropper attributes. |
| `dropped_events_count` | `uint64` | No | The number of dropped events. |
| `links` | `list of struct` | No | List of links |
| __`trace_id` | `binary` | Yes | A unique identifier of a trace that this linked span is part of. |
| __`span_id` | `binary` | Yes | A unique identifier for the linked span. |
| __`trace_state` | `string` | Yes | The trace_state associated with the link. |
| __`attributes` | `struct` | No | Structure regrouping the fields of the event attributes |
| ____`[key]` | `dynamic` | No | A set of attributes, see [label section](#labelattribute-representation) for more details. |
| __`dropped_attributes_count` | `uint64` | No | The number of dropped attributes. |
| `dropped_links_count` | `uint64` | No | The number of dropped links. |
| `status` | `struct` | No | The status of the span. |
| __`deprecated_code` | `uint8` | No | The deprecated status code. |
| __`message` | `string` | No | The status message. |
| __`code` | `uint8` | No | The status code. |
The set of possible columns for a span payload is summarized in the following yaml description.

```yaml
resource_spans:
  - resource:
      attributes: *attributes
      dropped_attributes_count: uint32
    schema_url: string | string_dictionary
    scope_spans:
      - scope:
          name: string | string_dictionary
          version: string | string_dictionary
          attributes: *attributes
          dropped_attributes_count: uint32
        schema_url: string | string_dictionary
        spans:
          - start_time_unix_nano: uint64 # required
            end_time_unix_nano: uint64 # required
            trace_id: 16_bytes_binary | 16_bytes_binary_dictionary # required
            span_id: 8_bytes_binary | 8_bytes_binary_dictionary # required
            trace_state: string | string_dictionary
            parent_span_id: 8_bytes_binary | 8_bytes_binary_dictionary
            name: string | string_dictionary # required
            kind: int32
            attributes: *attributes
            dropped_attributes_count: uint32
            events:
              - time_unix_nano: uint64
                name: string | string_dictionary
                attributes: *attributes
                dropped_attributes_count: uint32
            dropped_events_count: uint32
            links:
              - trace_id: 16_bytes_binary | 16_bytes_binary_dictionary
                span_id: 8_bytes_binary | 8_bytes_binary_dictionary
                trace_state: string | string_dictionary
                attributes: *attributes
                dropped_attributes_count: uint32
            dropped_links_count: uint32
            status:
              code: int32
              status_message: string | string_dictionary
```
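
The `# required` markers and the fixed-size identifier types in the span description lend themselves to a simple pre-encoding check. The sketch below is only an illustration under stated assumptions: the helper `check_span`, the dictionary-based span representation, and the wording of the messages are all hypothetical and do not come from the OTEP or its reference implementation. The required field names and the 16-byte and 8-byte identifier sizes are taken from the YAML description above.

```python
from typing import Any, Dict, List

# Fields marked "# required" in the span description.
REQUIRED_SPAN_FIELDS = (
    "start_time_unix_nano",
    "end_time_unix_nano",
    "trace_id",
    "span_id",
    "name",
)

# Fixed sizes implied by the 16_bytes_binary and 8_bytes_binary types.
ID_LENGTHS = {"trace_id": 16, "span_id": 8, "parent_span_id": 8}


def check_span(span: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a span (empty if none).

    Hypothetical helper: it checks presence and identifier sizes only.
    """
    problems = []
    for field in REQUIRED_SPAN_FIELDS:
        if span.get(field) is None:
            problems.append(f"missing required field: {field}")
    for field, length in ID_LENGTHS.items():
        value = span.get(field)
        if value is not None and len(value) != length:
            problems.append(f"{field} must be {length} bytes, got {len(value)}")
    return problems
```

A span carrying all five required fields with a 16-byte `trace_id` and an 8-byte `span_id` yields an empty list; optional fields such as `parent_span_id` are only size-checked when present.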

## Implementation Recommendations
