Generated pdata code should create interfaces #1346

jmacd · 2020-07-14T16:14:32Z

Is your feature request related to a problem? Please describe.

I've been working to answer a performance question about OTLP in the collector, specifically w.r.t. metrics. I have run Tigran's benchmarks and confirmed the expected performance of the change in open-telemetry/opentelemetry-proto#162. It's behaving as expected: slightly higher overhead and slightly higher memory usage, which leaves us with a question: should we let performance dictate the structure of the protocol, or should we let semantics and convenience guide us, and is this really an exclusive choice? A the root of this, protocol buffers suck in Go and we can't really fix that on our own. More in this 🧵

Describe the solution you'd like

I want to propose a long-term solution to this that lets us move quickly in the short term. We can address the performance question later. Here are the details:

The changes in proto#162 degrade performance slightly, as expected, but we could regain this performance (for the most part) by hand-coding a decoder for use in the collector. In some ways the code is already set up for this: we have the consumer/pdata package which contains a wrapper around the protocol structure. We could (in theory, almost) introduce a second representation of OTLP data to facilitate this.

Describe alternatives you've considered

One problem is the generated code created by cmd/pdatagen generates structs, not interfaces. I believe we could change the generated code to use interfaces w/o a loss of performance. Instead of structs containing pointers, we'll use pointers to structs that satisfy an interface. Then we could introduce a second representation that did not use the proto message as the internal representation in a pipeline.

This is a nice story, until some module of code wants to modify or directly access a proto message. The more code that's written to directly access a proto, the worse this situation gets. There is a little bit of code (the batch processor, mainly) that combines proto messages in a pipeline, but it's not doing so recursively, so it's a special case.

Here's what I want: I want to move forward with the protocol change in #162 as I can show that it's not a large change in performance. As secondary goals: (1) change the generated pdata code to use interfaces, to allow multiple representations of OTLP in the collector, (2) potentially add a hand-coded implementation of OTLP that saves CPU and memory, (3) potentially refine the OTLP design to perform better by design: this would include interning of label sets to avoid repetition and to facilitate easier processing in the collector.

Lastly I want to note that before #162 the Metric struct is larger (because it has four slices) and after #162 the DataPoint struct is larger (because it is a logical oneof). Before #162 there are three unused slices (72 bytes wasted), and after #162 there are about the same number of wasted bytes (depending on whether exemplars are separated from raw_values). So when we have a single DataPoint per Metric the change in #162 is a break-even from a memory consumption point of view.

bogdandrutu · 2020-07-14T17:33:49Z

If we are getting to the discussion of prioritizing semantics, probably we should use oneof where we need, rather than having logical definitions for that.

bogdandrutu · 2020-07-14T17:57:43Z

@jmacd also we will deal with interface or concrete types internally in the collector once the protocol is stable. I don't think the way how we generate the internal format should be influencing the protocol.

tigrannajaryan · 2020-10-02T20:14:56Z

I believe we are settled on pdata for 1.0. I am going to close this, please reopen if there are any objections.

…ry#1346) * test: Add CI scenarios for eBPF Chart. open-telemetry#964 Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * test: Add CI scenarios for eBPF Chart. [issue-964] Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * test: Add CI scenarios for eBPF Chart. [issue-964] Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * test: Add CI scenarios for eBPF Chart. [issue-964] Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * test: Add CI scenarios for eBPF Chart. [issue-964] Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * test: Add CI scenarios for eBPF Chart. [issue-964] Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * fix: EBPF_NET_CRASH_METRIC_PORT env in ebpf chart not being quoted issue. Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * feat: grant ebpf k8s collector job watching permission. Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * feat: grant ebpf k8s collector job watching permission. Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> * feat: grant ebpf k8s collector job watching permission. Signed-off-by: Yang, Robin <Robin.Yang@fmr.com> --------- Signed-off-by: Yang, Robin <Robin.Yang@fmr.com>

jmacd added the feature request label Jul 14, 2020

jmacd mentioned this issue Jul 14, 2020

OTLP Metrics update open-telemetry/opentelemetry-proto#162

Closed

jrcamp added this to the GA 1.0 milestone Jul 30, 2020

tigrannajaryan closed this as completed Oct 2, 2020

andrewhsu added the enhancement New feature or request label Jan 6, 2021

MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this issue Nov 11, 2021

Update the package docs for the new API layout (open-telemetry#1346)

7022c12

Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this issue Jul 5, 2024

Fix output time in metrics OStream exporter (open-telemetry#1346)

3a4a3a3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generated pdata code should create interfaces #1346

Generated pdata code should create interfaces #1346

jmacd commented Jul 14, 2020

bogdandrutu commented Jul 14, 2020

bogdandrutu commented Jul 14, 2020

tigrannajaryan commented Oct 2, 2020

Generated pdata code should create interfaces #1346

Generated pdata code should create interfaces #1346

Comments

jmacd commented Jul 14, 2020

bogdandrutu commented Jul 14, 2020

bogdandrutu commented Jul 14, 2020

tigrannajaryan commented Oct 2, 2020