RFC: Structural definition of trace events & distinct structures for all data points #20

cstockton · 2019-05-22T15:12:22Z

First let me say I'm happy to see the combined efforts here, I'm a strong believer in the value added by this projects goals. The goal of this post is to get some clarification on some of the projects goals and design decisions. I have a lot of opinions on this area but for sake of productive discussion I want to focus on two key areas:

Defining a formal structural specification for trace events.
Declaring distinct events for things that require nesting child objects or holding buffers for any period of time.

The initial rough draft that I used to start the conversation in gitter can be found in this gist, it covers the two points above in more details. I'll be happy to fill this issue out with a more intelligible digest of the gist by request or once I have a better understanding of the projects direction.

Thanks a lot for the efforts being started here.

SergeyKanzhelev · 2019-05-22T17:16:14Z

After reading I understand that the biggest concern is the fact that spans are represented as a single object, not two or more start/end/associated-events records and require some in-memory buffer.

As OpenTracing allowed the event-based tracer to be implemented, the default implementation and data schema for OpenTelemetry assumes that a span is a single object. There are many reasons for this. To name the few are easier analytics and experiences as well as simpler extensibility and enriching of an object (as opposed to adding attributes as separate associated events). As for runtime concerns - in many cases trace identifiers, timer, and attributes like name are held in memory already for the purposes of attributing "children" and consistent timing of child spans. Span is not much bigger than this.

I'm curious - do you want to make sure that OpenTelemetry will have enough extensibility to implement an events-based system or you propose to change the default OpenTelemetry model?

cstockton · 2019-05-22T20:36:34Z

Hello,

As OpenTracing allowed the event-based tracer to be implemented, the default implementation and data schema for OpenTelemetry assumes that a span is a single object. There are many reasons for this. To name the few are easier analytics and experiences as well as simpler extensibility and enriching of an object (as opposed to adding attributes as separate associated events). As for runtime concerns - in many cases trace identifiers, timer, and attributes like name are held in memory already for the purposes of attributing "children" and consistent timing of child spans. Span is not much bigger than this.

I understand the need for ephemeral state for span book keeping and have no argument against that. I do have an argument against storing every single Log() event that occurs on the span itself. E.g. opentracing-go Log("", ...) because now we have fixed upper bounds on spans as I've explained inside the gist. But this is only half of it- it also means that I don't have observability of the span or it's log records until it's already done. Which completely negates part of the usefulness of instrumentation in my opinion. I don't get to see a hung process / worker until it's already exited. It also prevents you from using spans to record long-lived worker processes / core services, you have to be careful to log nothing in a loop that doesn't have a minimal upper bound, or you could potentially get "Span contains too many log records" and lose all your trace data. This has been a real problem, and it's very difficult to track down disappearing spans when you rely on them for visibility into your systems.

These reasons and many more are why I feel spans are deserving of N (span_start, span_stop, span_oneshot, insert other types here) distinct events. Knowing which spans are open at any given time is such a useful property that enables an entirely different classification of tooling. I do admit it also introduces a new set of potential problems (what happens when a span never closes?) - But this can be solved many different ways, food for thought:

infer from parent relationships
service specific timeouts
span specific timeouts
insert something clever discovered by someone much smarter than me

Maybe for the sake of keeping the conversation focused though we can argue the merits of "log records" being moved from the span into separate distinct events.

I'm curious - do you want to make sure that OpenTelemetry will have enough extensibility to implement an events-based system or you propose to change the default OpenTelemetry model?

I'm not sure what the "OpenTelemetry Model" is or where it relates in the project. Where may I find a high level architectural overview of that divides the standardization and software roles so I can see where specifications meet software? From what I've gathered both projects have mostly been software based, the software defines the data contracts and it's up to vendors to implement drivers for the SDK's. I feel like this is akin to HTTP not existing, instead each website has to implement a Firefox or Chrome driver to make their content accessible. I would rather tracing define HTTP, allowing choice of client and vendors choice of server. Maybe this is already the goal, I just haven't seen the right documentation.

That said- if the model you're referring to is this here: https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/proto/trace/v1/trace.proto#L200 - and this model will be accepted by tracing vendors ingest then I do in fact very strongly object to the current data model. I do not think it will age well while also being too limited as long as the following statement is true for spans:

A Span must be finished before it may be published to the tracing vendors ingest

If the above is a true statement, I would like to make a proposal against it. It means that any future features or extensions will not be observable until after the span is finished, which limits the use cases. I don't think distributed tracing spans should be invisible during the only time they are actively impacting the end user.

SergeyKanzhelev · 2019-05-22T22:21:04Z

I think this will clarify things - we see logs as a separate type defined in the future. Events on the span are time indications on the span, not a general purpose log. And events have quite an aggressive limit.

As for not-finished spans - those will be shown on z-Pages for instance. And vendors may send events about unfinished spans or keep them in separate storage. I think to make this proposal a reality and built-in feature of OpenTelemetry we need enough backends (ideally OSS) where we can use this feature.

yurishkuro · 2019-05-23T01:07:24Z

@cstockton Did you read somewhere that "A Span must be finished before it may be published to the tracing vendors ingest"? If this was explicitly listed as a requirement in the OpenTracing specs or elsewhere, then it's a mistake. Span model by itself does not impose this as a requirement. For example, even though we never implemented / finished it, we had some work / discussions about capturing "incomplete spans" in Jaeger, which sounds exactly like the issue you are concerned with. And even without that formal implementation, Jaeger (and Zipkin) internally allow multiple instances of a span with the same span ID to exist in storage and merge all their data points into one span on retrieval. We used that functionality in the past at Uber by writing additional spans to storage that contained log events captured from haproxy logs (haproxy itself did not support tracing).

cstockton · 2019-05-23T16:03:51Z

@SergeyKanzhelev If I'm understanding you correctly you're saying the type called TimedEvent which contains an Event at https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/proto/trace/v1/trace.proto#L176 is not for log events? Maybe that is not your intention, but having a repeated object, that has an arbitrary key/value mapping will certainly end up being used for log events. It will allow a paradigm identical to the one used in OpenTracing- log an event name and then give it key / value pairs, using "message" key for a message.

That said it is still unclear to me what the data model is, the only thing I've seen referred to is the Java proto buf spec. Are their any design documents for this system? I am really looking for the boundaries between software and specifications, as my primary concern is this project is using software as the specification for the structural representation of spans. These are huge decisions that will affect tracing for years, I think they should go through serious engineering rigor and public debate before a line of code is written. Maybe what we have now is what we end up with, but due diligence here is important.

@yurishkuro I've seen no documentation that lays these things out in plain language (part of the reason I opened this issue). I'm trying to infer behavior through the conversations I've had thus far, the implementations of OpenTracing and most importantly the java proto bufs. As it stands now the protobufs contain both a "start" and "end" time https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/proto/trace/v1/trace.proto#L146 - which implies it is held from being sent until it's completed. If that is false- and it is sent with a partial update, or a precursor event is sent before hand then it's not clearly spelled out anywhere.

Clearly spelling out the things I think is extremely important. There is a lot of work to be done at a 40K view before writing proto buffers and SDK's. I do not want to use the same model as opentracing- where tracing vendors have to import an API to ship trace data. That model does not scale for reasons I've covered, but would be happy to reiterate as well as reevaluate my position if some arguments for this model were presented?

I've yet to see any documentation for the current design decisions, or public record of the rigorous technical debate that cemented these design decisions (the Open portion of Open-Telemetry) - I understand they probably have, but need some help finding that information.

Thanks a lot for the quick responses, I want to make it clear I'm really excited about & appreciate this projects efforts.

SergeyKanzhelev · 2019-05-23T18:20:51Z

Maybe that is not your intention, but having a repeated object, that has an arbitrary key/value mapping will certainly end up being used for log events.

I think aggressive limit of a default implementation and also the fact that logging adaptors will do a different thing will make this use less common. I'm not totally oppose to the idea of detaching Events from spans. In fact this is what we will probably do for Azure Monitor. But I see how it may be important to associate events directly to the span to make a better root cause analysis experience. I'm sure @yurishkuro have an opinion here as for how useful those events for Jaeger.

I am really looking for the boundaries between software and specifications

Absolutely. We are bootstrapping the project at this moment. Any help is appreciated, btw. @c24t copied specs from OT and OC to work_in_progress folder and the idea is to move specs piece by piece out of that folder.

and it is sent with a partial update, or a precursor event is sent before hand then it's not clearly spelled out anywhere.

I think the default implementation and Agent/Collector expectation would be to wait for completed spans. To @yurishkuro's point - even though Zipkin and Jaeger support partial spans, there is no OSS backend that I know that has partial spans as a main source of data and a primary scenario. Lack of backend will inform our decisions on the default data model.

austinlparker · 2019-05-23T18:46:24Z

These reasons and many more are why I feel spans are deserving of N (span_start, span_stop, span_oneshot, insert other types here) distinct events.

I've actually prototyped something like this in the LightStep Tracer as 'meta events'. We create a span for each discrete event and add the span/traceid to it as a tag, along with a meta_event: true tag. The backend could then pick these out and treat them differently.

cstockton · 2019-05-23T21:40:26Z

I'm not totally oppose to the idea of detaching Events from spans

Small note on that just to be clear, I want to keep the relationship of Events to spans in every aspect of the system except for carrying the data through ingest. My goal is to advocate the benefits of breaking down the current data model into a stream of smaller observations. This allows data to be pushed to ingest the moment it has occurred placing only ingest latency between you and insights into your systems.

It also is naturally more extendable, because you may add new types of observations which may simply be ignored when support isn't available for a given target vendor. It paves the way for flexibility while solving the many engineering issues that come with storing so much information in a single structure. It provides more opportunities to identify failure and removes the need for software based bounds and counters indicating field omissions for some data points (I understand the need for field maps to have this limit, for example). I have touched on the many advantages of this already scatted about on gitter, gists and here so will spare everyone from sounding like a broken record.

I think the default implementation and Agent/Collector expectation would be to wait for completed spans. To @yurishkuro's point - even though Zipkin and Jaeger support partial spans, there is no OSS backend that I know that has partial spans as a main source of data and a primary scenario. Lack of backend will inform our decisions on the default data model.

While I understand your point, the lack of support from a backend for a feature that does not exist in the frontend doesn't really seem like an ideal way to gauge technical merit. It makes very little sense to add support for features your users can't leverage through the current interfaces exposed to them. It's also hard to push your products own SDK's while trying to support efforts in convergence. Your only choice is to propose extensions to the current data model. I feel the data model is not suitable for such extensions in it's current form.

--

I've learned in gitter just now that OpenTelemetry is adopting the software of OpenCensus. The data model is the OpenCensus agents format and any vendor has to write a OpenCensus agent adapter. I accept I am a single voice here and these decisions were already made. Perhaps documenting that OpenTelemetry is in fact OpenCensus from both a software and "specification (e.g. a protobuf data model the current agent uses)" so people don't create too much noise here as I have. Sorry for that!

jmacd · 2019-05-24T21:47:59Z

+1 to @cstockton's request for defining trace events in the specification. We shouldn't use OpenTracing's lack of an event model, nor the lack of OSS backend support as rationale. It should be possible to produce an OpenTelemetry implementation that does not keep span state in memory by writing START and FINISH events to a low-level event record. This is an appealing way to move "active span management" out of the process, where buffering and failure to finish spans can be a real problem.

Defining an event model should not require API changes, so I don't think this will be objectionable. Focusing on the cost of writing spans specifically, if you aim to write out all spans (as lightstep does) then it may be cheaper to simply encode and send events than to build intermediate objects which are then encoded and sent. It should be possible to build an implementation along these lines, which to me suggests a low-level "Observer" model, with stateless APIs above and stateful implementation below.

SergeyKanzhelev · 2019-05-25T15:01:38Z

@jmacd it is a fine goal to have an option to implement event-based tracer and backend. It should absolutely be possible with the API. When I was referring to the lack of backends I was talking about the default implementation and span's protos.

We can always extend the span's model to events model in the future.

jmacd · 2019-06-05T00:06:46Z

I've posted a prototype that implements an event-based exporter API and a stderr debug-logger on top of it, here:

open-telemetry/opentelemetry-go#4

cstockton · 2019-06-23T17:59:38Z

We can always extend the span's model to events model in the future.

@SergeyKanzhelev It seems that "We can always do this in the future" is the central argument in the gist I posted and in the OpenTelemetry project in general. The https://opentelemetry.io site states "OpenTelemetry is the next major version of the OpenTracing and OpenCensus projects" meaning this is the future, right now- correct? If we change nothing about OpenTelemetry and simply adopt the model of OpenCensus it's not a new version, it's the deletion of OpenTracing in favor for OpenCensus.

I don't see any way this statement is viable, so I need your help understanding your plans for "Doing it in the future" so could you please provide me your analysis on this question:

Given we have parallel development efforts across every common programming language designing around the current interface you've created, how will you fix these fundamental design issues "in the future"?

I think if this argument is going to be used as the sole winning argument in technical debates it is only fair to provide a technical response. If you can't do this I believe we should consider it invalid and reevaluate the design as proposed using methods we can measure.

Edit:
I just want to spell out plainly the main issue I have with marching forward using the existing design under the premise we can change it in the future. The existing code of OpenTelemetry is being used to block proposals since the moments after the project was announced. Why would it not continue to be the case in the coming months as more code and resources are invested?

SergeyKanzhelev · 2019-06-24T06:19:11Z

@cstockton please read our goals for this version: https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0

As for the alternative model - OpenTracing used to have some streaming tracers. And we keep those implementations in mind while discussing API (not SDK). So API will allow to implement an alternative model based on OpenTracing experience. Some even say they believe it may be build based on SDK extensibility we plan to have. But till September this is unlikely something we will get into.

If you want to work on alternative SDK implementation and provide feedback on API/SDK based on this implementation - this will be great.

carlosalberto · 2019-06-24T14:19:02Z

"We can always do this in the future" is the central argument in the gist I posted and in the OpenTelemetry project in general.

Observe this is due to 1) Trying to easy backwards compatibility and 2) Stick to what OT/OC have, to move faster for now (this is why we are not focusing on adding new stuff). The exception is, of course, things that greatly need a lift - but otherwise, we are being slightly conservative ;)

(Not to discourage you to discuss things, just a reminder of the current approach)

tedsuo · 2019-12-04T16:22:21Z

@cstockton closing this issue as it is stale, but I believe your concerns are being addressed! Also, please check out this latest OTEP, as it is relevant: open-telemetry/oteps#69

* Add OSX to CI * Fix yaml * Fix osx build. * Update xcode. * Change build dir default path. * Fix path * Alias bazel on osx. * set up bazel symbol link on osx

Fix links in repository.md

added in open-telemetry#17

SergeyKanzhelev added the question Question for discussion label May 22, 2019

SergeyKanzhelev added this to the API revision: 07-2019 milestone Jun 3, 2019

cstockton mentioned this issue Jun 23, 2019

Issues with current design open-telemetry/opentelemetry-go#19

Closed

cstockton mentioned this issue Jun 23, 2019

Define API compatibility story #115

Closed

cstockton mentioned this issue Jul 12, 2019

Why so many interfaces? open-telemetry/opentelemetry-go#23

Closed

iredelmeier added the needs discussion Need more information before all suitable labels can be applied label Jul 30, 2019

SergeyKanzhelev modified the milestones: API revision: 07-2019, Future Sep 27, 2019

tedsuo closed this as completed Dec 4, 2019

rockb1017 pushed a commit to rockb1017/opentelemetry-specification that referenced this issue Nov 18, 2021

Merge pull request open-telemetry#20 from mateuszrzeszutek/patch-1

a2697d9

Fix links in repository.md

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 21, 2024

Add missing status to RFC 0002 (open-telemetry#20)

500610e

added in open-telemetry#17

carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this issue Oct 23, 2024

Add missing status to RFC 0002 (open-telemetry#20)

892f1cf

added in open-telemetry#17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Structural definition of trace events & distinct structures for all data points #20

RFC: Structural definition of trace events & distinct structures for all data points #20

cstockton commented May 22, 2019

SergeyKanzhelev commented May 22, 2019

cstockton commented May 22, 2019 •

edited

Loading

SergeyKanzhelev commented May 22, 2019 •

edited

Loading

yurishkuro commented May 23, 2019

cstockton commented May 23, 2019

SergeyKanzhelev commented May 23, 2019

austinlparker commented May 23, 2019

cstockton commented May 23, 2019

jmacd commented May 24, 2019

SergeyKanzhelev commented May 25, 2019

jmacd commented Jun 5, 2019

cstockton commented Jun 23, 2019 •

edited

Loading

SergeyKanzhelev commented Jun 24, 2019

carlosalberto commented Jun 24, 2019

tedsuo commented Dec 4, 2019

RFC: Structural definition of trace events & distinct structures for all data points #20

RFC: Structural definition of trace events & distinct structures for all data points #20

Comments

cstockton commented May 22, 2019

SergeyKanzhelev commented May 22, 2019

cstockton commented May 22, 2019 • edited Loading

SergeyKanzhelev commented May 22, 2019 • edited Loading

yurishkuro commented May 23, 2019

cstockton commented May 23, 2019

SergeyKanzhelev commented May 23, 2019

austinlparker commented May 23, 2019

cstockton commented May 23, 2019

jmacd commented May 24, 2019

SergeyKanzhelev commented May 25, 2019

jmacd commented Jun 5, 2019

cstockton commented Jun 23, 2019 • edited Loading

SergeyKanzhelev commented Jun 24, 2019

carlosalberto commented Jun 24, 2019

tedsuo commented Dec 4, 2019

cstockton commented May 22, 2019 •

edited

Loading

SergeyKanzhelev commented May 22, 2019 •

edited

Loading

cstockton commented Jun 23, 2019 •

edited

Loading