-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTLP over message systems #157
base: main
Are you sure you want to change the base?
Conversation
cc98959
to
a3508b4
Compare
a3508b4
to
78c3eb7
Compare
@@ -0,0 +1,110 @@ | |||
# OTLP over messaging systems |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure to rename your file to the PR # you've been assigned (0157).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am temporarily blocking this since I have several questions that are not answered by the OTEP and I think they should be answered before we commit to making this part of the Otel spec.
| `jaeger_proto` | X | - | - | the payload is deserialized to a single Jaeger proto `Span` | | ||
| `jaeger_json` | X | - | - | the payload is deserialized to a single Jaeger JSON Span using `jsonpb` | | ||
| `zipkin_proto` | X | - | - | the payload is deserialized into a list of Zipkin proto spans | | ||
| `zipkin_json` | X | - | - | the payload is deserialized into a list of Zipkin V2 JSON spans | | ||
| `zipkin_thrift` | X | - | - | the payload is deserialized into a list of Zipkin Thrift spans | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are unrelated to OTLP. I think the shouldn't be in this OTEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove it, but I'll leave it here to give other people a chance to comment.
|
||
Optionally an implementation could support `otlp_json` or other alternative | ||
encodings like jaeger or zipkin payloads. Alternative encodings should be configured | ||
by an `encoding` field in the configuration file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What configuration file is this referring to? Does this assume that every implementation of OTLP over messaging system must have a configuration file? It seems like a stretch to me. Can it be a in-memory config option? Or an API parameter of a library that implements this OTEP?
For log data, implementors of a receiver are encouraged to also support | ||
receiving raw data from a topic. This enables scenarios were | ||
non-OpenTelemetry log producers can produce a stream of logs, either | ||
structured or unstructured, on that topic and the receiver wraps the log | ||
data in a configurable resource. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated to OTLP. Should it be in this OTEP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not, it would be nice though that implements do this. But maybe it shouldn't be in the spec.
|
||
The `kafkareceiver` and `kafkaexporter` already implement this OTLP as | ||
described in the OpenTelemetry Collector. The description in th OTEP | ||
makes both of them compliant without modification. Although it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good. But let's also be aware this is not a must-have requirement. If we believe what Collector does is not the best we can do then we should be able to propose something different in this OTEP and fix the Collector (it is still in Beta and breaking changes are allowed).
spec, to leverage the conventions for attributes? | ||
|
||
No guaranty of order could be given, because not all systems support order. | ||
Is that a problem? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be a problem since OTLP conceptually does not guarantee order of delivery in network protocol.
The default implementation must implement `otlp_proto`, meaning that the | ||
payload is Protobuf serialized from `ExportTraceServiceRequest` for traces, | ||
`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that is not clear to me is if we want to include the signal type in the serialized payload so that the recipient can tell what signal is. Since this OTEP does not define it it means the recipient that works with multiple signals needs another way to figure it out (often ends up using the channel name). I think this is an important capability that adds value and has no obvious downsides.
This is also an issue we want to solve for OTLP/File format: open-telemetry/opentelemetry-specification#1443 (comment)
It will be good to solve this here as well and do it in the same way we will do for OTLP/File so this likely requires to think through the OTLP/File design as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't help with the file format at all, but Kafka and Pubsub (maybe others) have key-value headers/attributes which are mentioned in "Open questions". The signal type and encoding (json or proto) could go in there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment in the OTLP/File discussion about CloudEvents. Certainly, if you want to allow multiplexing logs, traces, and metrics over a single topic CloudEvents is a good fit and it would align with this OTLP as well as the spec defines bindings for different messaging systems:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexvanboxel I read the comment you left in the OTLP/File issue but I am a bit confused why it talks about the CloudEvents or Avro.
My comment here is about how an OTLP payload that is encoded as a Protobuf or JSON should be recorded in messaging systems such that the signal type information is included. I think we should keep it as simple as possible, new any additional concepts and technology introduced (CloudEvents or Avro) IMO is unnecessary complication.
If we want to record the type of the signal there are fairly ways to achieve it while staying with the existing set of technologies. For example we can add this message to OTLP Protobuf definitions:
// This may be also named simply Request or Payload or something like that
message ExportServiceRequest {
oneof request {
ExportTraceServiceRequest traces = 1;
ExportMetricsServiceRequest metrics = 2;
ExportLogsServiceRequest logs = 3;
}
}
ExportServiceRequest can be then serialized and become the payload in the messaging system. Recipients can then deserialize and will be able to use the right signal in the "request" field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan the consensus is that we want to signal the type of the message, so I'm already changing my application. Let me demonstrate what I mean by leveraging the CloudEvent spec (as it's a spec and not technology). In my Google Pubsub implementation, it describes what headers to use. So I could simple use:
func (receiver *pubsubReceiver) detectEncoding(attributes map[string]string) (Encoding, error) {
ceType := attributes["ce-type"]
ceContentType := attributes["ce-datacontenttype"]
if strings.HasSuffix(ceContentType,"application/x-protobuf") {
switch ceType {
case "org.opentelemetry.otlp.traces.v1":
return OtlpProtoTrace, nil
case "org.opentelemetry.otlp.metrics.v1":
return OtlpProtoMetric, nil
case "org.opentelemetry.otlp.logs.v1":
return OtlpProtoLog, nil
}
} else if strings.HasSuffix(ceContentType,"application/json") {
switch ceType {
case "org.opentelemetry.otlp.traces.v1":
return OtlpJsonTrace, nil
case "org.opentelemetry.otlp.metrics.v1":
return OtlpJsonMetric, nil
case "org.opentelemetry.otlp.logs.v1":
return OtlpJsonLog, nil
}
}
return Unknown, nil
}
Granted, it's heavier than a oneOf
, but it leaves the flexibility of even signaling OTLP/Json encoding.
The default implementation must implement `otlp_proto`, meaning that the | ||
payload is Protobuf serialized from `ExportTraceServiceRequest` for traces, | ||
`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't help with the file format at all, but Kafka and Pubsub (maybe others) have key-value headers/attributes which are mentioned in "Open questions". The signal type and encoding (json or proto) could go in there
for reading raw log events avoids having an additional parallel | ||
implementation todo just that. | ||
|
||
The encoding field should be used to indicate that the data received is not the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is meant by "field" here?
|
||
## Open questions | ||
|
||
This proposal doesn't take advantage of attributes that some systems |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I think it would be good to put encoding and signal type in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. I think perhaps having a standard convention for knowing what signal is coming in
(either force separate channels for each signal, OR a standard header of some sort to disambiguate) would be good. OTLP effectively has this in their GRPC definition via the separate services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, gRPC can handle this over the same endpoint (a separate service is just a path in the HTTP/2 header).
I'm leaning toward the CloudEvents proposal, where you can have multiple (at least have the possibility to) types over one topic.
@@ -0,0 +1,110 @@ | |||
# OTLP over messaging systems | |||
|
|||
Best practices for transporting OTLP data over a message system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any performance best practices for message systems based on the existing Collector Kafka implementation? For example should you prefer to batch into many small messages or into fewer large messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not make it part of the spec, just like it is not part of the OTLP/gRPC spec.
|
||
## Prior art and alternatives | ||
|
||
The `kafkareceiver` and `kafkaexporter` already implement this OTLP as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I'm proposing some changes to these:
open-telemetry/opentelemetry-collector#3044
open-telemetry/opentelemetry-collector#3200
Still a WIP.
| `jaeger_proto` | X | - | - | the payload is deserialized to a single Jaeger proto `Span` | | ||
| `jaeger_json` | X | - | - | the payload is deserialized to a single Jaeger JSON Span using `jsonpb` | | ||
| `zipkin_proto` | X | - | - | the payload is deserialized into a list of Zipkin proto spans | | ||
| `zipkin_json` | X | - | - | the payload is deserialized into a list of Zipkin V2 JSON spans | | ||
| `zipkin_thrift` | X | - | - | the payload is deserialized into a list of Zipkin Thrift spans | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm proposing changes to these in PR linked above. The types would be something like zipkinv1/thrift
, zipkinv2/protobuf
. Aside from the formatting it treats zipkinv1 and zipkinv2 as two separate protocols instead of putting them both behind zipkin
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm going to remove the other encoding out of the spec as @tigrannajaryan suggests
Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
@alexvanboxel we are marking OTEPs which are no longer being updated as |
I've neglected some of my OpenTelemetry work for the last few months. I will refocus some of my time back on open source. I will pick it back up, including the work in the collector. |
No description provided.