Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTLP over message systems #157

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

alexvanboxel
Copy link

No description provided.

@@ -0,0 +1,110 @@
# OTLP over messaging systems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to rename your file to the PR # you've been assigned (0157).

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am temporarily blocking this since I have several questions that are not answered by the OTEP and I think they should be answered before we commit to making this part of the Otel spec.

Comment on lines +66 to +70
| `jaeger_proto` | X | - | - | the payload is deserialized to a single Jaeger proto `Span` |
| `jaeger_json` | X | - | - | the payload is deserialized to a single Jaeger JSON Span using `jsonpb` |
| `zipkin_proto` | X | - | - | the payload is deserialized into a list of Zipkin proto spans |
| `zipkin_json` | X | - | - | the payload is deserialized into a list of Zipkin V2 JSON spans |
| `zipkin_thrift` | X | - | - | the payload is deserialized into a list of Zipkin Thrift spans |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are unrelated to OTLP. I think the shouldn't be in this OTEP.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it, but I'll leave it here to give other people a chance to comment.


Optionally an implementation could support `otlp_json` or other alternative
encodings like jaeger or zipkin payloads. Alternative encodings should be configured
by an `encoding` field in the configuration file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What configuration file is this referring to? Does this assume that every implementation of OTLP over messaging system must have a configuration file? It seems like a stretch to me. Can it be a in-memory config option? Or an API parameter of a library that implements this OTEP?

Comment on lines 48 to 52
For log data, implementors of a receiver are encouraged to also support
receiving raw data from a topic. This enables scenarios were
non-OpenTelemetry log producers can produce a stream of logs, either
structured or unstructured, on that topic and the receiver wraps the log
data in a configurable resource.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to OTLP. Should it be in this OTEP?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not, it would be nice though that implements do this. But maybe it shouldn't be in the spec.

text/0000-otlp-over-messaging-systems.md Outdated Show resolved Hide resolved

The `kafkareceiver` and `kafkaexporter` already implement this OTLP as
described in the OpenTelemetry Collector. The description in th OTEP
makes both of them compliant without modification. Although it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good. But let's also be aware this is not a must-have requirement. If we believe what Collector does is not the best we can do then we should be able to propose something different in this OTEP and fix the Collector (it is still in Beta and breaking changes are allowed).

spec, to leverage the conventions for attributes?

No guaranty of order could be given, because not all systems support order.
Is that a problem?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a problem since OTLP conceptually does not guarantee order of delivery in network protocol.

Comment on lines +56 to +58
The default implementation must implement `otlp_proto`, meaning that the
payload is Protobuf serialized from `ExportTraceServiceRequest` for traces,
`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for
Copy link
Member

@tigrannajaryan tigrannajaryan May 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is not clear to me is if we want to include the signal type in the serialized payload so that the recipient can tell what signal is. Since this OTEP does not define it it means the recipient that works with multiple signals needs another way to figure it out (often ends up using the channel name). I think this is an important capability that adds value and has no obvious downsides.

This is also an issue we want to solve for OTLP/File format: open-telemetry/opentelemetry-specification#1443 (comment)
It will be good to solve this here as well and do it in the same way we will do for OTLP/File so this likely requires to think through the OTLP/File design as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't help with the file format at all, but Kafka and Pubsub (maybe others) have key-value headers/attributes which are mentioned in "Open questions". The signal type and encoding (json or proto) could go in there

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment in the OTLP/File discussion about CloudEvents. Certainly, if you want to allow multiplexing logs, traces, and metrics over a single topic CloudEvents is a good fit and it would align with this OTLP as well as the spec defines bindings for different messaging systems:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexvanboxel I read the comment you left in the OTLP/File issue but I am a bit confused why it talks about the CloudEvents or Avro.

My comment here is about how an OTLP payload that is encoded as a Protobuf or JSON should be recorded in messaging systems such that the signal type information is included. I think we should keep it as simple as possible, new any additional concepts and technology introduced (CloudEvents or Avro) IMO is unnecessary complication.

If we want to record the type of the signal there are fairly ways to achieve it while staying with the existing set of technologies. For example we can add this message to OTLP Protobuf definitions:

// This may be also named simply Request or Payload or something like that
message ExportServiceRequest {
  oneof request {
    ExportTraceServiceRequest traces = 1;
    ExportMetricsServiceRequest metrics = 2;
    ExportLogsServiceRequest logs = 3;
  }
}

ExportServiceRequest can be then serialized and become the payload in the messaging system. Recipients can then deserialize and will be able to use the right signal in the "request" field.

Copy link
Author

@alexvanboxel alexvanboxel May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigrannajaryan the consensus is that we want to signal the type of the message, so I'm already changing my application. Let me demonstrate what I mean by leveraging the CloudEvent spec (as it's a spec and not technology). In my Google Pubsub implementation, it describes what headers to use. So I could simple use:

func (receiver *pubsubReceiver) detectEncoding(attributes map[string]string) (Encoding, error) {
	ceType := attributes["ce-type"]
	ceContentType := attributes["ce-datacontenttype"]
	if strings.HasSuffix(ceContentType,"application/x-protobuf") {
		switch ceType {
		case "org.opentelemetry.otlp.traces.v1":
			return OtlpProtoTrace, nil
		case "org.opentelemetry.otlp.metrics.v1":
			return OtlpProtoMetric, nil
		case "org.opentelemetry.otlp.logs.v1":
			return OtlpProtoLog, nil
		}
	} else if strings.HasSuffix(ceContentType,"application/json") {
		switch ceType {
		case "org.opentelemetry.otlp.traces.v1":
			return OtlpJsonTrace, nil
		case "org.opentelemetry.otlp.metrics.v1":
			return OtlpJsonMetric, nil
		case "org.opentelemetry.otlp.logs.v1":
			return OtlpJsonLog, nil
		}
	}
	return Unknown, nil
}

Granted, it's heavier than a oneOf, but it leaves the flexibility of even signaling OTLP/Json encoding.

text/0000-otlp-over-messaging-systems.md Outdated Show resolved Hide resolved
text/0000-otlp-over-messaging-systems.md Outdated Show resolved Hide resolved
Comment on lines +56 to +58
The default implementation must implement `otlp_proto`, meaning that the
payload is Protobuf serialized from `ExportTraceServiceRequest` for traces,
`ExportMetricsServiceRequest` for metrics, or `ExportLogsServiceRequest` for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't help with the file format at all, but Kafka and Pubsub (maybe others) have key-value headers/attributes which are mentioned in "Open questions". The signal type and encoding (json or proto) could go in there

for reading raw log events avoids having an additional parallel
implementation todo just that.

The encoding field should be used to indicate that the data received is not the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant by "field" here?


## Open questions

This proposal doesn't take advantage of attributes that some systems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I think it would be good to put encoding and signal type in there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. I think perhaps having a standard convention for knowing what signal is coming in

(either force separate channels for each signal, OR a standard header of some sort to disambiguate) would be good. OTLP effectively has this in their GRPC definition via the separate services.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, gRPC can handle this over the same endpoint (a separate service is just a path in the HTTP/2 header).

I'm leaning toward the CloudEvents proposal, where you can have multiple (at least have the possibility to) types over one topic.

@@ -0,0 +1,110 @@
# OTLP over messaging systems

Best practices for transporting OTLP data over a message system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any performance best practices for message systems based on the existing Collector Kafka implementation? For example should you prefer to batch into many small messages or into fewer large messages?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not make it part of the spec, just like it is not part of the OTLP/gRPC spec.


## Prior art and alternatives

The `kafkareceiver` and `kafkaexporter` already implement this OTLP as
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I'm proposing some changes to these:

open-telemetry/opentelemetry-collector#3044
open-telemetry/opentelemetry-collector#3200

Still a WIP.

Comment on lines +66 to +70
| `jaeger_proto` | X | - | - | the payload is deserialized to a single Jaeger proto `Span` |
| `jaeger_json` | X | - | - | the payload is deserialized to a single Jaeger JSON Span using `jsonpb` |
| `zipkin_proto` | X | - | - | the payload is deserialized into a list of Zipkin proto spans |
| `zipkin_json` | X | - | - | the payload is deserialized into a list of Zipkin V2 JSON spans |
| `zipkin_thrift` | X | - | - | the payload is deserialized into a list of Zipkin Thrift spans |
Copy link

@jrcamp jrcamp May 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proposing changes to these in PR linked above. The types would be something like zipkinv1/thrift, zipkinv2/protobuf. Aside from the formatting it treats zipkinv1 and zipkinv2 as two separate protocols instead of putting them both behind zipkin.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm going to remove the other encoding out of the spec as @tigrannajaryan suggests

alexvanboxel and others added 3 commits May 18, 2021 07:56
Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
@tedsuo
Copy link
Contributor

tedsuo commented Jul 24, 2023

@alexvanboxel we are marking OTEPs which are no longer being updated as stale. We will close this PR in one week. If you wish to resurrect this PR, that is fine.

@alexvanboxel
Copy link
Author

I've neglected some of my OpenTelemetry work for the last few months. I will refocus some of my time back on open source. I will pick it back up, including the work in the collector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale This issue or PR is stale and will be closed soon unless it is resurrected by the author. triaged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants