Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace/Span ids base64 encoded when used in proto-JSON representation #786

Closed
bogdandrutu opened this issue Mar 16, 2020 · 49 comments · Fixed by #911
Closed

Trace/Span ids base64 encoded when used in proto-JSON representation #786

bogdandrutu opened this issue Mar 16, 2020 · 49 comments · Fixed by #911
Assignees
Labels
area:miscellaneous For issues that don't match any other area label priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:protocol Related to the specification/protocol directory

Comments

@bogdandrutu
Copy link
Member

As pointed in open-telemetry/opentelemetry-proto#110 currently trace_id and span_id are bytes which requires base64 encoded in a proto-JSON representation. This makes them look in JSON very different from the hex representation used in Trace-Context. Here are some options:

  1. Change trace/span id to string hex representation;
  2. Define a different json schema for otlp HTTP;
  3. Try to push to protobuf support custom encoding for bytes fields;
@Oberon00
Copy link
Member

In theory, I would love for W3C to use base64 instead of hex. After all it's shorter and at some cloud vendor setups, I've heard, you are paying per byte you send/receive (e.g. if the HTTP request is ultimately serialized to a messaging system).

@SergeyKanzhelev
Copy link
Member

SergeyKanzhelev commented Mar 16, 2020

I would love for W3C to use base64 instead of hex.

@Oberon00 I don't think changing W3C is feasible at this stage. For history - base64 vs hex was heavily debated. Unfortunately this debate hasn't made it to the rationale, but as overview - we strived for human-readability and consistency with prior art.

@bogdandrutu what's the pros/cons of each option?

Change trace/span id to string hex representation;

This is 4x size and CPU on serialization/deserialization, correct? Also potential abuse with other characters. Any other issues?

Define a different json schema for otlp HTTP;

Have you looked at the whole schema? Any other issues with JSON to justify the custom schema?

Try to push to protobuf support custom encoding for bytes fields;

I have no context on this. Can you comment on pros/cons?

@tigrannajaryan
Copy link
Member

Change trace/span id to string hex representation;

I don't think we should do this and slow-down / increase wire size for all other languages.

@SergeyKanzhelev
Copy link
Member

@tigrannajaryan to assess the second option - can you generate an example of a JSON from proto? So we can review for other potential issues?

Can you please also comment on option 3 viability?

@tigrannajaryan
Copy link
Member

@SergeyKanzhelev unfortunately I don't know how 2 and 3 done, I have no experience with that. It is best that someone who knows what they are doing works on this.

@tensorchen
Copy link
Member

tensorchen commented May 1, 2020

@SergeyKanzhelev @tigrannajaryan This is an example of a JSON from proto.

{
    "resource_spans": [
        {
            "instrumentation_library_spans": [
                {
                    "spans": [
                        {
                            "trace_id": "kDMI7LTxLxTj220awNARJw==",
                            "span_id": "9ir6veJ4Hdw=",
                            "parent_span_id": "bgnsqqPvjYQ=",
                            "name": "Sample-8",
                            "kind": 1,
                            "start_time_unix_nano": 1588334156464409000,
                            "end_time_unix_nano": 1588334156470454639,
                            "attributes": [
                                {
                                    "key": "attr",
                                    "string_value": "value3"
                                },
                                {
                                    "key": "attr3",
                                    "string_value": "value4"
                                }
                            ],
                            "events": [
                                {
                                    "time_unix_nano": 464430000,
                                    "name": "event1",
                                    "attributes": [
                                        {
                                            "key": "key1",
                                            "type": 3,
                                            "bool_value": true
                                        }
                                    ]
                                },
                                {
                                    "time_unix_nano": 464438000,
                                    "name": "event2",
                                    "attributes": [
                                        {
                                            "key": "key2",
                                            "string_value": "value2"
                                        }
                                    ]
                                }
                            ],
                            "status": {
                                "message": "ok"
                            }
                        }
                    ]
                }
            ]
        }
    ]
}

In spec https://github.com/open-telemetry/oteps/blob/master/text/0099-otlp-http.md

Does it only support application/x-protobuf, not application/json?

@tigrannajaryan
Copy link
Member

@tensorchen Correct, OTLP/HTTP currently defines only "application/x-protobuf". If there is a desire to send JSON payload that needs to be explored and added as an extension to the protocol. I am not sure what exactly are the implications of that so that will require some investigation.

@nilebox
Copy link
Member

nilebox commented May 13, 2020

@bogdandrutu @tigrannajaryan What's the current state of HTTP/JSON support in OTLP?
Can we start working on supporting it on the client (SDK) side, or should we wait until this issue is resolved?

@bogdandrutu
Copy link
Member Author

Would not start to support yet HTTP/JSON as another protocol, we currently have gRPC/Protobuf, gRPC over HTTP (with the limitation of the trace/span id), and HTTP/Protobuf.

@nilebox
Copy link
Member

nilebox commented Jun 24, 2020

Raised open-telemetry/opentelemetry-collector#1177 to disable support for JSON encoding in the Collector until we decide to officially support it.

Shall we also remove the Swagger code generation in https://github.com/open-telemetry/opentelemetry-proto/blob/e43e1abc40428a6ee98e3bfd79bec1dfa2ed18cd/makefile#L50-L55 ?

@nilebox
Copy link
Member

nilebox commented Jun 25, 2020

Since Collector is now using gogo/protobuf, we may configure protos to generate trace/span IDs with a custom type via https://github.com/gogo/protobuf/blob/master/custom_types.md

It shouldn't lead to any issues with binary wire compatibility, but the solution is specific to Go, so SDKs for other languages won't benefit from this.

@tigrannajaryan
Copy link
Member

I am going to assign this to myself to have a look. My preliminary feeling is that since W3C Trace context uses hex then we should aim to do the same (as @SergeyKanzhelev rightfully points out it is too late to change W3C Trace context). I do not know how feasible it is to change Protobuf's to use hex encoding instead of base64, I will look into this.

@tigrannajaryan tigrannajaryan self-assigned this Aug 11, 2020
@tigrannajaryan
Copy link
Member

Traceid/spanid are also part of the Trace protocol which we declared stable. I think it is OK that we make this change (should we decide that it is desirable and doable) and we are not actually breaking our promise for Trace protocol stability which is only about binary Protobuf encoding and not JSON encoding which was introduced very recently (open-telemetry/oteps#122) and where this issue was not debated at all.

@bogdandrutu bogdandrutu transferred this issue from open-telemetry/opentelemetry-proto Aug 11, 2020
@tigrannajaryan
Copy link
Member

So a few findings:

  1. It is technically possible to change the encoding of traceid/spanid to hex instead of base64 if we use Gogoproto Prototobuf library. It means it is doable in the Collector (I borrowed some code from Jaeger to do quick proof-of-concept).

  2. The official Go Prototobuf library does not seem to have any capability to customize JSON encoding, so everyone who uses official Protobuf libraries (and likely not just Go, but other languages as well) will not be able to use OTLP/ HTTP JSON.

  3. OpenTelemetry Javascript library appears to use manual encoding for traceid/spanid and supposedly it should be possible to modify it to use hex. @open-telemetry/javascript-approvers can you please confirm my understanding?

I believe Javascript SDK and the Collector are the 2 most important places where we want OTLP/HTTP JSON to be supported (in Javascript SDK because it is very natural to use JSON there and in the Collector because we have to since Javascript SDK can send to the Collector).

Now, the question is, do we care about other language SDK's? Our recommended encoding should be binary Protobufs for all languages. I do not see why any OpenTelemetry SDK exporter would want to use OTLP/HTTP JSON.

I am on the fence on this one, so would like more opinions. Do people think that OTLP/HTTP JSON needs to be universally supported across all OpenTelemetry or we can keep it as a niche implementation for the web / JavaScript (which grpc-gateway really is) and not even offer any implementations for the rest of the languages?

If the answer is that we want to support OTLP/HTTP JSON for all languages then I think we are stuck with standard base64 encoding because it is too much burden for every language SDK to figure out how to customize the encoding (and there may not always be a reasonable way to do it).

Thoughts?

@anuraaga
Copy link
Contributor

A text encoding of any protocol is very useful I think. One common use case is integration tests - storing payloads as text is much easier to version control than binary. And a test program could be any language. I have heard of other use cases where even in Java-based data pipelines they prefer to stick to JSON, I think since the pipelines have built in features that operate on it. So I definitely wouldn't recommend delegating text encodings to only the web.

Base64 in text protos would be a little strange but it's not the end of the world. But the best user experience would be a custom schema like opentracing, with the huge maintenance work that comes with it indeed, but to optimize for ux, I think that still needs to be on the table since I think ux always needs to be prioritized over maintenances cost.

@tigrannajaryan
Copy link
Member

@anuraaga good point regarding test data.

I would like to know what the maintainers of various OpenTelemetry SDKs think regarding how complicated it will be for them to support custom (hex) encoding.

@obecny
Copy link
Member

obecny commented Aug 12, 2020

JS supports all 3 different cases for node: grpc, proto over http and json over http, and for web we have json over http. When using json we send base64 for trace id and span id, all works fine.
For generating grpc and proto over http we are using libraries that converts that for us based on proto files - only json converts hex to base64. So in theory I would assume that those libraries will take care of that, if not then we will have to figure out if that is possible and how to.
Mentioned libraries are: for grcp @grpc/proto-loader and for proto/json protobufjs

@jkwatson
Copy link
Contributor

I believe this would be quite easy for Java.

@anuraaga
Copy link
Contributor

anuraaga commented Aug 12, 2020

I would like to know what the maintainers of various OpenTelemetry SDKs think regarding how complicated it will be for them to support custom (hex) encoding.

One way to phrase this is that we wouldn't use protobuf+json encoding anymore, it would be a custom OTel encoding protobuf+oteljson. This seems like a slippery slope, since at that point we may as well fully go into defining a custom JSON schema not tied to protobuf?

I believe this would be quite easy for Java.

We wouldn't be able to use protobuf anymore, I think, the base64 encoding is hard-coded here

https://github.com/protocolbuffers/protobuf/blob/master/java/util/src/main/java/com/google/protobuf/util/JsonFormat.java#L1237

We can write a custom marshaller, but at that point, I'd suggest not looking at proto at all at that point, for example attributes can be encoded in a more idiomatic way using a JSON object instead of the list of attributes.

@jkwatson
Copy link
Contributor

Fair enough. Although, if we were to change the types to String, then it would be quite easy, yes?

@anuraaga
Copy link
Contributor

Although, if we were to change the types to String, then it would be quite easy, yes?

Change the types to string in the proto? You know I'm always happy with that :P

@tigrannajaryan
Copy link
Member

Although, if we were to change the types to String, then it would be quite easy, yes?

Change the types to string in the proto? You know I'm always happy with that :P

I think that will be a breaking change, so we can't do it, Traces are already declared stable.

@tigrannajaryan
Copy link
Member

One way to phrase this is that we wouldn't use protobuf+json encoding anymore, it would be a custom OTel encoding protobuf+oteljson. This seems like a slippery slope, since at that point we may as well fully go into defining a custom JSON schema not tied to protobuf?

Correct, that's exactly what it would be. Do note that gRPC Gateway does not use "protobuf+json" Content-Type, it uses "application/json" so I do not think there is a strict expectation that the payload will be exactly following the Proto3 JSON mapping.

@reyang reyang added priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Aug 14, 2020
@tigrannajaryan
Copy link
Member

tigrannajaryan commented Aug 14, 2020

I would like to poll all other language SIGs to weight in regarding the feasibility of implementation. If any of the SIGs says this is not doable or very hard to do that I am inclined to close this as "wont do".

We have the opinions from JS and Java SIGs.

Other language SDK approvers, please tell if you are able to change encoding of TraceID and SpanID fields in the JSON encoding from base64 to hex. Pinging approvers: @open-telemetry/dotnet-approvers @open-telemetry/php-approvers @open-telemetry/rust-approvers @open-telemetry/go-approvers @open-telemetry/cpp-approvers @open-telemetry/ruby-approvers

Dear approvers, I apologize for the noise, but I'd hate to make a decision here which is then an unnecessary burden for you or creates an impossible requirement for you. One voice from each language SIG is highly appreciated. (I hope I didn't miss a language).

@obecny
Copy link
Member

obecny commented Aug 14, 2020

from base64 to hex ? it was opposite you can't have hex in json

@anuraaga
Copy link
Contributor

One point missing from the summary is comparing it to the option of defining a JSON protocol independent of protobuf. My reading of the comments is, for SDKs where there is no protobuf dependency, the change is trivial since they already encode the JSON themselves. For SDKs with a protobuf dependency, most likely it'll have to be replaced with manual encoding of the JSON. In that case, can't we make the UX even nicer by making the protocol take advantage of JSON idioms, such as using a standard JSON map for attributes? It also makes more obvious that it's a different protocol to prevent issues with accidentally base64-decoding the hex field.

@tigrannajaryan
Copy link
Member

can't we make the UX even nicer by making the protocol take advantage of JSON idioms ...

That's significantly bigger work both from specification perspective and possibly also more work for implementations which use Protobufs today and which have to do more than just changing base64 to hex encoding.

I agree that you have a good point but given our GA timelines I am very reluctant about anything that creates more work at this stage.

such as using a standard JSON map for attributes

I am not quite sure we would want to do that. I would not want to introduce significantly different potential semantics in the protocol. The protocol today allows recording multiple attributes of the same name (although it does not specify what it means). Multiple attributes with the same name are not allowed in the API today but in the future it may become a valid option. If that happens then JSON encoding will no longer be able to record everything that binary Protobuf encoding can record. I would not want to corner ourselves into incompatible encodings that are not capable of expressing precisely equivalent data.

If we want to explore this further then I would suggest that we remove the JSON representation from the specification for now and do not make it part of the GA so that we have more time to figure out the right answers. I do not know if is acceptable to GA without JSON representation. Particularly the JavaScript implementation may need this to be part of GA (I am not sure).

@anuraaga
Copy link
Contributor

Agree that if OTLP json is targeting GA, it could be too big of a change - I interpreted the current alpha status of it as not for GA but probably misinterpreted :)

Not a strong statement, but if JSON support is needed for opentelemetry-js in browser, perhaps browser GA can come later? We're lacking strength in testing Android support and I believe language support for iOS, so those could probably be rolled into a nice milestone. Just food for thought since worried about locking in the protocol too soon.

@tigrannajaryan
Copy link
Member

This issue is labeled release:required-for-ga as of now. I am open to moving this post GA if this is not causing problems for others. In that case we will have more time to explore the options carefuly.

Not a strong statement, but if JSON support is needed for opentelemetry-js in browser, perhaps browser GA can come later?

@open-telemetry/javascript-approvers @open-telemetry/javascript-maintainers can you please chime in as to when do you think you will GA and how important is JSON encoding support for your GA?

@dyladan
Copy link
Member

dyladan commented Aug 28, 2020

JSON encoding is very important to us because grpc/protobuf is a royal pain in the browser. Currently, JSON collector export is the only supported export mechanism in the browser, although there is an open PR (open-telemetry/opentelemetry-js#1399) to support zipkin in the browser as well. I would say we cannot make GA claims for the browser without it.

In terms of timeline, I would support making node and browser GA at the same time, because they share most of their code and I believe it would be confusing to users to have one GA and the other not. The JS tracing implementation has always tracked fairly close to the spec head so I think we would want to GA tracing fairly shortly after spec is finalized. Our major shortcomings actually have more to do with testing/docs than with the actual implementation.

@anuraaga
Copy link
Contributor

Thanks @dyladan that sounds like we should stick to the current plan of just replacing the IDs with hex. Out of curiosity, given JSON came from the browser maybe you have some extra insight, do you think the current JSON model is idiomatic enough for users to be comfortable with it?

@tigrannajaryan
Copy link
Member

Given JavaScript requirements and since there is no new evidence I suggest we move forward as I outlined above. One more time, the suggestion is that we go ahead and accept this change in the spec and then make corresponding changes in the implementations. I will mention this one more time in Maintainers meeting today and absent objections I will submit a PR that amends the spec accordingly.

tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Sep 1, 2020
Resolves: open-telemetry#786

See discussion and motivation for the change in the issue linked above.
tigrannajaryan pushed a commit to tigrannajaryan/opentelemetry-specification that referenced this issue Sep 1, 2020
Resolves: open-telemetry#786

See discussion and motivation for the change in the issue linked above.
SergeyKanzhelev added a commit that referenced this issue Sep 8, 2020
#911)

Resolves: #786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
SeanHood added a commit to SeanHood/opentelemetry-php that referenced this issue Jun 9, 2021
This PR Implements OTLP/HTTP (Protobuf)

As it stands the AlwaysOnOTLPExample works sending Traces to the Collector. The SpanConverter was pulled directly from the GRPC implementation.

OTLP/HTTP (Json) has an interesting history behind it and looks harder to support:
open-telemetry/opentelemetry-specification#786

TODO:
- [ ] Tests
- [ ] Support all config params from the constructor (compression, timeout, headers)
- [ ] Endpoint validation
SeanHood added a commit to SeanHood/opentelemetry-php that referenced this issue Jun 9, 2021
This PR Implements OTLP/HTTP (Protobuf)

As it stands the AlwaysOnOTLPExample works sending Traces to the Collector. The SpanConverter was pulled directly from the GRPC implementation.

OTLP/HTTP (Json) has an interesting history behind it and looks harder to support:
open-telemetry/opentelemetry-specification#786

TODO:
- [ ] Tests
- [ ] Support all config params from the constructor (compression, timeout, headers)
- [ ] Endpoint validation
bobstrecansky pushed a commit to open-telemetry/opentelemetry-php that referenced this issue Jul 21, 2021
* OTLP/HTTP First Pass

This PR Implements OTLP/HTTP (Protobuf)

As it stands the AlwaysOnOTLPExample works sending Traces to the Collector. The SpanConverter was pulled directly from the GRPC implementation.

OTLP/HTTP (Json) has an interesting history behind it and looks harder to support:
open-telemetry/opentelemetry-specification#786

TODO:
- [ ] Tests
- [ ] Support all config params from the constructor (compression, timeout, headers)
- [ ] Endpoint validation

* Fixes Phan Error

* OTLP/HTTP: The wait is over

* Tests tests tests
* Add headers supports
* Supports 4 of 6 Env Vars
* Rejects otlp/json (for now)
* Might support gzip (needs testing)

* Unrelated psalm fixes

* chore: make style

* OTLP/HTTP: Copy/paste SpanConverter Tests

I feel like the OTLP/HTTP and OTLP/GRPC Classes could be more integrated since their SpanConverters are identical

* OTLP/HTTP: Request body should actually send the payload!

* Return InvalidArgumentException when invalid headers are passed rather than silently fail

* Re work process headers function so it\'s called in the constructor rather than later at export

* This fixes a number of test with the span converter, basic types work: string, int, float, bool.

* This fixes converting arrays passed into attributes; and adds tests too
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Apr 20, 2023
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
tigrannajaryan added a commit to tigrannajaryan/opentelemetry-proto that referenced this issue Apr 21, 2023
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
tigrannajaryan added a commit to open-telemetry/opentelemetry-proto that referenced this issue Apr 27, 2023
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this issue Jun 21, 2024
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this issue Jun 21, 2024
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this issue Jun 21, 2024
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
VinozzZ pushed a commit to VinozzZ/opentelemetry-proto that referenced this issue Jun 21, 2024
…g (#911)

Resolves: open-telemetry/opentelemetry-specification#786

See discussion and motivation for the change in the issue linked above.

Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:miscellaneous For issues that don't match any other area label priority:p2 Medium priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:protocol Related to the specification/protocol directory
Projects
None yet