-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define API compatibility story #115
Define API compatibility story #115
Conversation
Any application or third party library that uses OpenTelemetry API of a particular version can safely be packaged with OpenTelemetry SDK implementation provided that these 2 conditions are met: | ||
|
||
1. The full version number of the SDK is not lower than the full version number of API used by application or third party library. | ||
2. The major version number of the SDK is the same as major version number of API or major version number of the SDK is higher by 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: do we want to provide stronger guarantees across more than one major version number? Or perhaps we don't want to provide compatibility guarantees across major versions at all?
I felt that one major version apart is a good approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to tightly couple API version with SDK version? We may need to do major rev of SDK because some of its SPI interfaces change, while the API remains unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not tightly couple API and SDK versions. The rationale is that you cannot expect all third party libraries that a developer may use in an application to all have been instrumented against the same OpenTelemetry API. Third-party developers move at their own pace, you usually move at your own pace on your own app. There has to be a way for mixed-version usage. At the minimum code linked to APIs across minor versions should be interoperable, but I think that is not good enough and 1 major version apart is the minimum guarantee that is necessary for real-life usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on leaving this as a TODO for the time being so we can have a more full-fledged discussion without blocking the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the current wording is still "tight coupling" to me, as it implies correlation between API and SDK versions. To take example of OpenTracing: Python API is v2, Jaeger SDK is v4. Granted that in OpenTelemetry the API and SDK are more closely related, but there are still potential major versions of SDK (e.g. changing from JS to Typescript as the original source) that do not affect major rev of the API.
This is separate from the point that a version of SDK must support some V and V-1 of the API.
|
||
The above is specified using Java's convention of reverse domain name notation. Other languages will use their equivalent convention for namespaces, e.g. for Go the package name for version 1.x.x of the API will be go.opencensus.io, for version 2.x.x of the API it will be go.opencensus.io/v2. | ||
|
||
## Interoperability of Mixed-version Code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how this would work. This might work for libs that are used at the leaves of the call graph, but what if we have server middleware storing v2 span in context, and http client lib middleware that expects v1 span?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on this? I am not sure I fully understand the example that you brought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the picture, it is clear now.
I do not see this as a problem that is induced by the decision to support multiple API versions in parallel in OpenTelemetry and let me explain why I think so.
Based on your description I am going to assume that Context is a container that has a pair of GetSpan/SetSpan methods that allow to store and retrieve arbitrary untyped value of Span (an Object in Java's parlance or interface{} in Go's).
In this situation I believe the problem is a result of incorrect assumptions of Client which reads an untyped data type from Context and treats it as if it has a certain type when in fact it can be of any type.
If the Context is a container of untyped data types where I can store 2 unrelated data types (io.opentelemetry.v2.Span and io.opentelemetry.v1.Span) then users of this container (namely Handler and Client) cannot expect that the untyped container will magically solve their interoperability problem.
If Handler and Client want to interchange data the responsibility is on them to define a data exchange contract. I fail to see how it can be the responsibility of generic Context class which is merely a carrier of untyped data.
Now, if we assume the opposite, i.e. that Context is a strongly typed container and supports GetSpan/SetSpan strongly typed methods then this problem cannot exist because Context itself must be versioned, so we will have io.opentelemetry.v1.Context and io.opentelemetry.v2.Context, each having a pair of GetSpan/SetSpan methods that work only with corresponding data type. In that case the SDK can ensure interoperability of v1.Context and v2.Context types by converting v1.Span into v2.Span internally (but I am not sure it should try to, that's a separate question worth thinking about).
I may be wrong, happy to hear your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I created an example of what OpenTelemetry API V2 implementation can look like, so that it provides a shim for API V1 and how exactly it can be used in the scenario that you described: with a middleware that uses API V1 and an app client that uses API V2.
It shows how the shim ensures the interoperability by mapping the API V1 Span methods to API V2 Span methods as needed.
Check this out: https://github.com/tigrannajaryan/otelapi-compat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro Here is the implementation of opentelemetryv1.FromContext
:
https://github.com/tigrannajaryan/otelapi-compat/blob/849e405dcddf73ca2f1cfdc429d03dde504dd9cc/opentelemetryv1/tracer.go#L37
Imaginary database call that creates v1 client-span (initiated from v2-instrumented call): https://github.com/tigrannajaryan/otelapi-compat/blob/849e405dcddf73ca2f1cfdc429d03dde504dd9cc/myapp/main.go#L35
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this work with go modules? The v2 API needs to include a v1 package whose implementation is different from the real (old) v1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(OK, now I can reply to the comment, strange, so copying my reply here).
There are probably different ways to achieve this in Go. One approach is to have an additional indirection level from API to SDK. This follows from the the requirement to be able to swap out SDK implementations laid out here: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/library-guidelines.md
If you have an indirection layer in API v1 which allows registering implementation at runtime, then the code in SDK v2 will register its own implementation of API v1 at startup.
I am not sure if there is a way to perform this swapping in Go at compile time, at least I cannot see it yet.
For Java this is easier to do since you can just provide a different jar which implements the exact same API in a different, so there is no need to have runtime swapping, it can happen at compile time.
Just to reiterate, this is not a requirement that comes from this PR, it is something that has been agreed to be part of the spec in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can modify my example source code to illustrate what I mean if the description is not clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro I modified my toy example at https://github.com/tigrannajaryan/otelapi-compat/ to demonstrate how API and SDK can be decoupled and how v2 SDK can implement both v1 and v2 APIs and make it all work on mixed-code app without touching third-party library code.
Any application or third party library that uses OpenTelemetry API of a particular version can safely be packaged with OpenTelemetry SDK implementation provided that these 2 conditions are met: | ||
|
||
1. The full version number of the SDK is not lower than the full version number of API used by application or third party library. | ||
2. The major version number of the SDK is the same as major version number of API or major version number of the SDK is higher by 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on leaving this as a TODO for the time being so we can have a more full-fledged discussion without blocking the PR?
Everyone, I would like to have a discussion about API compatibility in the next week's specification SIG meeting. I believe it is important to define this early, otherwise compatibility issues will bite us back in the future. Please join the meeting on Tue, June 18, 8am PT. Agenda and Zoom link: https://docs.google.com/document/d/1-bCYkN-DWJq4jw1ybaDZYYmx-WAe6HnwfWbkm8d57v8/edit#heading=h.v6lhw87olr5l |
1. The full version number of the SDK is not lower than the full version number of API used by application or third party library. | ||
2. The major version number of the SDK is the same as major version number of API or major version number of the SDK is higher by 1. | ||
|
||
For example, an application that uses API 1.5.10 can be packaged with SDK 1.5.12 or SDK 2.0.0, but it cannot be packaged with SDK 1.0.0 (too old) or with SDK 3.0.0 (too new). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would 3.0 be too new for 1.5 but 2.0 would be fine? They could both break things in equally bad ways right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more old versions you decide to support the more maintenance work it requires. I believe supporting one previous major version is a good balance, it gives people who depend on OpenTelemtry enough time to upgrade, yet allows us to have manageable amount of old / deprecated functionality that we need to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside the approach we end up taking, I do agree on @tigrannajaryan approach: one previous major version to support.
Reviewers, I think I addressed all comments. Please have a look again. If there are no other questions I'd like to merge this and address any outstanding issues separately (I filed a couple as a result of discussions we had on this PR). |
|
||
New functionality in minor version increases may be added in the form of new classes, methods or functions in a way that does not affect existing functionality. | ||
|
||
## Compatibility of SDK Implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be separating API and SDK specs.
|
||
The above is specified using Java's convention of reverse domain name notation. Other languages will use their equivalent convention for namespaces, e.g. for Go the package name for version 1.x.x of the API will be go.opencensus.io, for version 2.x.x of the API it will be go.opencensus.io/v2. | ||
|
||
## Interoperability of Mixed-version Code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this work with go modules? The v2 API needs to include a v1 package whose implementation is different from the real (old) v1.
@yurishkuro For some reason I cannot reply to this comment directly, so replying here. There are probably different ways to achieve this in Go. One approach is to have an additional indirection level from API to SDK. This follows from the the requirement to be able to swap out SDK implementations laid out here: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/library-guidelines.md If you have an indirection layer in API v1 which allows registering implementation at runtime, then the code in SDK v2 will register its own implementation of API v1 at startup. I am not sure if there is a way to perform this swapping in Go at compile time, at least I cannot see it yet. For Java this is easier to do since you can just provide a different jar which implements the exact same API in a different, so there is no need to have runtime swapping, it can happen at compile time. Just to reiterate, this is not a requirement that comes from this PR, it is something that has been agreed to be part of the spec in the past. |
@tigrannajaryan I appreciate the work you are doing here, really, I don't want to detract from the fact you are writing documentation on the current design. Thank you. That said maybe you can shed some light on why we are defining this at a software integration level. This is going to require massive engineering efforts and is already influencing "plugin" based design in the client libraries like open-telemetry/opentelemetry-go#19. This specification you are writing is essentially going to devolve into an aggregation of I am trying to follow these issues and conversations but I just can't seem to find what the objections to defining the "Tracer Transport Protocol" (the name is for illustration) whatever that may be. If we can take a small bit of time to do this, it lowers the work across the board and sets a solid foundation to build on. Most importantly it doesn't stop OpenTelemetry at that point to writing monolithic SDK's which integrate many vendors and platforms exactly as they are today. But it does provide a standard specification for users that disagree with such a design from having to integrate with them. What are you thoughts on:
If wants to support the SDK's they can, just as they do under the current design. However they now have the option to add a flag on their existing server endpoints to accept those trace formats Small note: I would be more than happy to contribute to helping define the specification if I knew it wouldn't be dismissed due to we can always do that later. |
Reviewers, please comment. I'd like to move forward. If this is proposal is acceptable in principle let's merge it and we can improve it in additional PRs. If it is not acceptable let's reject it, I am OK with that too. I strongly believe that we need to have a well-defined compatibility story (not necessarily exactly the story suggested in this PR). The story needs to be publicly available. People who evaluate OpenTelemetry for use in their applications or libraries will look into our compatibility guarantees as part of their evaluation. Not having clearly articulated compatibility specification is not helping us. |
@yurishkuro as discussed I have put together the Java version of the API/SDK version compatibility toy demo: https://github.com/tigrannajaryan/otelapi-java-compat It is less elaborate than the Go version (e.g. Context propagation is not implemented) but I believe the idea is clear anyway. Please let me know if you have any other concerns. |
84ccbd5
to
37f2fdf
Compare
@tigrannajaryan can you move this forward? Or we can abandon this PR for now |
@SergeyKanzhelev let me think a bit about how this will work for our recently agreed version labeling and I will post an update to this PR or ask for postponing it. |
@tigrannajaryan I'll close this PR for now than. It shouldn't be a problem to re-open it. |
ok, thanks @SergeyKanzhelev |
…n-telemetry#115) * Update CODEOWNERS to add docs team as reviewers for docs changes As discussed in Slack, this PR aims at standardizing docs team as reviewers for all changes concerning documentation in our repos. * Update specification/templates/.github/CODEOWNERS Co-authored-by: Robert Pająk <pellared@hotmail.com> Co-authored-by: Robert Pająk <pellared@hotmail.com>
We need to define the API compatibility story, particularly:
Go toy example: https://github.com/tigrannajaryan/otelapi-compat
Java toy example: https://github.com/tigrannajaryan/otelapi-java-compat
Implements #109