feat: add standardized Propagation Evaluation to Flag Metadata. #313
In many systems, flag changes are propagated asynchronously to the consumers (not during evaluations). This propagation can be instrumented with its own spans and traces to show how flag changes are distributed throughout the system. In the end, though, as an end user I want to know which propagation event is actually linked to my evaluation. This can be determined to some degree via the version, but spans and traces offer deeper insight. If we standardize this behaviour, we can provide out-of-the-box OpenTelemetry configuration-changed listeners that make this feature usable with all our providers.
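Java event listener example:

```java
import dev.openfeature.sdk.EventDetails;
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanBuilder;
import io.opentelemetry.api.trace.SpanContext;
import io.opentelemetry.api.trace.TraceFlags;
import io.opentelemetry.api.trace.TraceState;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Context;

// Belongs inside a class; LOG is assumed to be an SLF4J-style logger defined there.
private static void onChange(EventDetails eventDetails) {
    LOG.info("Provider configuration changed: {}", eventDetails.getEventMetadata());
    if (eventDetails.getEventMetadata() == null) {
        return;
    }
    String propagationTraceId = eventDetails.getEventMetadata().getString("propagationTraceId");
    String propagationSpanId = eventDetails.getEventMetadata().getString("propagationSpanId");
    if (propagationTraceId == null || propagationSpanId == null) {
        return; // the provider did not attach propagation metadata
    }
    // Rebuild the remote span context of the propagation event from the two IDs.
    SpanContext parentContext =
            SpanContext.createFromRemoteParent(
                    propagationTraceId,
                    propagationSpanId,
                    TraceFlags.getSampled(),
                    TraceState.getDefault());
    Tracer tracer = GlobalOpenTelemetry.getTracer("demo");
    SpanBuilder spanBuilder = tracer.spanBuilder("flag updates");
    // Attach the new span as a child of the propagation span.
    spanBuilder.setParent(Context.current().with(Span.wrap(parentContext)));
    Span span = spanBuilder.startSpan();
    span.addEvent("someEvent");
    span.end();
}
```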
I like the idea @aepfli!
From reading this I am not 100% sure how exactly it would work, so I left some questions.
The example answers some of them, but from the spec I do not fully understand how it is meant to work.
### Propagation Metadata

Feature Flags are propagated through different systems with different methods. Often this updates have an asynchronous nature to the evaluation and do not correlate directly (eg. cached values or in-process evaluations). For distributed systems it is important to reflect how changes are populate to all systems, and how those correlate with evaluations. In a simple manner the version could be used to achieve this, but offers additional and more complex solution to correlate the data. Instead we are defining two additional metadata properties `propagationTraceId` and `propagationSpanId` which can be used to link evaluation spans to propagation spans.
- Feature Flags are propagated through different systems with different methods. Often this updates have an asynchronous nature to the evaluation and do not correlate directly (eg. cached values or in-process evaluations). For distributed systems it is important to reflect how changes are populate to all systems, and how those correlate with evaluations. In a simple manner the version could be used to achieve this, but offers additional and more complex solution to correlate the data. Instead we are defining two additional metadata properties `propagationTraceId` and `propagationSpanId` which can be used to link evaluation spans to propagation spans.
+ Feature Flags are propagated through different systems with different methods. Often these updates have an asynchronous nature to the evaluation and do not correlate directly to it (eg. cached values or in-process evaluations). For distributed systems it is important to reflect how changes in flag configurations are propagated to all systems, and how those correlate with evaluations. In a simple manner the version could be used to achieve this, but offers additional and more complex solution to correlate the data. Instead we are defining two additional metadata properties `propagationTraceId` and `propagationSpanId` which can be used to link evaluation spans to propagation spans.
> Instead we

I would not use first person here.
> `propagationTraceId` and `propagationSpanId`

I have some things that are not fully clear to me from reading this:
- How are these defined?
- Do we typically fill these with the OTEL values? Which span do we use then? Trace would probably be the root one?
- How are we setting them?
The `propagationTraceId` and `propagationSpanId` are flexible. For example, in flagd we would create a trace and span for each gRPC event and add this information to the payload. gRPC streams don't offer a way to propagate headers per event, so the information has to be part of the payload. Other connection methods might propagate it automatically where possible. By persisting the IDs in the metadata, we can link the propagating span to the evaluating span (which might belong to two entirely different occasions/traces).
I am not the best at writing specs and this is my first attempt. I am happy to explain my thoughts, and maybe that will help clear up the confusion.
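To make the "add this information to the payload" idea concrete, here is a minimal sketch in plain Java. `FlagSyncMessage` and `withPropagationContext` are hypothetical names used only for illustration; they are not part of flagd or the OpenFeature SDK, and the IDs are passed in as hex strings rather than taken from a real tracing library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stream message: a flag configuration plus string metadata. */
final class FlagSyncMessage {
    final String flagConfiguration;
    final Map<String, String> metadata;

    FlagSyncMessage(String flagConfiguration, Map<String, String> metadata) {
        this.flagConfiguration = flagConfiguration;
        this.metadata = metadata;
    }

    /**
     * Builds a message whose metadata carries the IDs of the span that
     * represents this propagation event. Because stream headers are only
     * sent once, the IDs travel inside the payload itself.
     */
    static FlagSyncMessage withPropagationContext(
            String flagConfiguration, String traceId, String spanId) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("propagationTraceId", traceId);
        metadata.put("propagationSpanId", spanId);
        return new FlagSyncMessage(flagConfiguration, metadata);
    }
}
```

A sender would call `FlagSyncMessage.withPropagationContext(config, traceId, spanId)` once per change event, so every message can point back to its own propagation span.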
> But with the persistence in the metadata, we can link the propagating span to the evaluating span (which might be two totally different occasions/traces).

I get that concept.
What I am not sure about is how exactly we define all these things.
E.g. for the propagating span, I can imagine multiple definitions, one of them being "the span of the incoming HTTP request".
This span ID, for example, will differ between all services, while the traceparent ID in OTel might be the same across all of them. This part is not shown in your example.
Maybe it is good enough to add one or two examples of good values for the IDs.
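As one possible answer to "what does a good value look like", the sketch below assumes the IDs use the W3C Trace Context format, which is what the `SpanContext.createFromRemoteParent` call in the Java example expects: a trace ID of 32 lowercase hex characters and a span ID of 16, neither consisting only of zeros. The proposed spec text does not state this, so treat it as an assumption. `PropagationIds` is a hypothetical helper name.

```java
/** Hypothetical helper that checks IDs against the W3C Trace Context format. */
final class PropagationIds {
    private PropagationIds() {}

    /** A trace ID is 32 lowercase hex characters and not all zeros. */
    static boolean isValidTraceId(String id) {
        return isLowerHex(id, 32);
    }

    /** A span ID is 16 lowercase hex characters and not all zeros. */
    static boolean isValidSpanId(String id) {
        return isLowerHex(id, 16);
    }

    private static boolean isLowerHex(String id, int length) {
        if (id == null || id.length() != length) {
            return false;
        }
        boolean allZero = true;
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            boolean hex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
            if (!hex) {
                return false;
            }
            if (c != '0') {
                allZero = false;
            }
        }
        return !allZero;
    }
}
```

Under that assumption, `4bf92f3577b34da6a3ce929d0e0e4736` would be an acceptable `propagationTraceId` and `00f067aa0ba902b7` an acceptable `propagationSpanId`; both are the sample values used in the W3C Trace Context specification.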
> In a simple manner the version could be used to achieve this, but offers additional and more complex solution to correlate the data.

Do you mean the version that we defined in the OTel semconv?
What do you mean by "offers additional and more complex solution"?
In Appendix D, we specify that flag metadata can contain a version: https://openfeature.dev/specification/appendix-d#flag-metadata
Theoretically, we could also use the version field to link the propagating span/trace to the evaluation span/trace.
> In appendix D, we specify that flag metadata can contain a version

Yeah, I would love it to be a bit clearer that this is the version being used.

> theoretically, we could use the version field too, to link the propagating span/trace, to the evaluation span/trace.

Okay, but what do you mean by "but offers additional and more complex solution to correlate the data"?
We could use this version to correlate, but this proposal adds additional attributes that contain the trace ID and span ID of the propagating cause.
For example, in flagd in-process, evaluation involves two unconnected steps:
- Flag configurations are distributed asynchronously. When flagd detects a change in the flag source, it sends out a new configuration, which will have a version attribute in the metadata.
- When I do evaluations, I can create a new span. If I want to correlate this with the span of the propagation, I need to check the version attribute, and I don't have a direct link. OTel offers span links to represent such a correlation. With the trace and span information, we can create this link and build a holistic picture from the propagation span to the evaluation span, including how they are linked.
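The evaluation side of the two steps above could look roughly like the following sketch. It uses a plain `Map<String, String>` as a stand-in for flag metadata, and `PropagationLink` is a hypothetical type introduced only for illustration. When the two IDs are present, the evaluation has a direct reference to the propagation span; when only `version` is present, it does not.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical reference from an evaluation to the span that propagated the flag. */
final class PropagationLink {
    final String traceId;
    final String spanId;

    PropagationLink(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }

    /**
     * Reads the two proposed metadata properties. Returns an empty Optional
     * when either is missing, in which case only the version remains for
     * indirect correlation.
     */
    static Optional<PropagationLink> fromFlagMetadata(Map<String, String> flagMetadata) {
        String traceId = flagMetadata.get("propagationTraceId");
        String spanId = flagMetadata.get("propagationSpanId");
        if (traceId == null || spanId == null) {
            return Optional.empty();
        }
        return Optional.of(new PropagationLink(traceId, spanId));
    }
}
```

With the OpenTelemetry Java API, the two IDs could then be turned into a `SpanContext` via `SpanContext.createFromRemoteParent` and attached to the evaluation span with `SpanBuilder.addLink`, which records the correlation without making the evaluation a child of the propagation trace.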
It looks like this PR is attempting to recreate span links. Any reason you chose to go this route rather than using that mechanism? https://opentelemetry.io/docs/concepts/signals/traces/#span-links

Yes, I want to use span links, but I somehow need to propagate this information through the system. A gRPC stream does have headers, but they are only set upon initialization. So if I want to track a gRPC message as its own trace, or ideally trace from the start of the propagation to its end, I need to pass the span/trace information along with the message. I have not found another solution for that, but I am also not that familiar with OpenTelemetry, and my research did not provide more insight.
This PR
Related Issues
Relates: open-feature/flagd#1595
Notes
Follow-up Tasks
How to test