This repository has been archived by the owner on Dec 6, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Stabilize messaging semantic conventions for tracing #192
Stabilize messaging semantic conventions for tracing #192
Changes from 1 commit
03d8291
13c975d
8f4349f
98cf8f3
2d8f25c
f48379f
5e67b1f
7b2dbe9
f23c5ac
51501ce
20f272c
1e14ff0
b87adf6
20887d2
2ae923d
0e8f824
161392e
f40ff57
523c435
4d341f6
4615994
71e5ab8
b316b1a
bcd328e
0e4a777
d91ac98
40c1d13
e2738c6
9055167
8e663f0
7c1d84c
38dc813
92a1dd5
8dc5b9f
72e1215
e4ccef0
5fa715d
8b4461a
964d65e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding the destination name will always be required, even if the broker doesn't support multi-tenancy (as
messaging.destination.name
will be the equivalent of what is currently put intomessaging.destination
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done some research on
messaging.destination.name
:Also, Pulsar that have tenant/namespace/topic structure, does not really care about it on the client API level. E.g,
public/dlt-example/dlt-example-topic
is passed as a string to the producer. I believe the whole thing should go to themessaging.destination.name
With this in mind,
messaging.destination.name
should be a conditionally required attribute (when available and when all messages in a batch are published to the same topic).And the contract is that it uniquely identifies topic/queue/subject/entity within either
net.peer.name:net.peer.port
(not available for GCP, AWS or vanilla JMS) orcloud.account.id
+cloud.region
for AWS - check out AWS lambda otel samplescloud.account.id
for GCPIt's probably hard to build UI with so few guarantees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
link to notes discussed over the call: https://gist.github.com/lmolkova/1bdcb0cd56ef876f278c5d9ba8fa7b08
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For cloud messaging systems, I think we should go for the cloud-native resource ID, like the full ARN in AWS (see faas.id spec in current specification, it has some general wording that could be extracted / reused here). For AWS SQS you will, however, often only know the Queue URL instead of the full ARN (messaging.url). So it could be a one-of requirement, or we allow both as messaging.destination.
I think you end up needing messaging.system anyways to fully interpret that name.
Alternatively, we could separate destination.name and destination.id, where the first is something suitable for display and the second is something suitable for identifying the destination e.g. to show which operations involve the same queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to see if we can take a step back with this. It seems we have the following clear facts:
Given those facts, couldn't we leave this still as
Required
? We then give some hints on what instrumentation should usually put in here, but ultimately it's on them to know exactly how this should be composed.I feel we are running into circles with this attribute. Maybe we should make it more flexible and allow each individual messaging system SDK/Instrumentation do what's best for them. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely needs a per-system definition, much like faas.id, db.name, etc.. But I don't think each instrumentation should decide on their own, at least not when it is likely that there will be different instrumentations for the same messaging system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked which systems/client libs support producer-side templates and it seems the only one that supports it in some shape is RabbitMQ through routing keys. I suggest we don't try to come us with a common attribute here and let Rabbit have an attribute for routing_key.
Systems I checked that don't have producer-side templates: JMS, Kafka, ServiceBus, EventHubs, NATS, Pulsar, RocketMQ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should discuss this more. In the existing conventions, this is required to be set if the destination kind is either "queue" or "topic".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the OTEP text should document why the decision was made to split the "destination" attributes were split into "destination" and "source" (and maybe reconsider).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the terminology used by different systems on this and found that it's either
wildcard
(NATS, MQTT Clients) orpattern
(kafka, pulsar). I suggest we stick to existing terminology.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for
wildcard
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll start off by saying I'm not thrilled with the term
template
myself. But I don't think thatwildcard
is a good choice either, since it is sort of the opposite of I think we should try to achieve with template.pattern
sounds a bit better, but it is tightly coupled to the wildcard concept, which I hope to show below is not a good match for this attribute. Perhapsschema
is a better term?What I think we should be trying to achieve with
template
is to document high cardinality topic levels or fragments. Wildcards express that the consumer wants to receive all values of a topic level or fragment, but it doesn't explain what the topic level is. Why is this useful? I can think of two reasons:template
orschema
) or to better understand the semantic meaning of the topic in a trace.template
orschema
is available, it would be low cardinality and perhaps we would decide it is useful in a span name. We still have to discuss span names, so I think further discussion of this point should be saved for when we get to the point of discussing span names as a group.To illustrate these points, suppose you have a topic hierarchy that looks something like this:
myCompany/vehicles/{vin}/position/{gps-latitude}/{gps-longitude}
A message might be published with a topic of:
myCompany/vehicles/4Y1SL65848Z411439/position/45.4236/75.7009
To someone not familiar with the topic schema being used, this might be hard to decipher. Furthermore, such a field name is not suitable for use within a span name due to its high cardinality. Including the template or schema into the trace helps with both of these problems.
Where the concept of a wildcard fits into the example above is that an MQTT client for example may subscribe to messages like this with a subscription of
myCompany/vehicles/#
ormyCompany/vehicles/+/position/+/+
. In the context of the consumer, the template concept is still useful. The subscriptionmyCompany/vehicles/#
doesn't say much to someone looking at the trace. However, including the template or schema for the topic allows someone to see the information that the consumer can derive from the topic of the received messages.For many messaging systems, this concept is not embedded within the messaging system itself, it is only an application layer concern. This means it would be impossible for many auto-instrumented SDK's to include this. However, might it be possible for an application to co-operate with auto-instrumentation and add this attribute to a span started by the SDK?
Regardless, as messaging is becoming more widespread in large organizations, there is an evolving need to maintain a registry of events so that they can be tracked and governance policies can be applied to them. Newer initiatives like AsyncAPI I believe facilitate this with the concept of channel parameters. I believe auto-instrumentation added to an AsyncAPI implementation may be able to inject this information automatically.
To sum up, I think this concept is very useful to support. However, coming up with the right name is tricky. Let's discuss further during our next meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree it's both a 1) important concept for messaging systems that support dynamic topics and 2) a difficult concept name properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we know of four different messaging systems supporting this feature (thanks @lmolkova for checking), and there are four different terms for this feature: pattern, wildcard, template, and schema.
Another possible solution would be to make this a messaging system specific attribute. I think this would be justified, as the majority of system do not support it. Kafka would have an attribute
messaging.destination.pattern
, while NATS hasmessaging.destination.wildcard
. For a Kafka user who is using a pattern the meaning will be immediately clear, whereas they might be confused by the term "wildcard", "template", or "schema".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware of any messaging system other than AsyncAPI (which isn't so much a messaging system as it is an API on top of messaging systems) that formally supports the concept of describing high cardinality topic levels, which is what my intent was when I suggested adding this attribute. I don't see how a messaging system's term for a wildcard concept is a good term to use in naming this attribute.
The connection I can draw between this parameter and a messaging system's support for wildcarded subscriptions is that support for wildcarded subscriptions makes it more useful for a publisher to inject high cardinality data into the topic, knowing subscribers have the opportunity to wildcard high cardinality portions of the topic. But I don't view this as wildcards, but rather a description of portions of a topic that subscribers might be likely to wildcard (keeping in mind it's perfectly valid to apply wildcards to low-cardinality topic levels as well, and also perfectly valid to not wildcard high cardinality topic levels). So I don't view this parameter as the same concept as wildcards, it's only loosely related to wildcards.
Auto-instrumentation for messaging systems that support high cardinality topics would not be able to include topics in span names. This will make their span names much less useful than messaging systems that don't support high cardinality topics. If an application co-operates and includes this information (or if an API like AsyncAPI is used that supports the concept natively), then these messaging systems can take advantage of using this low cardinality attribute in the span name.
As mentioned earlier, the only existing formalization of this concept I'm aware of is AyncAPI, which refers to named fragments of the channel name as "channel parameters". I think using this term would be a poor choice here. However, AsyncAPI's channel is analogous to a topic, so if we really want to use an existing term, I could see us using a term like "parameterized_topic_description", but that is quite awkward.
Perhaps we can discuss more in our next meeting. There may be a disconnect on what we are trying to achieve with this parameter, or I may not be understanding how the wildcard concept is so tightly coupled to this attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point and a strong argument for making this a generic attribute, instead of leaving it to messaging systems to add their own specific attributes (as I suggested above).
With this generic attribute and concept in place and well-defined, we could change the requirements for the span name as follows:
Thus, we could avoid the vague language in the present semantic conventions: