-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Surface new labels for uninstrumented services and systems #3543
Conversation
modules/generator/processor/servicegraphs/servicegraphs_test.go
Outdated
Show resolved
Hide resolved
@@ -190,6 +190,17 @@ func (p *Processor) consume(resourceSpans []*v1_trace.ResourceSpans) (err error) | |||
e.ServerService = dbName | |||
e.ServerLatencySec = spanDurationSec(span) | |||
} | |||
|
|||
if p.Cfg.EnableExtraUninstrumentedServicesLabels { | |||
switch e.ConnectionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is mutually exclusive, but is it possible that a service can be both database and messaging system? eg Redis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is - because at this point it has been decided what the service is acting as, depending on the presence/absence of the db.name
resource attribute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few initial questions. other thoughts:
- what occurs if there are conflicts between these new values and configured span dimensions?
- what do you think about pooling edges in the store?
Thanks for reviewing Joe. I've added some testing to cover "conflicts" with manually added Could you be more specific about the edge pooling? Happy to address that if you think there's an issue there. (Sorry, I'm just new to Tempo in general) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should make this a per-tenant override instead. If this feature is behind a global config parameter, it makes me think that we should just add it to the default generator without a flag. What's the use case for enabling or disabling it globally?
It actually makes more sense as a per-tenant override instead, yes. I've also been discussing with Joe and I will:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc changes look good. Thank you for adding docs!
These are some benchmarks about
There's probably better numbers to be had at scale, since these are limited to the testdata size. |
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
@@ -86,8 +86,20 @@ type Processor struct { | |||
|
|||
func New(cfg Config, tenant string, registry registry.Registry, logger log.Logger) gen.Processor { | |||
labels := []string{"client", "server", "connection_type"} | |||
|
|||
if cfg.EnableVirtualNodeLabel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't we add this to the list of labels above? this will prevent us from having to add lines 96-102 which adds special handling for this value.
then we could add a new param to onComplete
where we pass the string from onExpire
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely doable - however what you save in these lines to avoid the prefixes, is added back to the onComplete logic for slice allocation and label value assignment.
also, it'll make this new label behave as half-dimension half-edge-property. do you think that'd be ok?
can you also explain further what's the modification of the onComplete signature for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, i see. i was trying to remove some of the state wrangling in the consume()
method but you've already completely removed it.
agree that the tradeoffs mentioned:
what you save in this lines to avoid the prefixes, is added back to the onComplete logic for slice allocation and label value assignment.
are not worth it. let's keep it as implemented now.
also, it'll make this new label behave as half-dimension half-edge-property. do you think that'd be ok?
i'm not seeing an issue with that. do you have some concerns in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok agree, leaving it behaving as before, as a dimension.
I think it's better to "treat it" as if it's fully one thing or the other. What I mean by that, from a future reader standpoint, it'll be easier to follow logic if it's treated as a dimension or as a struct property (first-class label?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs update, feature addition, a perf improvement. Love it. Thank you!
@@ -52,6 +52,8 @@ const ( | |||
metricRequestClientSeconds = "traces_service_graph_request_client_seconds" | |||
) | |||
|
|||
const virtualNodeLabel = "__virtual_node" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the __
prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked for this. Should we drop it? Just trying to avoid conflicts with actual span attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be OK with and without the __
prefix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In spanmetrics we do the opposite, if there's a collision with an instrinsic dimension we add __
(code). I don't feel too strongly about this. I prefer it that way, butI'm OK with what you decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - then to not raise any questions to future readers, and because it seems a collision is unlikely, I'll leave it as without the __
prefix.
It'd be good in the future to identify and sanitize in the same way as in spanmetrics, happy to work on that small improvement in a future PR if you need me.
Co-authored-by: Mario <mariorvinas@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with the PR, except that one comment. Nice work!
The edge is not expired here, so it shouldn't be returned to the pool. Co-authored-by: Mario <mariorvinas@gmail.com>
What this PR does:
For the servicegraphs processor:
traces_service_graph_request_*
:virtual_node
with a value ofclient
orserver
, so the un-instrumented service's node can be identified.client|server_db|messaging_system
with the corresponding resource attribute value, to surface and identify the service and the node.Which issue(s) this PR fixes:
Fixes #3461
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]