Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow jaeger supported connections to service graph #29148

Closed
trajano opened this issue Nov 13, 2023 · 6 comments
Closed

Allow jaeger supported connections to service graph #29148

trajano opened this issue Nov 13, 2023 · 6 comments
Labels

Comments

@trajano
Copy link

trajano commented Nov 13, 2023

Component(s)

connector/servicegraph

Is your feature request related to a problem? Please describe.

I used to use Jaeger to generate a service graph for the most part I didn't have to do anything special, but when I am using the servicegraph component it appears to be restricted as denoted here

It currently supports the following requests:
* A direct request between two services where the outgoing and the incoming span must have `span.kind` client and server respectively.
* A request across a messaging system where the outgoing and the incoming span must have `span.kind` producer and consumer respectively.
* A database request; in this case the connector looks for spans containing attributes `span.kind`=client as well as db.name.

Describe the solution you'd like

Just make it generate similar graphs to Jaeger.

I think a good one would be to simply check whether a trace contains a span that goes from one service.name to another. This is one that Jaeger would support
image

Here's another one with Spring Boot connecting to Redis where net.sock.peer.name can be used as a fallback to db.name to specify a remote non-tracked service (note Jaeger doesn't support this one either)

image

Describe alternatives you've considered

Go back to Jaeger

Using something like this (but that didn't work)

  transform/set_span_kind_client:
    trace_statements:
      - context: span
        statements:
          - set(attributes["kind"], "client") where attributes["kind"] == nil

Additional context

I do get data specifically from Grafana services

# HELP traces_service_graph_request_total 
# TYPE traces_service_graph_request_total counter
traces_service_graph_request_total{client="grafana",connection_type="",failed="false",server="prometheus"} 540

So I know it's working.

@trajano trajano added enhancement New feature or request needs triage New item requiring triage labels Nov 13, 2023
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@trajano
Copy link
Author

trajano commented Nov 13, 2023

I think an easy way is to remove the requirement that the source is a client so you'd allow a connection when it is server to server. I am presuming the change would be around here

func (p *serviceGraphProcessor) aggregateMetrics(ctx context.Context, td ptrace.Traces) (err error) {

Maybe something like

switch span.Kind() {
  case ptrace.SpanKindProducer:
    ...
    fallthrough
  case ptrace.SpanKindConsumer: // this is shifted up
    ...
    fallthrough
  case ptrace.SpanKindServer: // this is shifted up
    ...
    fallthrough // fall through since server can be client as well.
  case ptrace.SpanKindClient:
    ...

@trajano
Copy link
Author

trajano commented Nov 13, 2023

For the connection using net.sock.peer.name it can be done using

  transform/set_db_name
    trace_statements:
      - context: span
        statements:
          - set(attributes["db.name"], attributes["net.sock.peer.name"]) where attributes["db.name"] == nil and attributes["db.system"] != nil

@mapno
Copy link
Contributor

mapno commented Nov 21, 2023

Hi! Not using kind to identify requests is something we've considered in the past, but haven't implemented. We've preferred adhering to OTel semantic conventions, which define that spans must be tagged with kind when they're part of a request (ref).

I'm not opposed to a different approach, but there are tradeoffs that need to be considered first: not using kind is way more resource intensive than doing so, as any span can be a client or a server—which results in a lot more data buffered and more calculations. If you have a proposal in mind, I'm happy to review it.

Finally, theres is a feature called virtual nodes that could help you with some of the cases you mention. The docs linked are from Grafana Tempo, but it's the same implementation.

@crobert-1
Copy link
Member

@trajano Have you had a chance to check out virtual nodes as @mapno referenced? Do you have other thoughts on @mapno's response here?

@trajano
Copy link
Author

trajano commented Dec 10, 2023

Looks fair but I stopped using Jaeger for tracing now and switched to Tempo so I am closing this. Maybe someone who is still using Jaeger can try it out.

In addition traefik/traefik#10223 may solve the issues as well since this fixes a lot of the OTel stuff.

@trajano trajano closed this as completed Dec 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants