
[loadbalancingexporter] Not properly batching service traces #13826

Open
crobertson-conga opened this issue Sep 1, 2022 · 6 comments
Labels: bug (Something isn't working), exporter/loadbalancing, never stale (Issues marked with this label will be never staled and automatically removed), priority:needed (Triagers reviewed the issue but need code owner to set priority)

Comments

@crobertson-conga
Contributor

crobertson-conga commented Sep 1, 2022

Describe the bug
The new loadbalancingexporter option for routing traces by service name sends the entire batch of traces to each endpoint instead of splitting the batch so that each endpoint only receives the traces for the services routed to it.

Steps to reproduce
Use the new routing_key: service option to start splitting up the traces by service. Have at least two receiving collectors. In the receiving collectors, use a resource detection processor to augment the trace payload so you can see which collector receives each trace (see the sketch below).
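A minimal sketch of that tagging step on each receiving collector (the detector choice is illustrative; anything that adds a per-collector resource attribute such as host.name works):

      processors:
        resourcedetection:
          detectors: [env, system]   # adds host-identifying resource attributes to incoming spans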

What did you expect to see?
All traces for a given service name should arrive at the same receiving collector.

What did you see instead?
Traces for a given service name arrived at both receiving collectors.

What version did you use?
0.59.0

What config did you use?

      loadbalancing/spanmetrics:
        routing_key: service
        protocol:
          otlp:
            tls:
              insecure: true
        resolver:
          dns:
            hostname: <some_k8s_service_to_target_collectors>
            port: 4317
            interval: 1m
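
For completeness, this is roughly how that exporter fragment is wired into the sending collector's pipeline (the receiver and batch settings here are illustrative, not my exact config):

      receivers:
        otlp:
          protocols:
            grpc:

      processors:
        batch:

      service:
        pipelines:
          traces:
            receivers: [otlp]
            processors: [batch]
            exporters: [loadbalancing/spanmetrics]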

Environment
Doesn't matter


@crobertson-conga crobertson-conga added the bug Something isn't working label Sep 1, 2022
@crobertson-conga crobertson-conga changed the title [loadbalancingexporter] [loadbalancingexporter] Not properly batching service traces Sep 1, 2022
@crobertson-conga
Contributor Author

@aishyandapalli this is an FYI: I think your new feature for #12421 has a bug in it. I think it's stemming from the exporter consuming all traces instead of just the ones associated with the routing key.

@crobertson-conga
Contributor Author

crobertson-conga commented Sep 1, 2022

Actually, I'm not sure that's the problem. I set up batching with a max size of one, and all my span metrics collectors are still getting signals across all services.
[Screenshot attached: Screen Shot 2022-09-01 at 7.10.15 PM]

      batch/one: # super inefficient data-wise, but it looks like the loadbalancing exporter doesn't split properly
        send_batch_size: 1
        send_batch_max_size: 1

I have a resource processor in front of the span metrics processor that annotates the incoming traces, hence the aggregator dimension.

The collector doing span metrics receives the traces forwarded by the loadbalancing exporter.

[Edge collectors] -> [Main central collector] -> [Spanmetrics collector(s)]
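
The spanmetrics collectors look roughly like this (simplified sketch; the attribute value, exporter choices, and spanmetrics settings are illustrative):

      receivers:
        otlp:
          protocols:
            grpc:

      processors:
        resource:
          attributes:
            - key: aggregator                # so I can tell the spanmetrics collectors apart
              value: spanmetrics-collector-a
              action: insert
        spanmetrics:
          metrics_exporter: prometheus       # spanmetrics config abbreviated

      exporters:
        prometheus:
          endpoint: 0.0.0.0:8889
        logging:

      service:
        pipelines:
          traces:
            receivers: [otlp]
            processors: [resource, spanmetrics]
            exporters: [logging]
          metrics:
            receivers: [otlp]
            exporters: [prometheus]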

@crobertson-conga
Contributor Author

Doing some more testing leads me to believe it may be due to forcibly closed gRPC connections making the load balancer move to the next available instance. I will close this if that turns out to be the case.

@crobertson-conga
Contributor Author

Okay, so this was due to my configuration, which was interrupting the gRPC connection regularly. Sorry.

@crobertson-conga
Contributor Author

crobertson-conga commented Sep 2, 2022

Okay, after removing my batching of size one, the issue reappeared. I had two problems: one is resolved by not allowing connections to terminate artificially; the other is that if the traces are in a batch with multiple service names, they get sent to all target collectors by the loadbalancing exporter.

This leads me to believe the original issue, that all the spans are being sent to every endpoint regardless of their actual service, is correct.

@evan-bradley evan-bradley added the priority:needed Triagers reviewed the issue but need code owner to set priority label Sep 9, 2022
@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.
