Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Suppression flag to context #2821

Merged
merged 25 commits into from
Mar 26, 2025

Conversation

cijothomas
Copy link
Member

@cijothomas cijothomas commented Mar 18, 2025

This establishes the basic building block required to suppress telemetry using Context. This is a pre-requisite for solving issues like #1171 or #761
This is not yet applied anywhere, that can be in follow ups.
(I did try in Logs Sdk, and there is a 1-2 ns additional overhead to check this. It may not look much, but given Logs SDK can check IsEnabled() currently under 2 ns, this is a measurable increase in terms of percentage!. But this should not be a major concern.)

Previous attempts: #1330

@cijothomas cijothomas requested a review from a team as a code owner March 18, 2025 00:56
Copy link

codecov bot commented Mar 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.9%. Comparing base (f3e93a0) to head (e13c2c6).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #2821     +/-   ##
=======================================
+ Coverage   80.7%   80.9%   +0.1%     
=======================================
  Files        124     124             
  Lines      23703   23833    +130     
=======================================
+ Hits       19150   19281    +131     
+ Misses      4553    4552      -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cijothomas cijothomas mentioned this pull request Mar 18, 2025
@cijothomas cijothomas marked this pull request as draft March 18, 2025 23:07
@cijothomas cijothomas marked this pull request as ready for review March 22, 2025 18:32
Copy link
Contributor

@bantonsson bantonsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the addition is good, but you should run the benchmarks on this change. Locally for me this gives between 63% and 173% slower execution of the context_attach benchmarks.

I have some ideas of how to remedy this.

@cijothomas
Copy link
Member Author

I think that the addition is good, but you should run the benchmarks on this change. Locally for me this gives between 63% and 173% slower execution of the context_attach benchmarks.

I have some ideas of how to remedy this.

Thanks! Yes it is somewhat surprising to me too that the attach benches regress (I do see 50-80% slow).
We can keep improving the perf with pure internal optimizations, give we expose no internals.

Additionally, sharing what I believe must be the highest perf path:

  1. Checking if current is in telemetry_suppressed -- critical, as every hot path (logger.emit, span.start etc.) has to check this.
  2. Check if there is a Span in the current and is that Span is being recorded - critical as this will allow users to quickly short circuit code, if there is no active span and code has no intent to start new one.

Enter suppression_context - is done by background thread/exporters only and is less important to be as performant as others.
Baggage - important, but less important as Baggage is anyway propagated via HTTP headers, so there is non-trivial cost already for Baggage, so adding few nanoseconds should be okay.

@cijothomas
Copy link
Member Author

@utpilla and I discussed another thought: When one is entering suppression mode - is there a value in keeping/cloning existing context? What good is baggage or active span, if we are in suppressed mode anyway?
Won't be exploring this idea for this PR, but will come back to this question after this PR's subsequent steps are also done throughout the repo.

@lalitb
Copy link
Member

lalitb commented Mar 26, 2025

Just an alternative idea specifically for suppression logic in the synchronous case:

log/span -> concurrent_processor -> exporter
or
log/span -> simple_processor -> otlp (using reqwest_blocking)

Can we rely on a thread-local suppression flag to handle the logic, avoiding the need for context cloning?

@cijothomas
Copy link
Member Author

Just an alternative idea specifically for suppression logic in the synchronous case:

log/span -> concurrent_processor -> exporter or log/span -> simple_processor -> otlp (using reqwest_blocking)

Can we rely on a thread-local suppression flag to handle the logic, avoiding the need for context cloning?

As mentioned in #2821 (comment) , I don't think entering_suppression is a hot path operation to heavily perf optimize.

@lalitb
Copy link
Member

lalitb commented Mar 26, 2025

Just an alternative idea specifically for suppression logic in the synchronous case:
log/span -> concurrent_processor -> exporter or log/span -> simple_processor -> otlp (using reqwest_blocking)
Can we rely on a thread-local suppression flag to handle the logic, avoiding the need for context cloning?

As mentioned in #2821 (comment) , I don't think entering_suppression is a hot path operation to heavily perf optimize.

Got it. For,
log/span -> concurrent_processor -> exporter or
log/span -> simple_processor -> otlp (using reqwest_blocking)

Would it still be outside hot-path? Not a blocker, just curious how will it work. Also, once suppression is enabled, would there be situation where we need to switch it off?

@@ -2,6 +2,26 @@

## vNext

Added the ability to prevent recursive telemetry generation through new
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add PR link as well.

@cijothomas
Copy link
Member Author

Just an alternative idea specifically for suppression logic in the synchronous case:
log/span -> concurrent_processor -> exporter or log/span -> simple_processor -> otlp (using reqwest_blocking)
Can we rely on a thread-local suppression flag to handle the logic, avoiding the need for context cloning?

As mentioned in #2821 (comment) , I don't think entering_suppression is a hot path operation to heavily perf optimize.

Got it. For, log/span -> concurrent_processor -> exporter or log/span -> simple_processor -> otlp (using reqwest_blocking)

Would it still be outside hot-path? Not a blocker, just curious how will it work. Also, once suppression is enabled, would there be situation where we need to switch it off?

Yes you are totally right! For ConcurrentProcessor, it'll be in hot-path but I don't plan to use it for them at all. (You may have seen many PRs from me in the contrib repo, where I try to make the exporters used with ConcurrentProcessor safe from re-entrancy, so we don't have to pay the cost!)

The benchmarks in this PR shows there is a 10ns cost to just attach a context, whether or not it has suppression flag in it. That cost is not something we should be paying for the re-entrant ones. That limits observability, but there are alternate ideas to address them.

I hope to uncover all these in the follow up PRs, where we put the suppression flag into actual use!

@cijothomas cijothomas merged commit 2971467 into open-telemetry:main Mar 26, 2025
24 checks passed
@cijothomas cijothomas deleted the cijothomas/context-suppress branch March 26, 2025 23:14
@cijothomas
Copy link
Member Author

Also, once suppression is enabled, would there be situation where we need to switch it off?

I don't see the need myself for the purposes inside this repo. But if a component author wants to, they can turn it off. There is a test that shows this too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants