Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AttributeSet in the OpenTelemetry API #1508

Closed
cijothomas opened this issue Feb 2, 2024 · 4 comments · Fixed by #1512
Closed

AttributeSet in the OpenTelemetry API #1508

cijothomas opened this issue Feb 2, 2024 · 4 comments · Fixed by #1512

Comments

@cijothomas
Copy link
Member

#1421 allowed pre-creation of AttributeSet for scenarios requiring high performance. This also moved AttributeSet to the opentelemetry api. This has an undesired/unintended side effect: The entire cost of creating AttributeSetis now paid in the API itself (like sort/hash/de-duplication), making it impossible to do optimizations in the SDK, as the perf hit is already paid in the API itself.

This violates the core OTel requirement that the overhead of using the OTel API should be zero/close-to-zero, without an SDK configured. Because right now, even without an SDK configured, the OTel API does the AttributeSet creation which is a far from the zero/close-to-zero overhead. i.e the NoOp implementation is broken.

This is similar to #1189, but this is much more serious due to the perf hit.

Need to find a way to achieve the intended perf gains, but yet move off entire AttributeSet creation to the actual SDK. Will discuss some ideas in the next community meeting.

cc : @lalitb @KallDrexx

@KallDrexx
Copy link
Contributor

This feels like an arbitrary requirement to me. How many real users are

  1. Using opentelemetry-rust with the No-op implementation
  2. Using it in a performance critical system where the impact is noticeable.

Is this just a theoretical concern or has this really been a legitimate issue in the past?

We could make the AttributeSet lazily create the InnerAttributeSet only when actually called upon. This would require either a Mutex or RwLock to ensure we can handle concurrent access to the attribute set, which would incur extra costs. It will also require some complex lifetime management, because we can no longer consume the Into<AttributeSet> iterator immediately, and thus a &[KeyValue] needs to live long enough until the instantiation.

We can remove precached attributes and only support bounded instruments. This means you still pay the cost of creation when a single attributeset can be used for multiple instruments. This isn't theoretical, for example every packet we receive uses the same attribute set for the "bytes received" counter and the "packet count" counter. This means you can't do an LRU cache of attribute sets, so if your web server handles a request from a tenant it can cache tenant specific attribute sets assuming the load balancer will probably send another request for that tenant to the server again.

We could make AttributeSet a trait with the current implementation in the SDK. This would require annotating everything with a second generic trait, which may be complex with a lot of the dyn Any and other dynamic calls there are, at least without extra runtime costs. It also has complexity in passing counters around, because the generic parameter can't be inferred in many circumstances due to it being used in a different area of code than its creation or the measure call.

@KallDrexx
Copy link
Contributor

After talking with Cijo a bit, we came to the conclusion to revert the precached attribute set work, pause the bound instruments API, and add the concept of instrument level attributes at instrument (e.g. counter) creation time. I'll investigate what that will look like.

@cijothomas
Copy link
Member Author

Adding some more context:

  1. To the question if this is a real requirement or not -- Yes this is a core principle of OTel. NoOp APIs allow library authors to bet on OTel API without worrying that their library users will pay perf cost even if they are not ultimately using OTel. The biggest ask from library authors trying to natively instrument their libraries are about "how much cost will be paid, if my end users does not enable telemetry".
  2. Also, ability to swap out official SDK with a custom one, is a requirement - that, though rare, allows folks to write custom sdks meeting perf needs not normally possible with official ones. But this can only be feasible if API delegates every work to the SDK. (if we use tracing as an example - the APIs are near zero cost in the absence of a listener, and listeners can swapped out based on perf needs. This principle is same in Otel, though terminology used is different.)
  3. Making AttributeSet a trait in API and allowing SDK to provide implementation is a feasible route - need to benchmark and see if the dynamic dispatch cost is acceptable. Until we do this, it is better to revert the AttributeSet changes.

The idea of adding attributes at Instrument creation time, and then providing nothing more at the measurement reporting time, can allow the SDK to offer a very fast Metrics SDK, for those scenarios where the attributes are known ahead of time (as in the case @KallDrexx is primarily after.) This is similar to bound-instrument concept, but there is no need of strictly providing a bound-instrument API, as it is possible to achieve this with instrument level attributes.

counter = meter.CreateCounterWithAttributes(name,descp, attributes);
counter.Add(1); // note - nothing is being passed as attributes, though it is feasible to do so.

In the SDK, counter.Add(1) (measurements with no attributes) can be implemented very efficiently as there is no need to spend time on attributes (which we know are a huge overhead), and there is no need of hashmap lookup (as we can special case the 0-attribute case).

This does not prohibit future re-addition of AttributeSet or addition of a more formal Bound instrument idea - just starting with the above, so as to unblock some high-perf scenarios. (A significant amount of efforts already went into AttributeSet/Bound - we hope they are not totally wasted. we should be able to leverage them in the future)

@cijothomas
Copy link
Member Author

SIG call:

  1. Revert the PR before conflicts start appear. -- Matthew will send PR
  2. Meter (instrumentation scope attributes) or instrumentation attributes needs to be explored to see if it can indeed achieve the high perf.
    * Might need special casing 0 attributes case. - Matt/Cijo will explore this, and send PRs.
    * View might complicate things, but maybe not.. need to explore. - Cijo will explore this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants