Clarify named tracers and meters #76

Merged
Changes from 9 commits
71 changes: 52 additions & 19 deletions text/0016-named-tracers.md
@@ -2,7 +2,7 @@

**Status:** `approved`

_Creating Tracers and Meters using a factory mechanism and naming those Tracers / Meters in accordance with the library that provides the instrumentation for those components._
_Associate Tracers and Meters with the name and version of the library which reports telemetry data by parameterizing the API which the library uses to acquire the Tracer or Meter._

## Suggested reading

@@ -12,43 +12,53 @@ _Creating Tracers and Meters using a factory mechanism and naming those Tracers

## Motivation

The mechanism of "Named Tracers and Meters" proposed here is motivated by following scenarios:
The mechanism of "Named Tracers and Meters" proposed here is motivated by the following scenarios:

* For a consumer of OpenTelemetry instrumentation libraries, there is currently no possibility of influencing the amount of the data produced by such libraries. Instrumentation libraries can easily "spam" backend systems, deliver bogus data or - in the worst case - crash or slow down applications. These problems might even occur suddenly in production environments caused by external factors such as increasing load or unexpected input data.
### Faulty or expensive instrumentation

* If a library hasn't implemented [semantic conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md) correctly or those conventions change over time, it's currently hard to interpret and sanitize these data selectively. The produced Spans or Metrics cannot be associated with those instrumentation libraries later.
For an operator of an application using OpenTelemetry, there is currently no way to influence the amount of data produced by reporting libraries. Reporting libraries can easily "spam" backend systems, deliver bogus data, or -- in the worst case -- crash or slow down applications. These problems might even occur suddenly in production environments because of external factors such as increasing load or unexpected input data.

### Reporting library identification

If a reporting library hasn't implemented [semantic conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-semantic-conventions.md) correctly or those conventions change over time, it's currently hard to selectively interpret and sanitize the data it produces. The produced Spans or Metrics cannot later be associated with the library which reported them, either in the processing pipeline or the backend.

### Disable instrumentation of pre-instrumented libraries

It is the eventual goal of OpenTelemetry that library vendors implement the OpenTelemetry API, obviating the need to auto-instrument their library. An operator should be able to disable the telemetry that is built into some database driver or other library and provide their own integration if the built-in telemetry is lacking in some way. This should be possible even if the developer of that database driver has not provided a configuration to disable telemetry.

## Solution

This proposal attempts to solve the stated problems by introducing the concept of:
* _Named Tracers and Meters_ identified via a **name** (e.g. _"io.opentelemetry.contrib.mongodb"_) and a **version** (e.g._"semver:1.0.0"_) which is associated with the Tracer / Meter and the Spans / Metrics it produces.
* A `TracerFactory` / `MeterFactory` as the only means of creating a Tracer or Meter.
* _Named Tracers and Meters_ which are associated with the **name** (e.g. _"io.opentelemetry.contrib.mongodb"_) and **version** (e.g. _"semver:1.0.0"_) of the library which acquired them.
* A `TracerProvider` / `MeterProvider` as the only means of acquiring a Tracer or Meter.

Based on such an identifier, a Sampler could be implemented that discards Spans or Metrics from certain libraries. Also, by providing custom Exporters, Span or Metric data could be sanitized before it gets processed in a back-end system. However, this is beyond the scope of this proposal, which only provides the fundamental mechanisms.
Based on the name and version, a Provider could provide a no-op Tracer or Meter to specific instrumentation libraries, or a Sampler could be implemented that discards Spans or Metrics from certain libraries. Also, by providing custom Exporters, Span or Metric data could be sanitized before it gets processed in a back-end system. However, this is beyond the scope of this proposal, which only provides the fundamental mechanisms.
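The first option mentioned above, a Provider handing a no-op Tracer to specific libraries, might look roughly like the following. This is only a minimal sketch: `SketchTracer`, `NoopSketchTracer`, `RecordingSketchTracer`, and `FilteringTracerProvider` are hypothetical names invented for illustration and are not part of the OpenTelemetry API or of this proposal.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the API's Tracer type; the real interface is richer.
interface SketchTracer {
    boolean isRecording();
}

// Shared no-op implementation handed to libraries that have been disabled.
final class NoopSketchTracer implements SketchTracer {
    static final NoopSketchTracer INSTANCE = new NoopSketchTracer();

    private NoopSketchTracer() {}

    @Override
    public boolean isRecording() {
        return false;
    }
}

// Working implementation that remembers which library acquired it.
final class RecordingSketchTracer implements SketchTracer {
    final String libraryName;
    final String libraryVersion;

    RecordingSketchTracer(String libraryName, String libraryVersion) {
        this.libraryName = libraryName;
        this.libraryVersion = libraryVersion;
    }

    @Override
    public boolean isRecording() {
        return true;
    }
}

// Provider that decides, per library name, which implementation to hand out.
final class FilteringTracerProvider {
    private final Set<String> disabledLibraries;

    FilteringTracerProvider(Set<String> disabledLibraries) {
        this.disabledLibraries = new HashSet<>(disabledLibraries);
    }

    SketchTracer getTracer(String name, String version) {
        if (disabledLibraries.contains(name)) {
            return NoopSketchTracer.INSTANCE;
        }
        return new RecordingSketchTracer(name, version);
    }
}

class FilteringTracerProviderDemo {
    public static void main(String[] args) {
        Set<String> disabled = new HashSet<>();
        disabled.add("io.opentelemetry.contrib.mongodb");
        FilteringTracerProvider provider = new FilteringTracerProvider(disabled);

        SketchTracer mongo = provider.getTracer("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
        SketchTracer http = provider.getTracer("io.opentelemetry.contrib.examplehttp", "semver:1.0.0");
        System.out.println("mongodb recording: " + mongo.isRecording());
        System.out.println("examplehttp recording: " + http.isRecording());
    }
}
```

Because the decision is made inside the Provider, the reporting library itself needs no configuration switch; it simply receives a Tracer that does nothing.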

## Explanation

From a user perspective, working with *Named Tracers / Meters* and `TracerFactory` / `MeterFactory` is conceptually similar to how e.g. the [Java logging API](https://docs.oracle.com/javase/7/docs/api/java/util/logging/Logger.html#getLogger(java.lang.String)) and logging frameworks like [log4j](https://www.slf4j.org/apidocs/org/slf4j/LoggerFactory.html) work. In analogy to requesting Logger objects through LoggerFactories, a tracing library would create specific Tracer / Meter objects through a TracerFactory / MeterFactory.
From a user perspective, working with *Named Tracers / Meters* and `TracerProvider` / `MeterProvider` is conceptually similar to how e.g. the [Java logging API](https://docs.oracle.com/javase/7/docs/api/java/util/logging/Logger.html#getLogger(java.lang.String)) and logging frameworks like [SLF4J](https://www.slf4j.org/apidocs/org/slf4j/LoggerFactory.html) work. In analogy to requesting Logger objects through LoggerFactories, a trace reporting library would create specific Tracer / Meter objects through a TracerProvider / MeterProvider.

New Tracers or Meters can be created by providing the name and version of an instrumentation library. The version (following the convention proposed in https://github.com/open-telemetry/oteps/pull/38) is optional but *should* be supplied, since only this information enables the following scenarios:
* Only a specific range of versions of a given instrumentation library need to be suppressed, while other versions are allowed (e.g. due to a bug in those specific versions).
* Go modules allow multiple versions of the same middleware in a single build so those need to be determined at runtime.

```java
// Create a tracer/meter for a given instrumentation library in a specific version.
Tracer tracer = OpenTelemetry.getTracerFactory().getTracer("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
Meter meter = OpenTelemetry.getMeterFactory().getMeter("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
Tracer tracer = OpenTelemetry.getTracerProvider().getTracer("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
Meter meter = OpenTelemetry.getMeterProvider().getMeter("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
```

These factories (`TracerFactory` and `MeterFactory`) replace the global `Tracer` / `Meter` singleton objects as ubiquitous points to request Tracer and Meter instances.
These providers (`TracerProvider` and `MeterProvider`) replace the global `Tracer` / `Meter` singleton objects as the ubiquitous points from which to request Tracer and Meter instances.

The *name* used to create a Tracer or Meter must identify the *instrumentation* libraries (also referred to as *integrations*) and not the instrumented libraries. These instrumentation libraries could be libraries developed in an OpenTelemetry repository, a 3rd party implementation or even auto-injected code (see [Open Telemetry Without Manual Instrumentation OTEP](https://github.com/open-telemetry/oteps/blob/master/text/0001-telemetry-without-manual-instrumentation.md)). See also the examples for identifiers at the end.
If a library (or application) has instrumentation built-in, it is both the instrumenting and instrumented library and should pass its own name here. In all other cases (and to distinguish them from that case), the distinction between instrumenting and instrumented library is very important. For example, if an HTTP library `com.example.http` is instrumented by either `io.opentelemetry.contrib.examplehttp` or `com.example.company.examplehttpintegration`, then it is important that the Tracer is not named `com.example.http` but after the actual instrumentation library.
The *name* used to create a Tracer or Meter must identify the *instrumentation* library (also referred to as the *integration* or *trace/meter reporting library*) and not the library being instrumented. These instrumentation libraries could be libraries developed in an OpenTelemetry repository, a 3rd party implementation, or even auto-injected code (see [Open Telemetry Without Manual Instrumentation OTEP](https://github.com/open-telemetry/oteps/blob/master/text/0001-telemetry-without-manual-instrumentation.md)). See also the examples for identifiers at the end.
If a library (or application) has instrumentation built-in, it is both the instrumenting and instrumented library and should pass its own name here. In all other cases (and to distinguish them from that case), the distinction between instrumenting and instrumented library is very important. For example, if an HTTP library `com.example.http` is instrumented by `io.opentelemetry.contrib.examplehttp`, then it is important that the Tracer is not named `com.example.http` but `io.opentelemetry.contrib.examplehttp`, after the actual instrumentation library.

If no name (null or empty string) is specified, following the suggestions in ["error handling proposal"](https://github.com/open-telemetry/opentelemetry-specification/pull/153), a "smart default" will be applied and a default Tracer / Meter implementation is returned.
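One possible reading of this fallback behavior is sketched below. It assumes that the "smart default" means returning a shared default instance rather than throwing; the proposal does not pin this down. `IdentifiedTracer`, `DefaultingTracerProvider`, and `DEFAULT_TRACER` are hypothetical names used only for illustration.

```java
// Hypothetical stand-in for a Tracer that carries the acquiring library's identity.
final class IdentifiedTracer {
    final String name;
    final String version; // may be null, since the version is optional

    IdentifiedTracer(String name, String version) {
        this.name = name;
        this.version = version;
    }
}

final class DefaultingTracerProvider {
    // Assumed fallback instance, returned when no usable name is supplied.
    static final IdentifiedTracer DEFAULT_TRACER = new IdentifiedTracer("", null);

    IdentifiedTracer getTracer(String name, String version) {
        if (name == null || name.isEmpty()) {
            // Fall back to the default instead of throwing, so callers never break.
            return DEFAULT_TRACER;
        }
        return new IdentifiedTracer(name, version);
    }
}

class DefaultingTracerProviderDemo {
    public static void main(String[] args) {
        DefaultingTracerProvider provider = new DefaultingTracerProvider();

        IdentifiedTracer unnamed = provider.getTracer(null, "semver:1.0.0");
        IdentifiedTracer named = provider.getTracer("io.opentelemetry.contrib.mongodb", null);

        System.out.println("unnamed is default: " + (unnamed == DefaultingTracerProvider.DEFAULT_TRACER));
        System.out.println("named tracer name: " + named.name);
    }
}
```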

### Examples (of Tracer and Meter names)

Since Tracer and Meter names describe the libraries which use those Tracers and Meters, their names should be defined in a way that makes them as unique as possible.
The name of the Tracer / Meter should represent the identity of the library, class or package that provides the instrumentation.
The name of the Tracer / Meter should represent the identity of the library, class or package that provides the instrumentation.

@bogdandrutu (Member), Jan 30, 2020:

Just a small clarification: is the Tracer/Meter identified by the name alone, or by name + version? What is the expected behavior when (I don't know how this would be possible) there are two requests to getTracer with the same name and different versions?

Member Author:

Currently unspecified. In the javascript version, you get a different instance of tracer per name-version pair, but you could conceivably get a new tracer every time getTracer is called. Do you think that needs to be specified? Since the tracer delegates all its configuration to the TracerProvider, it doesn't really matter if you make a new one per name, new one per name/version pair, new one every time, or just 1 instance of each tracer type (noop or working).

Contributor:

My suggestion is to work on this detail in the actual Specification, as this OTEP has been approved and is ready to merge.

Member:

I think it is important mainly for the metrics side. One important thing is that we need to ensure metric names are unique inside a "component" (instrumentation library), so depending on whether the component is identified by the name or by name + version, I need to keep the list of registered instruments at the name level or at the name+version level.


Examples (based on existing contribution libraries from OpenTracing and OpenCensus):

@@ -66,22 +76,46 @@ Examples (based on existing contribution libraries from OpenTracing and OpenCens

## Internal details

By providing a `TracerFactory` / `MeterFactory` and *Named Tracers / Meters*, a vendor or OpenTelemetry implementation gains more flexibility in providing Tracers and Meters and which attributes they set in the resulting Spans and Metrics that are produced.
By providing a `TracerProvider` / `MeterProvider` and *Named Tracers / Meters*, a vendor or OpenTelemetry implementation gains more flexibility in providing Tracers and Meters and which attributes they set in the resulting Spans and Metrics that are produced.

On an SDK level, the SpanData class and its Metrics counterpart are extended with a `getLibraryResource` function that returns the resource associated with the Tracer / Meter that created it.
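As an illustration of how a processing step might use such a function, the sketch below drops spans reported by one specific library version. Only the function name `getLibraryResource` comes from the text above. `SpanDataSketch`, `LibraryResourceSketch`, and `LibrarySpanFilter` are hypothetical stand-ins, and the `"name"` / `"version"` label keys and exact-match version comparison are assumptions made for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Resource describing the reporting library.
final class LibraryResourceSketch {
    private final Map<String, String> labels = new HashMap<>();

    LibraryResourceSketch(String name, String version) {
        labels.put("name", name);
        labels.put("version", version);
    }

    String getLabel(String key) {
        return labels.get(key);
    }
}

// Hypothetical stand-in for SDK span data extended with getLibraryResource.
final class SpanDataSketch {
    private final String spanName;
    private final LibraryResourceSketch libraryResource;

    SpanDataSketch(String spanName, LibraryResourceSketch libraryResource) {
        this.spanName = spanName;
        this.libraryResource = libraryResource;
    }

    String getName() {
        return spanName;
    }

    LibraryResourceSketch getLibraryResource() {
        return libraryResource;
    }
}

final class LibrarySpanFilter {
    // Returns the spans that were NOT reported by the given library name and version.
    static List<SpanDataSketch> dropFromLibrary(List<SpanDataSketch> spans, String name, String version) {
        List<SpanDataSketch> kept = new ArrayList<>();
        for (SpanDataSketch span : spans) {
            LibraryResourceSketch lib = span.getLibraryResource();
            boolean matches = name.equals(lib.getLabel("name")) && version.equals(lib.getLabel("version"));
            if (!matches) {
                kept.add(span);
            }
        }
        return kept;
    }
}

class LibrarySpanFilterDemo {
    public static void main(String[] args) {
        List<SpanDataSketch> spans = new ArrayList<>();
        spans.add(new SpanDataSketch("find", new LibraryResourceSketch("io.opentelemetry.contrib.mongodb", "1.0.0")));
        spans.add(new SpanDataSketch("GET", new LibraryResourceSketch("io.opentelemetry.contrib.examplehttp", "1.0.0")));

        List<SpanDataSketch> kept = LibrarySpanFilter.dropFromLibrary(spans, "io.opentelemetry.contrib.mongodb", "1.0.0");
        for (SpanDataSketch span : kept) {
            System.out.println(span.getName() + " from " + span.getLibraryResource().getLabel("name"));
        }
    }
}
```

A real filter would more likely match a version range than a single version string; exact matching is used here only to keep the example short.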

## Glossary of Terms

#### Reporting library
Also known as the instrumentation library or the trace/metrics reporter, this may be either a library/module/plugin provided by OpenTelemetry that instruments an existing library, a third party integration which instruments some library, or a library that has implemented the OpenTelemetry API in order to instrument itself. In any case, the reporting library is the library which provides tracing and metrics data to OpenTelemetry.

examples:
* `@opentelemetry/plugin-http`
* `io.opentelemetry.redis`
* `redis-client` (in this case, `redis-client` has instrumented itself with the OpenTelemetry API)

#### Tracer name / Meter name
When it is acquired from the tracer/meter Provider, the tracer/meter is assigned a name and a version, which are the name and version of the reporting library that acquired it. In cases where a library has instrumented itself using the OpenTelemetry API, the reporting library and the instrumented library may be the same.

example: If the `http` library is being instrumented by a library with the name `io.opentelemetry.contrib.http`, then the tracer name is also `io.opentelemetry.contrib.http`. If that same `http` library has built-in instrumentation through use of the OpenTelemetry API, then the tracer name would be `http`.

### Disambiguation with related terms

#### Meter namespace
This is included here because it is often confused with the meter name.

A Meter namespace describes the component which is being monitored. For instance, `latency` could be collected from a number of different things (http latency, disk latency, event loop latency) and a namespace ties the `latency` metric to the entity being metered.

example: `bytes_transmitted` may be overall network bytes transmitted, or bytes transmitted specifically over `http`. In this case, it may be desirable to disambiguate these metrics by applying the `http` or `network` namespaces to the `bytes_transmitted` metric.

## Prior art and alternatives

This proposal originates from an `opentelemetry-specification` proposal on [components](https://github.com/open-telemetry/opentelemetry-specification/issues/10) since having a concept of named Tracers would automatically enable determining this semantic `component` property.

Alternatively, instead of having a `TracerFactory`, existing (global) Tracers could return additional indirection objects (called e.g. `TraceComponent`), which would be able to produce spans for specifically named traced components.
Alternatively, instead of having a `TracerProvider`, existing (global) Tracers could return additional indirection objects (called e.g. `TraceComponent`), which would be able to produce spans for specifically named traced components.

```java
TraceComponent traceComponent = OpenTelemetry.Tracing.getTracer().componentBuilder("io.opentelemetry.contrib.mongodb", "semver:1.0.0");
Span span = traceComponent.spanBuilder("someMethod").startSpan();
```

Overall, this would not change a lot compared to the `TracerFactory` since the levels of indirection until producing an actual span are the same.
Overall, this would not change a lot compared to the `TracerProvider` since the levels of indirection until producing an actual span are the same.

Instead of setting the `component` property based on the given Tracer names, those names could also be used as *prefixes* for produced span names (e.g. `<TracerName-SpanName>`). However, with regard to data quality and semantic conventions, a dedicated `component` set on spans is probably preferred.

@@ -94,7 +128,7 @@ libraryLabels.put("name", "io.opentelemetry.contrib.mongodb");
libraryLabels.put("version", "1.0.0");
Resource libraryResource = Resource.create(libraryLabels);
// Create tracer for given instrumentation library.
Tracer tracer = OpenTelemetry.getTracerFactory().getTracer(libraryResource);
Tracer tracer = OpenTelemetry.getTracerProvider().getTracer(libraryResource);
```

Those given alternatives could be applied to Meters and Metrics in the same way.
@@ -104,4 +138,3 @@ Those given alternatives could be applied to Meters and Metrics in the same way.
Based on the Resource information identifying a Tracer or Meter, these could be configured (enabled / disabled) programmatically or via external configuration sources (e.g. environment).

Based on this proposal, future "signal producers" (e.g. logs) can use the same or a similar creation approach.