Introduce sampling package as reference implementation for OTEP 235 #29720

Merged
merged 42 commits into from
Jan 31, 2024
Commits
2741a32
Add comments, README, metadata.yaml relative to #24811
jmacd Dec 8, 2023
91f6909
add new package
jmacd Dec 8, 2023
2834241
crosslink
jmacd Dec 8, 2023
12530ec
doc
jmacd Dec 8, 2023
4a83262
more place
jmacd Dec 8, 2023
b4c9b78
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Dec 11, 2023
53ac105
changelog
jmacd Dec 11, 2023
910beac
no change here
jmacd Dec 11, 2023
5997a19
more linty
jmacd Dec 11, 2023
2e9ddeb
lintier than thou
jmacd Dec 11, 2023
1909040
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Dec 11, 2023
3de01cc
fix that
jmacd Dec 11, 2023
1307b1d
hmm
jmacd Dec 11, 2023
28ea1e4
delint
jmacd Dec 11, 2023
5661ced
do what it says
jmacd Dec 11, 2023
95ac7f0
Comment on why must() and mustNot() are here
jmacd Dec 19, 2023
47ddb07
more comments
jmacd Dec 19, 2023
0042984
more comment
jmacd Dec 19, 2023
b9513d8
un-export a few things
jmacd Dec 20, 2023
b33ceeb
document the API
jmacd Dec 20, 2023
0feb399
typos
jmacd Jan 10, 2024
9f4d577
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Jan 10, 2024
6204743
update pdata dep
jmacd Jan 10, 2024
ebc99ef
several testable examples
jmacd Jan 23, 2024
0302872
Apply suggestions from code review
jmacd Jan 23, 2024
be84226
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Jan 25, 2024
efb3992
move probability test
jmacd Jan 25, 2024
c76b345
small prob examples
jmacd Jan 25, 2024
dbb04d2
improve probability testing; add comments to the example tests for re…
jmacd Jan 25, 2024
8fc6004
more testing
jmacd Jan 26, 2024
554c351
doc comments
jmacd Jan 26, 2024
c8c196d
Add higher-level example in doc comment
jmacd Jan 26, 2024
e294124
test for no-error, fix test as called out by jpk
jmacd Jan 26, 2024
f48df39
Apply suggestions from code review
jmacd Jan 26, 2024
637a787
lcAlphanum
jmacd Jan 26, 2024
f4e4d7a
Apply suggestions from code review
jmacd Jan 26, 2024
c7c796b
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Jan 26, 2024
c565f3c
tidy
jmacd Jan 26, 2024
231c3e4
Update pkg/sampling/doc.go
jmacd Jan 30, 2024
896f4b8
Remove most of the Has methods
jmacd Jan 30, 2024
ba64e94
Merge branch 'jmacd/pkgsampl' of github.com:jmacd/opentelemetry-colle…
jmacd Jan 30, 2024
a3844ca
update/tidy
jmacd Jan 30, 2024
27 changes: 27 additions & 0 deletions .chloggen/add_pkg_sampling.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: new_component

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg_sampling

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Package of code for parsing OpenTelemetry tracestate probability sampling fields.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [29738]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [api]
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -136,6 +136,7 @@ pkg/ottl/ @open-telemetry/collect
pkg/pdatatest/ @open-telemetry/collector-contrib-approvers @djaglowski @fatsheep9146
pkg/pdatautil/ @open-telemetry/collector-contrib-approvers @dmitryax
pkg/resourcetotelemetry/ @open-telemetry/collector-contrib-approvers @mx-psi
pkg/sampling/ @open-telemetry/collector-contrib-approvers @jmacd @kentquirk
pkg/stanza/ @open-telemetry/collector-contrib-approvers @djaglowski
pkg/translator/azure/ @open-telemetry/collector-contrib-approvers @open-telemetry/collector-approvers @atoulme @cparkins
pkg/translator/jaeger/ @open-telemetry/collector-contrib-approvers @open-telemetry/collector-approvers @frzifus
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yaml
@@ -133,6 +133,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/feature_request.yaml
@@ -127,6 +127,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/other.yaml
@@ -127,6 +127,7 @@ body:
- pkg/pdatatest
- pkg/pdatautil
- pkg/resourcetotelemetry
- pkg/sampling
- pkg/stanza
- pkg/translator/azure
- pkg/translator/jaeger
1 change: 1 addition & 0 deletions pkg/sampling/Makefile
@@ -0,0 +1 @@
include ../../Makefile.Common
23 changes: 23 additions & 0 deletions pkg/sampling/README.md
@@ -0,0 +1,23 @@
# pkg/sampling

## Overview

This package contains utilities for parsing and interpreting the W3C
[TraceState](https://www.w3.org/TR/trace-context/#tracestate-header)
and all sampling-relevant fields specified by OpenTelemetry that may
be found in the OpenTelemetry section of the W3C TraceState.

This package implements the draft specification in [OTEP
235](https://github.com/open-telemetry/oteps/pull/235), which
specifies two fields used by the OpenTelemetry consistent probability
sampling scheme.

These are:

- `th`: the Threshold used to determine whether a TraceID is sampled
- `rv`: an explicit randomness value, which overrides randomness in the TraceID

[OTEP 235](https://github.com/open-telemetry/oteps/pull/235) contains
details on how to interpret these fields. They are not meant to be
human readable, with a few exceptions. For example, the tracestate
entry `ot=th:0` indicates 100% sampling.
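
As a rough illustration of how a `th` value maps to a sampling probability, the sketch below follows the encoding described in OTEP 235: the hex digits are padded with trailing zeros to 14 digits (56 bits), and the result is the rejection threshold. The helper names `thresholdFromTValue` and `probability` are hypothetical and are not this package's API; the sketch also skips the stricter validation a real parser would need.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numRandomnessValues is 2^56, the number of distinct 56-bit randomness values.
const numRandomnessValues = 1 << 56

// thresholdFromTValue is a hypothetical helper (not part of pkg/sampling).
// It pads the hex digits of a `th` value with trailing zeros to 14 digits
// and parses the result as a 56-bit rejection threshold.
func thresholdFromTValue(th string) (uint64, error) {
	if len(th) == 0 || len(th) > 14 {
		return 0, fmt.Errorf("th must have 1 to 14 hex digits, got %q", th)
	}
	padded := th + strings.Repeat("0", 14-len(th))
	return strconv.ParseUint(padded, 16, 64)
}

// probability returns the fraction of randomness values that are at or
// above the rejection threshold, i.e. the fraction that is sampled.
func probability(threshold uint64) float64 {
	return float64(numRandomnessValues-threshold) / float64(numRandomnessValues)
}

func main() {
	for _, th := range []string{"0", "8", "c"} {
		threshold, err := thresholdFromTValue(th)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("th:%s -> probability %g\n", th, probability(threshold))
	}
}
```

Under this reading, `th:0` rejects nothing (100% sampling), `th:8` corresponds to 50%, and `th:c` to 25%.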
125 changes: 125 additions & 0 deletions pkg/sampling/common.go
@@ -0,0 +1,125 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package sampling // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/sampling"

import (
"errors"
"io"
"strings"

"go.uber.org/multierr"
)

// KV represents a key-value parsed from a section of the TraceState.
type KV struct {
Key string
Value string
}

var (
// ErrTraceStateSize is returned when a TraceState is over its
// size limit, as specified by W3C.
ErrTraceStateSize = errors.New("invalid tracestate size")
)

// keyValueScanner defines distinct scanner behaviors for lists of
// key-values.
type keyValueScanner struct {
// maxItems is 32 or -1
maxItems int
// trim is set if OWS (optional whitespace) should be removed
trim bool
// separator is , or ;
separator byte
// equality is = or :
equality byte
}

// commonTraceState is embedded in both W3C and OTel trace states.
type commonTraceState struct {
kvs []KV
}

// ExtraValues returns additional values that are carried in this
// tracestate object (W3C or OpenTelemetry).
func (cts commonTraceState) ExtraValues() []KV {
return cts.kvs
}

// trimOws removes optional whitespace on both ends of a string.
// This uses the strict definition for optional whitespace given
// in https://www.w3.org/TR/trace-context/#tracestate-header-field-values
func trimOws(input string) string {
return strings.Trim(input, " \t")
}

// scanKeyValues is common code to scan either W3C or OTel tracestate
// entries, as parameterized in the keyValueScanner struct.
func (s keyValueScanner) scanKeyValues(input string, f func(key, value string) error) error {
var rval error
items := 0
for input != "" {
items++
if s.maxItems > 0 && items >= s.maxItems {
// W3C specifies max 32 entries, tested here
// instead of via the regexp.
return ErrTraceStateSize
}

sep := strings.IndexByte(input, s.separator)

var member string
if sep < 0 {
member = input
input = ""
} else {
member = input[:sep]
input = input[sep+1:]
}

if s.trim {
// Trim only required for W3C; OTel does not
// specify whitespace for its value encoding.
member = trimOws(member)
}

if member == "" {
// W3C allows empty list members.
continue
}

eq := strings.IndexByte(member, s.equality)
if eq < 0 {
// We expect to find the `s.equality`
// character in this string because we have
// already validated the whole input syntax
// before calling this parser. I.e., this can
// never happen, and if it did, the result
// would be to skip malformed entries.
continue
}
if err := f(member[:eq], member[eq+1:]); err != nil {
rval = multierr.Append(rval, err)
}
}
return rval
}

// serializer assists with checking and combining errors from
// (io.StringWriter).WriteString().
type serializer struct {
writer io.StringWriter
err error
}

// write handles errors from io.StringWriter.
func (ser *serializer) write(str string) {
_, err := ser.writer.WriteString(str)
ser.check(err)
}

// check handles errors (e.g., from another serializer).
func (ser *serializer) check(err error) {
ser.err = multierr.Append(ser.err, err)
}
89 changes: 89 additions & 0 deletions pkg/sampling/doc.go
@@ -0,0 +1,89 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

// # TraceState representation
//
// A [W3CTraceState] object parses and stores the OpenTelemetry
// tracestate field and any other fields that are present in the
// W3C tracestate header, part of the [W3C tracecontext specification].
//
// An [OpenTelemetryTraceState] object parses and stores fields of
// the OpenTelemetry-specific tracestate field, including those recognized
// for probability sampling and any other fields that are present. The
// syntax of the OpenTelemetry field is specified in [Tracestate handling].
//
// The probability sampling-specific fields used here are specified in
// [OTEP 235]. The principal named fields are:
//
// - T-value: The sampling rejection threshold, a 56-bit value written
// in hexadecimal. Traces whose randomness is below the threshold are
// rejected by sampling.
// - R-value: The sampling randomness value, which is either implicit in
// the TraceID or explicitly encoded as an R-value.
//
// # Low-level types
//
// The three key data types implemented in this package represent sampling
// decisions.
//
// - [Threshold]: Represents an exact sampling probability.
// - [Randomness]: Randomness used for sampling decisions.
// - [Threshold.Probability]: a float64 in the range [MinSamplingProbability, 1.0].
//
// # Example use-case
//
// To configure a consistent tail sampler in an OpenTelemetry
// Collector using a fixed probability for all traces in an
// "equalizing" arrangement, where the effect of sampling is
// conditioned on how much sampling has already taken place, use the
// following pseudocode.
//
// func Setup() {
// // Get a fixed probability value from the configuration, in
// // the range (0, 1].
// probability := *FLAG_probability
//
// // Calculate the sampling threshold from probability using 3
// // hex digits of precision.
// fixedThreshold, err = ProbabilityToThresholdWithPrecision(probability, 3)
// if err != nil {
// // error case: Probability is not valid.
// }
// }
//
// func MakeDecision(tracestate string, tid TraceID) bool {
// // Parse the incoming tracestate
// ts, err := NewW3CTraceState(tracestate)
// if err != nil {
// // error case: Tracestate is ill-formed.
// }
// // For an absolute probability sample, we check the incoming
// // tracestate to see whether it was already sampled enough.
// if len(ts.OTelValue().TValue()) != 0 {
// // If the incoming tracestate was already sampled at
// // least as much as our threshold implies, then its
// // (rejection) threshold is higher. If so, then no
// // further sampling is called for.
// if ThresholdGreater(ts.OTelValue().TValueThreshold(), fixedThreshold) {
// return true
// }
// }
// var rnd Randomness
// // If the R-value is present, use it. If not, rely on TraceID
// // randomness. Note that OTLP v1.1.0 introduces a new Span flag
// // to convey trace randomness correctly, and if the context has
// // neither the randomness bit set nor the R-value set, we need a
// // fallback, which can be to synthesize an R-value or to assume
// // the TraceID has sufficient randomness. This detail is left
// // out of scope.
// if rval, hasRval := ts.OTelValue().RValueRandomness(); hasRval {
// rnd = rval
// } else {
// rnd = TraceIDToRandomness(tid)
// }
//
// return fixedThreshold.ShouldSample(rnd)
// }
//
// [W3C tracecontext specification]: https://www.w3.org/TR/trace-context/#tracestate-header
// [Tracestate handling]: https://opentelemetry.io/docs/specs/otel/trace/tracestate-handling/
// [OTEP 235]: https://github.com/open-telemetry/oteps/pull/235
package sampling // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/sampling"
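
The final step of the doc comment's pseudocode (falling back to TraceID randomness and comparing it with the threshold) can be sketched in standalone Go. This is a minimal illustration under the OTEP 235 rules, not the package's implementation: `randomnessFromTraceID` and `shouldSample` are hypothetical stand-ins for `TraceIDToRandomness` and `Threshold.ShouldSample`, and plain `uint64` values replace the package's `Threshold` and `Randomness` types.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// randomnessFromTraceID is a hypothetical stand-in for TraceIDToRandomness.
// It keeps the least-significant 56 bits (the last 7 bytes) of the 16-byte
// TraceID, which OTEP 235 designates as the implicit randomness value.
func randomnessFromTraceID(tid [16]byte) uint64 {
	return binary.BigEndian.Uint64(tid[8:]) & (1<<56 - 1)
}

// shouldSample reports whether the randomness value is at or above the
// rejection threshold, which is the consistent-sampling decision rule.
func shouldSample(threshold, randomness uint64) bool {
	return randomness >= threshold
}

func main() {
	// th:c pads to 0xc0000000000000, which rejects 75% of randomness values.
	const threshold = uint64(0xc) << 52

	var kept, dropped [16]byte
	// Byte 8 lies outside the 56 randomness bits and is ignored.
	kept[8], kept[9] = 0xff, 0xd0
	dropped[8], dropped[9] = 0xff, 0x40

	fmt.Println(shouldSample(threshold, randomnessFromTraceID(kept)))    // true
	fmt.Println(shouldSample(threshold, randomnessFromTraceID(dropped))) // false
}
```

Because every sampler compares the same randomness value against its own threshold, a trace kept at a higher threshold is also kept by any sampler with a lower threshold, which is what makes the decisions consistent across the pipeline.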