Non-power-of-two consistent tail probability sampling in TraceState #226

jmacd · 2023-03-02T08:46:35Z

This is meant to address open-telemetry/opentelemetry-specification#1413.
Follows https://github.com/open-telemetry/oteps/blob/main/text/trace/0170-sampling-probability.md and https://github.com/open-telemetry/oteps/blob/main/text/trace/0168-sampling-propagation.md.

Cc: @oertl @kalyanaj @PeterF778

text/trace/0226-sampling-random-traceids.md

jmacd · 2023-03-03T17:24:49Z

@oertl Thank you. I accept your suggestions and wonder whether you would be interested in drafting blocks of replacement text for your scheme. I'll be glad to work out the pseudocode snippets from there, but I would like your words for "hex/binary threshold equals number of spans dropped out of 2^56", for explaining the 52-bit vs. 56-bit issue, and for the use of greater-or-equal, I take it, for filtering. I would then be glad to work on writing a spec to deprecate p-value in favor of t-value, where, as you say, the threshold "0" equals "00000000000000", indicating all spans with 7 bytes >= "00000000000000".

text/trace/0226-sampling-random-traceids.md

oertl · 2023-03-06T07:24:32Z

As an alternative to the t-value, the p-value definition could be extended. For example, p:11.a3d could be interpreted as 11 leading zeros followed by a 1 and the bits defined by the extra hex code "a3d". The remaining bits are filled with ones. In this case, the value would be

- `00000000000`: 11 leading zeros
- `1`: the 1 bit that follows the leading zeros
- `1010`: hex "a"
- `0011`: hex "3"
- `1101`: hex "d"
- `11111111111111111111111111111111`: 32 one bits to fill up to 56 bits

Together: 0x001a3dffffffff = 28853590294527

The resulting value corresponds to the number of kept spans out of 2^56, minus 1.

The example above would correspond to a sampling probability of (28853590294527 + 1)/2^56 = 0.000400424

With this definition p:0, p:1, p:2, etc. would correspond to power of two sampling probabilities as already defined.
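The expansion described above can be sketched in Go. This is only an illustration of the proposed reading of `p:11.a3d`: the function name `decodeExtendedP` and the constant `randomBits` are hypothetical and not part of any OpenTelemetry API.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// randomBits is the number of trace ID bits the value is compared against.
const randomBits = 56

// decodeExtendedP is a hypothetical helper. It expands "zeros[.hex]" into a
// 56-bit value: the given number of leading zeros, a single 1 bit, the extra
// hex digits, and finally one bits to fill the remaining positions.
func decodeExtendedP(p string) (uint64, error) {
	zerosStr, hexStr, _ := strings.Cut(p, ".")
	zeros, err := strconv.ParseUint(zerosStr, 10, 8)
	if err != nil {
		return 0, fmt.Errorf("bad zero count in %q: %w", p, err)
	}
	used := int(zeros) + 1 + 4*len(hexStr)
	if used > randomBits {
		return 0, fmt.Errorf("%q needs %d bits, more than %d", p, used, randomBits)
	}
	value := uint64(1) // the 1 bit that ends the run of leading zeros
	if hexStr != "" {
		extra, err := strconv.ParseUint(hexStr, 16, 64)
		if err != nil {
			return 0, fmt.Errorf("bad hex digits in %q: %w", p, err)
		}
		value = value<<(4*len(hexStr)) | extra
	}
	fill := randomBits - used
	return value<<fill | (uint64(1)<<fill - 1), nil
}

func main() {
	for _, p := range []string{"0", "1", "11.a3d"} {
		v, err := decodeExtendedP(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// The sampling probability is (value + 1) / 2^56.
		prob := float64(v+1) / float64(uint64(1)<<randomBits)
		fmt.Printf("p:%s -> 0x%014x, probability %.9f\n", p, v, prob)
	}
}
```

With this reading, `p:0` and `p:1` expand to all-ones values of 56 and 55 bits, which is how the power-of-two cases fall out of the same rule.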

jmacd · 2023-03-06T18:50:36Z

p:11.a3d

By the way, in my prototype I used the syntax a3d-11 where you've written 11.a3d (b/c I read this as "a3d shifted 11"). In this draft, to simplify, I proposed to encode the 11 zeros (i.e., 00000000000a3d), but I agree that encoding the zero count separately is more compact for most probabilities.

As for whether we overload how to parse p or not, that's a separate question: it would break existing uses of p-value. That's why I prefer to create a new variable, but since p is marked experimental I'm open to this change.

kalyanaj · 2023-03-09T02:15:00Z

text/trace/0226-sampling-random-traceids.md

+
+This proposes to extend that specification with support for 56-bit
+precision sampling probability.  This is seen as particularly
+important for implementation of probabilistic tail samplers (e.g., in


It may be good to elaborate a bit more on the motivation for this requirement of higher precision sampling probability.

text/trace/0226-sampling-random-traceids.md

kalyanaj · 2023-03-09T02:49:53Z

That's why I prefer to create a new variable, but since p is marked experimental I'm open to this change.

If p-value is not currently used in any implementations (or if they are okay with a breaking change since it is still experimental, as you call out), then yes, extending p-value to also encode non-power-of-two sampling probabilities, as @oertl suggests, looks like a good idea.

Reasoning: Since p-value and t-value are conceptually for the same purpose (to encode sampling probabilities), ideally it will be good to just extend the p-value concept - it will make it a bit easier to understand for folks who are already familiar with the purpose of p-value, rather than trying to think of t-value as a new concept.

PeterF778 · 2023-03-09T03:33:48Z

With respect to obsoleting the r-value, I reported a new issue 3307 today. If we agree that the issue is real (I'm not entirely sure), the solution I proposed would make consistent probability samplers work differently if there's no r-value.

jmacd · 2023-04-21T16:51:09Z

This was discussed in yesterday's Sampling SIG. Since the initial feedback, I had come to the following rough idea to use all the bits of the TraceID, to consistently decide how to interpolate between powers of two.

Considering an 8-bit example with 75% sampling:

- t-value indicates the trace ID at which drops take place.
- 75% sampling corresponds with t=0x40 (i.e., decimal 64), indicating to drop 64 out of 256 traces.
- t=0x40 corresponds with an adjusted count of 4/3, calculated as 0x100 / (0x100 - 0x40).
- t=0x40 can also be deterministically mapped to p=0 or p=1 using trace-id randomness to interpolate. In the example, 192 out of 256 traces are accepted; they just need to be split into p=0 and p=1 to preserve the expected value of the adjusted count. A similar/related formula is needed for interpolation by a head sampler, which I realize we never published in the specification (an omission, I think; see where I implemented this function in the head sampler).
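The arithmetic of the 8-bit example can be checked with a few lines of Go. This is a minimal sketch: the names `keep`, `adjustedCount` and `countKept` are illustrative, and the comparison direction (keep when the ID is at or above t) is an assumption consistent with t counting the dropped IDs. It does not attempt the p=0/p=1 interpolation.

```go
package main

import "fmt"

// space is the number of distinct values of the 8-bit toy trace ID.
const space = 0x100

// keep assumes a trace is kept when its ID is at or above the t-value, so
// that t equals the number of dropped IDs. The direction is an assumption.
func keep(id, t int) bool { return id >= t }

// adjustedCount is the number of traces each kept trace stands for.
func adjustedCount(t int) float64 {
	return float64(space) / float64(space-t)
}

// countKept enumerates every 8-bit ID and counts the survivors.
func countKept(t int) int {
	n := 0
	for id := 0; id < space; id++ {
		if keep(id, t) {
			n++
		}
	}
	return n
}

func main() {
	t := 0x40
	fmt.Printf("t=%#x keeps %d of %d, adjusted count %.4f\n",
		t, countKept(t), space, adjustedCount(t))
}
```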

cc @oertl @PeterF778

oertl · 2023-04-22T19:25:00Z

@jmacd

This was discussed in yesterday's Sampling SIG. Since the initial feedback, I had come to the following rough idea to use all the bits of the TraceID, to consistently decide how to interpolate between powers of two.

Unfortunately, after rethinking this proposal, I have concluded that it would not allow correct estimation of trace quantities (e.g. estimating the number of traces touching one service A and another service B) as described in my paper. It would only work for span estimates (e.g. estimating the number of spans of service A). The proposal violates the basic assumption that the choice of the sampling probability is independent of the shared randomness (trace ID or p-value).

PeterF778 · 2023-04-25T22:53:20Z

Let's take one step back.

The Consistent Probability Sampling schema already allows consistent head and tail sampling if the sampling probability is a power of 2. Now we want to extend it so that consistency (head and tail) is preserved even for non-power-of-2 probabilities.

It is obvious, I think, that we need to use the random bits of trace-id somehow, if we want to get consistent sampling with any given probability, meaning multiple instances of (head or tail) samplers making the same sampling decisions wrt spans belonging to the same trace. From the past discussions, it looks like we have two categories of possible extensions:

A. Allow (approximation of) any non-power-of-2 sampling probabilities in TraceState.

B. Continue to restrict the sampling probabilities in TraceState to power-of-2 values.

and we can always have

C. Drop Consistent Probability Sampling as it has been proposed and replace it with something else.

Now we could try to summarize the benefits and disadvantages of these approaches.
The way I see it, the benefits of A are:

- with sampling probabilities like 75% or 5%, the metric estimates (span-to-metric pipeline) are a bit better (a smaller expected error) than when switching between power-of-2 probabilities
- makes some folks happy that they can specify any probability for a single span

while the potential issues are:

- possible non-integer adjusted counts
- subsequent sampling steps (head+tail, for example) for the same span force us to multiply probabilities; this puts pressure on providing more precision to represent probabilities, or entails loss of precision
- need to deal with a potentially large number of sampling probabilities, complicating metric calculation
- longer strings required to encode the probability in TraceState (if we care about such things)

Respectively, approach B is the reverse of that, with the drawbacks being:

- slightly larger expected error for metric estimates if the requested sampling rate was not a power of 2

and the advantages being:

- adjusted counts are whole integers
- sets of spans/traces can be re-sampled multiple times using the same mechanism and still encode the resulting adjusted count accurately
- a small number of possible span probabilities makes processing easier

With respect to C, it is a very wide open field, but let's keep in mind that for expressing sampling probabilities we do not need high precision; we need a wide range of values. That's why the logarithmic scale of r-values and p-values is so powerful and efficient.

If we try to replace that with a linear model based on 56 random bits in trace-id, we quickly run out of bits.

kalyanaj · 2023-05-04T04:27:45Z

Thanks @PeterF778 for consolidating the tradeoffs - overall the summary of the tradeoffs makes sense to me.

I didn't quite understand the below two points - can you please clarify/elaborate?

sets of spans/traces can be re-sampled multiple times using the same mechanism and still encode the resulting adjusted count accurately

and this final point about running out of bits:

If we try to replace that with a linear model based on 56 random bits in trace-id, we quickly run out of bits.

jmacd · 2023-05-04T14:55:05Z

03f693c Contains an update based on recent discussions.

jmacd · 2023-05-11T00:06:05Z

@kentquirk new draft:
df6b1d0

PeterF778 · 2023-05-11T01:21:13Z

A very nice and clean proposal!
A few questions though: it says it is extending the r-value and p-value proposal. The p-value will be used to calculate the final adjusted count, but is the r-value used anywhere?
And an old question remains: why does tail sampling require explicit non-power-of-two sampling probabilities as opposed to (deterministic) interpolation between two adjacent powers of two? What do we gain or lose with either approach?

text/trace/0226-sampling-random-traceids.md

kalyanaj · 2023-05-15T18:16:23Z

text/trace/0226-sampling-random-traceids.md

+user's intended sampling probability without floating point conversion
+loss.
+
+## Prior art and alternatives


Towards the end, we may want to call out that one benefit of the r-value based randomness was that it could be used to get consistent sampling across multiple traces (e.g., all traces started within a time window by a participant) - it would be good to call out that it should be possible to support it in the future as a complement to the current proposal.

If we decide to use arbitrary sampling probabilities, we should not use the current definition of the r-value. It makes no sense to have different discretizations for the r-value (powers of two) and for the t-value (56-bit values). Therefore, the r-value should rather be a 14-digit hex value that overrides the random bits of the trace ID, if present. This way we could also handle traces where the random flag is not set in the trace context. If the flag is not set and there is also no r-value, we could require consistent samplers to set the r-value by generating a 56-bit random value.

kalyanaj · 2023-05-15T18:23:22Z

text/trace/0226-sampling-random-traceids.md

+  Sampler's threshold, the span passes through with the current
+  sampler's t-value, otherwise the span is discarded.
+
+## Examples


It would be good to add two more examples that show how consistent probability sampling can be achieved across multiple participants.

Example 1:

Upstream participant samples at 10% probability (ot=t:0.1 is sent as part of tracestate)

Downstream participant does parent-based sampling. It uses the sampled flag to make the decision, gets the t-value from the parent context and emits it as part of its context (ot=t:0.1 is sent as part of tracestate to further downstream participants)

Example 2:

Upstream participant samples at 10% probability (ot=t:0.1 is sent as part of tracestate)

Downstream participant samples at 5% probability - it calculates a threshold based on its sampling rate and compares with the traceID last 7 bytes to make the sampling decision (ot=t:20 is sent as part of tracestate).

Downstream participant does parent-based sampling (uses the sampled flag to make the decision, gets the t-value from the parent context and emits it as part of its context)

These examples sound good to me! Will do.

kalyanaj · 2023-05-15T18:25:27Z

text/trace/0226-sampling-random-traceids.md

+Traces whose least-significant 56 bits form an unsigned value less
+than 7205759403792794.
+
+## T-value encoding for adjusted counts


It will be good to define the mutation rules and propagation rules for t-value. E.g., something along the lines of:

if a participant is doing parent-based sampling, it should propagate the t-value from its parent.

if a participant is doing consistent probability sampling using its own sampling rate, it should mutate the t-value to set the new adjusted count / sampling rate.

Not quite answering your question, but I've prototyped open-telemetry/opentelemetry-collector-contrib#22058 with a different sort of answer to your question.

In this case referring to span data records, where there are multiple collectors in a pipeline. The first collector may sample at 1/10; when a subsequent collector samples at 1/20, the t-value of the selected spans will be updated. If the subsequent collector samples at 1/2, however, it is being less selective than the first collector, so it should not modify the t-value. That is to say that t-value adjusted counts should not fall and t-valued probabilities should not rise.

See the logic here: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/22058/files#diff-33f10350e2875f926dd2be6fc4c6bb88cfd8043cf6ac6d100295cf654771d90dR210-R219
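The rule stated above (adjusted counts should not fall, probabilities should not rise) amounts to taking the smaller of the two probabilities. A minimal sketch, assuming the t-value has already been converted to a probability; `nextProbability` is a hypothetical name and this is not the code from the linked PR:

```go
package main

import "fmt"

// nextProbability returns the probability to record on a span that arrived
// with probability `incoming` and passed a consistent sampler configured with
// probability `own`. The recorded probability never rises.
func nextProbability(incoming, own float64) float64 {
	if own < incoming {
		return own // this stage is more selective: update the t-value
	}
	return incoming // this stage is less selective: leave the t-value alone
}

func main() {
	fmt.Println(nextProbability(0.1, 0.05)) // 1/10 followed by 1/20
	fmt.Println(nextProbability(0.1, 0.5))  // 1/10 followed by 1/2
}
```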

I think there's a problem with such sampling behavior. Let's assume that the previous collector in chain sampled all traces with errors with probability 1, and all remaining traces with 1/100. If the next collector in chain is configured with 1/10, it will not touch the healthy traces, but will decimate the traces with errors. So any stratified sampling logic must be known and repeated by all collectors in the pipeline. Even if we prohibit stratified sampling, to set up a collector sampling probability in any meaningful way we have to know the minimum sampling probability of all the preceding collectors.

PeterF778 · 2023-05-15T20:23:47Z

I would like to see how to provide algorithms for tail sampling based on the t-value, for the following use cases:

(static sampling) Given a set of N1 spans, I want to re-sample it to arrive at approximately N2 spans.
(dynamic sampling) Given a stream of incoming spans, I want to re-sample it so I get at most X spans/s (approximately).

The potential challenge I see is that the algorithms may require multiplying t-values and/or sorting the spans according to their t-value.

oertl · 2023-05-16T08:18:08Z

@PeterF778

I would like to see how to provide algorithms for tail sampling based on the t-value, for the following use cases:

(static sampling) Given a set of N1 spans, I want to re-sample it to arrive at approximately N2 spans.

I believe collecting the N2 spans with the smallest random values can be done in O(N1) time and O(N2) space without sorting. The idea is to have a threshold T (initial value 1) and a buffer of size N2*(1 + eps), with some constant eps > 0, that stores all spans with random values (interpreted as values from [0,1)) smaller than T. If the buffer is full, the span with the (N2+1)-th smallest random value is chosen using Quickselect. This random value defines the new threshold T, which has an expected value of E(T) = 1/(1 + eps). All spans with random values >= T get dropped. The remaining N2 spans are kept and put into the first part of the buffer. This compaction step takes O(N2*(1+eps)) time. After that, the buffer again has space for N2*eps new spans. We fill that with the next spans in the data stream having a random value smaller than the threshold T. Since the expected value of the threshold is 1/(1 + eps), it takes N2*eps*(1+eps) spans on average to fill the buffer again, at which point the next compaction takes place. The new threshold will have an expected value of E(T) = 1/(1+eps)^2. And so on.

Therefore we need

O(N2*(1+eps)) time to process the first N2*(1 + eps) spans
O(N2*(1+eps)) time to process the next N2*eps*(1+eps) spans
O(N2*(1+eps)) time to process the next N2*eps*(1+eps)^2 spans
...

In summary, it takes O(K*N2*(1+eps)) time to process

N2*(1 + eps) + N2*eps*(1+eps) + N2*eps*(1+eps)^2 + ... + N2*eps*(1+eps)^(K-1) = N2*(1+eps)^K

spans. The average processing costs per span are therefore

O(K*N2*(1+eps) / (N2*(1+eps)^K)) = O(K / (1+eps)^(K-1)) = O(1)

which is constant. I am not 100% sure yet what t-value needs to be assigned to the surviving spans to get unbiased counts. Probably it is sufficient to replace the current t-value, if it is larger, by the threshold T. The Adaptive Threshold Sampling paper might be helpful in this context.
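The buffer-and-compact scheme can be sketched as follows. To keep the example short it uses `sort.Float64s` for the compaction step instead of Quickselect, so each compaction costs O(n log n) rather than linear time; the threshold is taken as the (N2+1)-th smallest buffered value. The type and method names are hypothetical, spans are represented only by their random value in [0,1), and the question of which t-value to assign to survivors is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// thresholdSampler retains the n2 smallest random values seen so far.
type thresholdSampler struct {
	n2        int
	capacity  int       // buffer size, about n2*(1+eps)
	threshold float64   // T: values at or above it are dropped
	buf       []float64 // random values of the retained spans
}

func newThresholdSampler(n2 int, eps float64) *thresholdSampler {
	capacity := int(float64(n2) * (1 + eps))
	if capacity <= n2 {
		capacity = n2 + 1
	}
	return &thresholdSampler{
		n2:        n2,
		capacity:  capacity,
		threshold: 1,
		buf:       make([]float64, 0, capacity),
	}
}

// compact keeps the n2 smallest values and lowers the threshold to the
// (n2+1)-th smallest. A real implementation would use Quickselect here.
func (s *thresholdSampler) compact() {
	sort.Float64s(s.buf)
	s.threshold = s.buf[s.n2]
	s.buf = s.buf[:s.n2]
}

// offer considers one span, identified by its random value r.
func (s *thresholdSampler) offer(r float64) {
	if r >= s.threshold {
		return
	}
	s.buf = append(s.buf, r)
	if len(s.buf) == s.capacity {
		s.compact()
	}
}

// result returns the retained values in ascending order, at most n2 of them.
func (s *thresholdSampler) result() []float64 {
	if len(s.buf) > s.n2 {
		s.compact()
	} else {
		sort.Float64s(s.buf)
	}
	return s.buf
}

func main() {
	s := newThresholdSampler(10, 0.5)
	for i := 0; i < 1000; i++ {
		// A fixed permutation of 0.000..0.999 stands in for random values.
		s.offer(float64((i*617)%1000) / 1000)
	}
	res := s.result()
	fmt.Println(len(res), "retained, threshold", s.threshold)
}
```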

(dynamic sampling) Given a stream of incoming spans, I want to re-sample it so I get at most X spans/s (approximately).

If there was no previous sampling stage, we need to estimate the incoming rate from spans in the past (e.g., with exponential smoothing). Using this estimate and the desired rate, we can calculate a sampling probability that is used as the threshold. If there were previous sampling stages, the random values are no longer uniformly distributed and the actual distribution must be considered when choosing the threshold. For example, to achieve a 50% reduction, the threshold would need to be set to the median of all random values. To obtain a sampling probability of p, one would have to set the threshold to the p-quantile. It might be possible to use techniques like Frugal Streaming to incrementally adjust the threshold.

PeterF778 · 2023-05-16T18:17:20Z

I'm not sure I understood your algorithm for static sampling correctly, @oertl. Correct me if I'm wrong, but it looks like the selection of spans to survive is affected heavily by their t-value, rather than the source of randomness (trace-id), which introduces a bias.
Suppose that the original set of N1 spans contains a subset S of spans that have been already heavily sampled, so their t-values are very low - much lower than the spans not belonging to set S. In that case, if the size of set S is smaller than N2, all elements of set S will survive the re-sampling process. IMHO, an unbiased algorithm should sample spans from set S roughly with the same chances of survival as those from outside set S.

oertl · 2023-05-16T19:25:37Z

IMHO, an unbiased algorithm should sample spans from set S roughly with the same chances of survival as those from outside set S.

It depends on what is meant by bias. My understanding is the following: The expected adjusted count is equal to 1. This is ensured by setting the adjusted count to the inverse of the sampling probability. This should be the case with the algorithm described above.

@PeterF778, I think your concern is that the algorithm balances the sampling probabilities of the previous sampling stages. Depending on what you want to estimate, this sampling strategy may or may not be beneficial. I haven't checked, but I think the algorithm is similar to VarOpt sampling (see here), which minimizes variance when estimating arbitrary subset sums. This algorithm makes sense if you have a sampling stage that combines samples collected in earlier sampling stages with different probabilities for no good reason, e.g., due to unbalanced load or short-term load fluctuations.

The situation is different if you have already identified certain classes of spans (e.g., spans with errors) that should be sampled with a higher probability. In this case, you want to sample these spans more frequently than others on purpose. I think the correct term for this is stratified sampling, where weights are defined for different classes of spans and the sampling algorithm tries to sample them so that the ratios of the weights are reflected by the corresponding ratios of the t-values. To complicate matters further, different stages of sampling would define different weights for the same span. Early stages may not be able to assess the importance of a span, while later sampling stages have a more holistic view of the trace and therefore might assign a different weight to a span, for example if a child span has an error.

Anyway, discussion of these sampling strategies takes us a bit off topic. The same issues must be solved when using power-of-two sampling probabilities.

PeterF778 · 2023-05-17T05:05:44Z

@PeterF778, I think your concern is that the algorithm balances the sampling probabilities of the previous sampling stages. Depending on what you want to estimate, this sampling strategy may or may not be beneficial.

Yes, I imagine "static sampling" to be applied to aging data, after one or many stratified sampling steps were performed.

Anyway, discussion of these sampling strategies takes us a bit off topic. The same issues must be solved when using power-of-two sampling probabilities.

If we want to compare the two competing consistent probability sampling mechanisms, we have to understand what they will entail in all processing stages. I believe the re-sampling algorithms will be very similar in principle, but there could be differences in complexity or accuracy of the results.

oertl · 2023-05-17T06:16:24Z

If we want to compare the two competing consistent probability sampling mechanisms, we have to understand what they will entail in all processing stages. I believe the re-sampling algorithms will be very similar in principle, but there could be differences in complexity or accuracy of the results.

Yes, this is true. For example, for "dynamic sampling" it is probably much simpler to estimate quantiles if there is a power-of-two discretization, as you could simply aggregate into a histogram because there are only a small number of relevant values. However, any sampling stage is free to use just power-of-two sampling probabilities if that is more efficient. Even if spans come with non-power-of-two t-values, they could easily be downsampled to the next power of two in a first step.
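Downsampling to the next power of two can be sketched as below. The sketch assumes, as an illustration only, that a sampling probability is stored as a keep count out of 2^56 and that a span is kept when its 56-bit random value is below that count; `floorPowerOfTwo` and `survives` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
)

// floorPowerOfTwo returns the largest power of two that does not exceed
// keep, i.e. the next lower (or equal) power-of-two sampling probability
// when keep is read as a count out of 2^56.
func floorPowerOfTwo(keep uint64) uint64 {
	if keep == 0 {
		return 0
	}
	return uint64(1) << (bits.Len64(keep) - 1)
}

// survives assumes a span is kept when its random value is below the keep
// count. The comparison direction is an assumption of this sketch.
func survives(random, keep uint64) bool { return random < keep }

func main() {
	keep := uint64(7205759403792794) // roughly 10% of 2^56
	down := floorPowerOfTwo(keep)
	fmt.Printf("%d -> %d (2^%d of 2^56)\n", keep, down, bits.Len64(down)-1)
}
```

Because the power-of-two acceptance region is a subset of the original one under this convention, every span that survives the downsampling was also kept by the original stage.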

jmacd · 2023-05-25T16:07:30Z

New developments discussed in the Sampling SIG today.

We proposed "s-value" as a mechanism to encode the accumulation of independent non-consistent sampling stage adjusted counts. t-value and s-value would be separate fields, both included in tracestate for consistency. Existing vendor-specific sampling probabilities with unknown-and-presumed-independent sampling mechanisms will encode probability or adjusted count (as with t-value encoding) using tracestate s-value.

In the coming week I will resolve the conversations above. The plan is to draft a proposed change to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/tracestate-probability-sampling.md and document/justify the changes in this OTEP. Most of the existing specification will be re-used.

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

jmacd · 2023-06-01T00:48:10Z

I've updated the document making "r-value" optional (and 56-bits) and making "s-value" for independent non-consistent sampling. I've incorporated some helpful feedback from @kalyanaj. Thanks.

oertl · 2023-06-01T05:52:10Z

text/trace/0226-sampling-random-traceids.md

+TraceState field.  It would be appropriate to name this field
+`LogState`.
+
+This proposal does makes r-value an optional 56-bit number as opposed


typo: This proposal makes...

oertl · 2023-06-01T06:02:13Z

text/trace/0226-sampling-random-traceids.md

+
+To calculate the Sampling threshold, we began with an IEEE-754
+standard double-precision floating point number.  With 52-bits of
+significand and a floating exponent, the probability value used to


With 52-bits of significand...

Double-precision floating-point values have a 52-bit mantissa but are able to represent 53-bit significands (except for subnormal values). See https://cs.stackexchange.com/a/152267/102560.
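The 53-bit limit is easy to observe: integers up to 2^53 survive a round trip through `float64`, while larger odd integers do not. A small check, with `roundTrips` as an illustrative helper name:

```go
package main

import "fmt"

// roundTrips reports whether n is exactly representable as a float64.
// It is intended for values up to 2^56, as used for sampling thresholds.
func roundTrips(n uint64) bool {
	return uint64(float64(n)) == n
}

func main() {
	for _, n := range []uint64{1<<52 + 1, 1 << 53, 1<<53 + 1, 1<<56 - 1} {
		fmt.Printf("%d exactly representable: %v\n", n, roundTrips(n))
	}
}
```

This is why not every 56-bit threshold can be produced from a double-precision probability.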

kentquirk · 2023-06-01T14:40:34Z

text/trace/0226-sampling-random-traceids.md

-   value in big-endian byte order.
-2. The sampling probability (range `[0x1p-56, 1]`) is multipled by
+1. When the r-value is present and parses as a 56-bit random value,
+   use it, otherwise bytes 10-16 of the TraceID are interpreted as a


It is worth specifying whether this counts from 0 or 1, or, even better, including an annotated traceID here, just for clarity.
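For illustration, here is one reading of "bytes 10-16", assuming the count starts at 1 so that the range covers the last 7 of the 16 TraceID bytes. The trace ID below is an arbitrary example value and `randomValue` is a hypothetical helper name.

```go
package main

import "fmt"

// randomValue interprets bytes 10-16 (counting from 1), i.e. the last seven
// bytes of the trace ID, as a big-endian 56-bit unsigned integer.
func randomValue(traceID [16]byte) uint64 {
	var v uint64
	for _, b := range traceID[9:] {
		v = v<<8 | uint64(b)
	}
	return v
}

func main() {
	// Byte positions, counting from 1:
	//  1  2  3  4  5  6  7  8  9 | 10 11 12 13 14 15 16
	// 4b f9 2f 35 77 b3 4d a6 a3 | ce 92 9d 0e 0e 47 36
	id := [16]byte{
		0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6, 0xa3,
		0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36,
	}
	fmt.Printf("random value: %014x\n", randomValue(id))
}
```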

kentquirk · 2023-06-01T14:59:08Z

text/trace/0226-sampling-random-traceids.md

+1. When the r-value is present and parses as a 56-bit random value,
+   use it, otherwise bytes 10-16 of the TraceID are interpreted as a
+   56-bit random value in big-endian byte order
+2. The sampling probability (range `[0x1p-56, 1]`) is multiplied by


I think most readers will be unfamiliar with floating point hex notation (I was) and this is probably needlessly terse. One way to express it would be (0, 1], but that also might be too confusing. Perhaps "greater than 0 and less than or equal to 1" or even 0 < n <= 1?

Similarly below, I might say 2^56 rather than using the hex notation.

kentquirk · 2023-06-01T15:05:26Z

text/trace/0226-sampling-random-traceids.md

+
+### 10% probability sampling twice
+
+The tracestate value `ot=s:0.01` corresponds with 10% sampling by one


Maybe expand this to show how the tracestate would be modified at each stage?

kentquirk · 2023-06-03T18:32:48Z

@jmacd , @kalyanaj , @oertl , @PeterF778

As promised in the SIG meeting, I have implemented a proposal for how to calculate threshold, sampling rate, and sampling probability in several different languages (Go, JavaScript, and Python).

The proposal is simple -- here's the Go version:

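// threshold converts a t-value into a 56-bit sampling threshold. Values
// below 1 are read as probabilities, values of 1 or more as "1 in N" rates.
// Adding 0.5 before truncating rounds to the nearest integer.
func threshold(tValue float64) int64 {
	const k = 0x1p+56 // 2^56
	if tValue < 1.0 {
		return int64(k*tValue + 0.5)
	}
	return int64(k/tValue + 0.5)
}

// samplingRate returns the t-value as a rounded "1 in N" rate.
func samplingRate(tValue float64) int64 {
	if tValue < 1.0 {
		return int64(1.0/tValue + 0.5)
	}
	return int64(tValue + 0.5)
}

// samplingProbability returns the t-value as a probability.
func samplingProbability(tValue float64) float64 {
	if tValue < 1.0 {
		return tValue
	}
	return 1.0 / tValue
}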

All 3 languages seem to deliver identical results in the first 2 functions.

I was unable to convince all 3 languages to format floating point values identically (I probably could have by installing various libraries but I wanted to use the standard libraries). But they seemed to be identical through the first 14 digits.

Thoughts?

oertl · 2023-06-06T08:38:19Z

I wonder if we should prefer a canonical lossless (e.g. hex-encoded and only values less than or equal to 1) representation of the t-value that does not depend on platform- or language-dependent behavior (like the r-value). This would make encoding and parsing much simpler and less costly. For scenarios with multiple sampling stages, the t-value needs to be parsed and encoded frequently, and therefore this should be done as efficiently as possible.

kentquirk · 2023-06-07T12:47:13Z

I understand your point, but I'm personally much more concerned about human usability. What I see from users in real sampling configurations are sampling rates like 1 in 10, 1 in 6, and 1 in 1000.

| rate | hex value < 1 |
| --- | --- |
| 10 | `0x1.999999999999ap-4` |
| 6 | `0x1.5555555555555p-03` |
| 1000 | `0x1.0624dd2f1a9fcp-10` |

All of those representations lose the user's intent, while 10, 6, and 1000 do not.

There is also the problem that as far as I have found, many languages do not support hex representation in floating point. Or if they do, they do so differently.

I strongly feel that we should bias toward comprehensible and easily implementable human representations.

oertl · 2023-06-07T13:52:31Z

I understand your point, but I'm personally much more concerned about human usability. What I see from users in real sampling configurations are sampling rates like 1 in 10, 1 in 6, and 1 in 1000.

I also understand your point. But the limited number of bits simply does not allow sampling exactly 1 out of 3 consistently. In my opinion, it is more important to forward the actual sampling probability applied, not the user's intent. If you know that all your sampling configurations are of the kind "1 out of X", it is relatively easy to reconstruct X from the applied sampling probability to make the user believe that it really was sampled as originally intended. Maybe an additional flag indicating that the reported t-value comes from a "1 out of X" rule could be a compromise.
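Reconstructing X can be sketched with integer arithmetic. The sketch assumes, for illustration, that the applied probability is stored as a keep count out of 2^56, rounded to the nearest integer; the helper names are hypothetical.

```go
package main

import "fmt"

// total is the number of distinct 56-bit random values.
const total = uint64(1) << 56

// keepCountForRate returns round(2^56 / rate): the number of random values
// accepted by a "1 out of rate" rule. rate must be at least 1.
func keepCountForRate(rate uint64) uint64 {
	return (total + rate/2) / rate
}

// rateFromKeepCount returns round(2^56 / keep), recovering X from the
// applied probability. keep must be at least 1.
func rateFromKeepCount(keep uint64) uint64 {
	return (total + keep/2) / keep
}

func main() {
	for _, rate := range []uint64{3, 6, 10, 1000} {
		keep := keepCountForRate(rate)
		fmt.Printf("1 out of %d -> keep %d of 2^56 -> 1 out of %d (exact: %v)\n",
			rate, keep, rateFromKeepCount(keep), keep*rate == total)
	}
}
```

None of these four rates divides 2^56 exactly, yet each is recovered by rounding.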

In large systems, sampling probabilities are typically automatically chosen (e.g. based on rate limits), and it is more important that the parsing/encoding overhead is small.

There is also the problem that as far as I have found, many languages do not support hex representation in floating point. Or if they do, they do so differently.

It is easy to specify a lossless hex representation for the t-value. It could be defined as the integer threshold value used when comparing with the 56 random bits of the trace ID (or the optional r-value). If the t-value is an integer, it is straightforward to find a platform/language-independent hex representation. This definition would also reduce floating point operations as the sampling decision is simply the result of comparing the t-value with the random bits.
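A sketch of what such an integer t-value could look like in code, assuming a fixed-width encoding of 14 hex digits and a keep-when-greater-or-equal comparison. Both are assumptions for the purpose of illustration; `parseThreshold` and `shouldSample` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseThreshold reads a t-value written as exactly 14 hex digits, i.e. a
// 56-bit unsigned integer. The fixed width is an assumption of this sketch.
func parseThreshold(t string) (uint64, error) {
	if len(t) != 14 {
		return 0, fmt.Errorf("t-value %q must have 14 hex digits", t)
	}
	return strconv.ParseUint(t, 16, 64)
}

// shouldSample compares the 56 random bits with the threshold. No floating
// point is involved. The >= direction is an assumption.
func shouldSample(random, threshold uint64) bool {
	return random >= threshold
}

func main() {
	th, err := parseThreshold("80000000000000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(shouldSample(0xce929d0e0e4736, th))
	fmt.Println(shouldSample(0x0e929d0e0e4736, th))
}
```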

jmacd · 2023-06-15T01:53:40Z

I have drafted the introductory text of the changes I would propose in the tracestate-probability-sampling.md spec.

jmacd · 2023-09-06T16:52:34Z

This OTEP has been well-replaced by #235. Thanks @kentquirk!

Non-power-of-two consistent tail sampling draft proposal

4d3b94b

jmacd requested review from a team March 2, 2023 08:46

pr num

c3f1ed2

oertl reviewed Mar 2, 2023

PeterF778 reviewed Mar 3, 2023

PeterF778 reviewed Mar 3, 2023

PeterF778 reviewed Mar 3, 2023

PeterF778 reviewed Mar 4, 2023

yurishkuro reviewed Mar 4, 2023

kalyanaj reviewed Mar 9, 2023

carlosalberto added priority:p1 triaged labels Mar 27, 2023

kalyanaj mentioned this pull request Mar 28, 2023

CR Request for W3C TraceContext Level 2 Specification w3c/transitions#495

Closed

draft updates based on recent SIG discussions

03f693c

drewby mentioned this pull request May 8, 2023

Collect subpolicy sampling data from composite policy open-telemetry/opentelemetry-collector-contrib#20849

Closed

New draft

df6b1d0

oertl reviewed May 11, 2023

kalyanaj reviewed May 15, 2023

jmacd mentioned this pull request May 17, 2023

[probabilisticsamplerprocessor] Support consistent intermediate span sampling (OTEP 226) open-telemetry/opentelemetry-collector-contrib#22058

Closed

atoulme mentioned this pull request May 24, 2023

Provide probabilistic sampling comparable to what is offered by probabilisticsamplerprocessor open-telemetry/opentelemetry-go#4130

Open

jmacd and others added 5 commits May 31, 2023 17:35

draft update

14ad23c

Update text/trace/0226-sampling-random-traceids.md

4380c6b

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

edits

8940b66

Update text/trace/0226-sampling-random-traceids.md

9a5e9ce

Co-authored-by: J. Kalyana Sundaram <kalyanaj@microsoft.com>

from kalyana

cfa1b44

oertl reviewed Jun 1, 2023

kentquirk reviewed Jun 1, 2023

jmacd mentioned this pull request Aug 2, 2023

Probabilistic sampler processor based on draft t-value/r-value encoding open-telemetry/opentelemetry-collector-contrib#24811

Closed

jmacd closed this Sep 6, 2023

yangskyboxlabs mentioned this pull request Sep 17, 2024

Introduce environment variables for propagating context to subprocesses #263

Closed

