Skip to content

Commit

Permalink
Merge branch 'master' into grpc-update
Browse files Browse the repository at this point in the history
  • Loading branch information
ItalyPaleAle authored Nov 7, 2022
2 parents 3a95491 + a738f7f commit 70391ce
Show file tree
Hide file tree
Showing 15 changed files with 303 additions and 50 deletions.
23 changes: 23 additions & 0 deletions docs/release_notes/v1.9.1.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Dapr 1.9.1

### Fixes mDNS name resolver component not working on systems with IPv6 disabled

#### Problem

Users running Dapr in self-hosted mode and relying on the mDNS name resolver (the default when not running in Kubernetes) would encounter errors trying to perform service invocation if the system had IPv6 disabled.

### Impact

This issue impacts users who are:

- running Dapr in self-hosted mode, *and*
- using the default mDNS name resolver, *and*
- have disabled IPv6 on the operating system

#### Root cause

When initializing a zeroconf (mDNS) client, Dapr required using both IPv4 and IPv6, and failed if either one of the two protocols was disabled on the host system.

#### Solution

We have improved error handling so Dapr gracefully falls back to using IPv4 if binding to IPv6 fails.
22 changes: 22 additions & 0 deletions docs/release_notes/v1.9.2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Dapr 1.9.2

### Fixes panics when using pubsub subscriptions or input bindings via gRPC with tracing disabled

#### Problem

Users who disabled tracing and are subscribing to Dapr pubsub components via gRPC or using input bindings via gRPC will encounter panics when an event is attempted to be delivered.

### Impact

This issue impacts users who:

- Disabled tracing by setting samplingRate to "0" in Dapr's configuration (by default, the value is "1")
- Use input bindings via gRPC and/or subscribe to pubsub components via gRPC

#### Root cause

Dapr 1.9.0 added support for tracing with OpenTelemetry, which uses a new SDK version. During the upgrade, an error was introduced causing a panic when tracing was disabled, due to a missing "nil-check".

#### Solution

We have added the missing `nil` checks to correctly handle the case where tracing is disabled.
28 changes: 28 additions & 0 deletions docs/release_notes/v1.9.3.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# Dapr 1.9.3

## Fixes traces not being reported in certain circumstances

### Problem

With tracing enabled, Dapr operators can choose the sampling rate using the `spec.tracing.sampling` option in the Dapr configuration. In Dapr 1.9.0-1.9.2, when a request coming to the Dapr runtime contains a `traceparent` header, the decision on whether to sample the request or not is solely based on the "sampling bit" set in the header's value.

This behavior was introduced with Dapr 1.9.0 during the transition to the new tracing framework based on the OpenTelemetry (OTEL) standard. Although this behavior is compliant with the W3C specs for distributed tracing, it caused Dapr to omit sending traces to the telemetry collector (e.g. Zipkin, Azure Monitor, etc) in many instances.

### Impact

The issue impacts users on Dapr 1.9.0-1.9.2 who have enabled collection of traces.

Based on reports from Dapr users, developers building apps with ASP.NET Core seem to be particularly impacted, as the .NET Core framework can automatically include a `traceparent` header in every request made to the Dapr sidecar which has the "sampling bit" set to `0` (disabled).

### Root cause

Older versions of Dapr (before 1.9.0) ignored the "sampling bit" in the `traceparent` header when making decisions on whether to sample a request. That behavior changed in Dapr 1.9.0, where the decision made by the caller with the `traceparent` header determines the Dapr runtime's sampling choice too.

### Solution

We have patched the way Dapr makes decisions on whether to sample a request and submit the trace to the telemetry collector (e.g. Zipkin, Azure Monitor, etc). The new behavior consists of:

- If the `traceparent` header has the "sampling bit" set to `1` (enabled), the request is always sampled.
- If the "sampling bit" is `0` (disabled), Dapr decides on whether to sample the request based on its own internal policies. For example, with `spec.tracing.sampling` set to `1`, all requests are traced; instead, a value of `0` generates no trace. Numbers in-between 0 and 1 cause Dapr to sample only a fraction of requests.

This behavior, while more lax, remains compliant with the W3C specs for distributed tracing.
14 changes: 7 additions & 7 deletions pkg/client/clientset/versioned/fake/register.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 7 additions & 7 deletions pkg/client/clientset/versioned/scheme/register.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

44 changes: 44 additions & 0 deletions pkg/diagnostics/tracing_sampler.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
package diagnostics

import (
"fmt"

sdktrace "go.opentelemetry.io/otel/sdk/trace"
"go.opentelemetry.io/otel/trace"

diagUtils "github.com/dapr/dapr/pkg/diagnostics/utils"
)

type DaprTraceSampler struct {
sdktrace.Sampler
ProbabilitySampler sdktrace.Sampler
SamplingRate float64
}

/**
* Decisions for the Dapr sampler are as follows:
* - parent has sample flag turned on -> sample
* - parent has sample turned off -> fall back to probability-based sampling
*/
func (d *DaprTraceSampler) ShouldSample(p sdktrace.SamplingParameters) sdktrace.SamplingResult {
psc := trace.SpanContextFromContext(p.ParentContext)
if psc.IsValid() && psc.IsSampled() {
// Parent is valid and specifies sampling flag is on -> sample
return sdktrace.AlwaysSample().ShouldSample(p)
}

// Parent is invalid or does not have sampling enabled -> sample probabilistically
return d.ProbabilitySampler.ShouldSample(p)
}

func (d *DaprTraceSampler) Description() string {
return fmt.Sprintf("DaprTraceSampler(P=%f)", d.SamplingRate)
}

func NewDaprTraceSampler(samplingRateString string) *DaprTraceSampler {
samplingRate := diagUtils.GetTraceSamplingRate(samplingRateString)
return &DaprTraceSampler{
SamplingRate: samplingRate,
ProbabilitySampler: sdktrace.TraceIDRatioBased(samplingRate),
}
}
122 changes: 121 additions & 1 deletion pkg/diagnostics/tracing_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,12 @@ package diagnostics
import (
"context"
"fmt"
"math/rand"
"sync"
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"github.com/dapr/dapr/pkg/config"

Expand Down Expand Up @@ -138,7 +141,7 @@ func TestStartInternalCallbackSpan(t *testing.T) {
assert.NotEqual(t, "00f067aa0ba902b7", fmt.Sprintf("%x", spanID[:]))
})

t.Run("traceparent is provided but sampling is disabled", func(t *testing.T) {
t.Run("traceparent is provided with sampling flag = 1 but sampling is disabled", func(t *testing.T) {
traceSpec := config.TracingSpec{SamplingRate: "0"}

scConfig := trace.SpanContextConfig{
Expand All @@ -154,6 +157,66 @@ func TestStartInternalCallbackSpan(t *testing.T) {
assert.Nil(t, gotSp)
assert.NotNil(t, ctx)
})

t.Run("traceparent is provided with sampling flag = 0 and sampling is enabled (but not P=1.00)", func(t *testing.T) {
numTraces := 100000
sampledCount := runTraces(t, "test_trace", numTraces, "0.01", 0)
// We use a fixed seed for the RNG so we can use an exact number here
require.Equal(t, sampledCount, 1012, "Expected to sample 1012 traces but only sampled %d", sampledCount)
require.Less(t, sampledCount, numTraces, "Expected to sample fewer than the total number of traces, but sampled all of them!")
})

t.Run("traceparent is provided with sampling flag = 0 and sampling is enabled (and P=1.00)", func(t *testing.T) {
numTraces := 1000
sampledCount := runTraces(t, "test_trace", numTraces, "1.00", 0)
require.Equal(t, sampledCount, numTraces, "Expected to sample all traces (%d) but only sampled %d", numTraces, sampledCount)
})

t.Run("traceparent is provided with sampling flag = 1 and sampling is enabled (but not P=1.00)", func(t *testing.T) {
numTraces := 1000
sampledCount := runTraces(t, "test_trace", numTraces, "0.00001", 1)
require.Equal(t, sampledCount, numTraces, "Expected to sample all traces (%d) but only sampled %d", numTraces, sampledCount)
})
}

func runTraces(t *testing.T, testName string, numTraces int, samplingRate string, parentTraceFlag int) int {
d := NewDaprTraceSampler(samplingRate)
tracerOptions := []sdktrace.TracerProviderOption{
sdktrace.WithSampler(d),
}

tp := sdktrace.NewTracerProvider(tracerOptions...)

tracerName := fmt.Sprintf("%s_%s", testName, samplingRate)
otel.SetTracerProvider(tp)
tracer := otel.Tracer(tracerName)

// This is taken from otel's tests for the ratio sampler so we can generate IDs
idg := defaultIDGenerator()
sampledCount := 0

for i := 0; i < numTraces; i++ {
traceID, _ := idg.NewIDs(context.Background())
scConfig := trace.SpanContextConfig{
TraceID: traceID,
SpanID: trace.SpanID{0, 240, 103, 170, 11, 169, 2, 183},
TraceFlags: trace.TraceFlags(parentTraceFlag),
}

parent := trace.NewSpanContext(scConfig)

ctx := context.Background()
ctx = trace.ContextWithRemoteSpanContext(ctx, parent)
ctx, span := tracer.Start(ctx, "testTraceSpan", trace.WithSpanKind(trace.SpanKindClient))
assert.NotNil(t, span)
assert.NotNil(t, ctx)

if span.SpanContext().IsSampled() {
sampledCount += 1
}
}

return sampledCount
}

// This test would allow us to know when the span attribute keys are
Expand Down Expand Up @@ -221,3 +284,60 @@ func (o *otelFakeSpanProcessor) Shutdown(ctx context.Context) error {
func (o *otelFakeSpanProcessor) ForceFlush(ctx context.Context) error {
return nil
}

// This was taken from the otel testing to generate IDs
// origin: go.opentelemetry.io/otel/sdk@v1.11.1/trace/id_generator.go
// Copyright: The OpenTelemetry Authors
// License (Apache 2.0): https://github.com/open-telemetry/opentelemetry-go/blob/sdk/v1.11.1/LICENSE

// IDGenerator allows custom generators for TraceID and SpanID.
type IDGenerator interface {
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.

// NewIDs returns a new trace and span ID.
NewIDs(ctx context.Context) (trace.TraceID, trace.SpanID)
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.

// NewSpanID returns a ID for a new span in the trace with traceID.
NewSpanID(ctx context.Context, traceID trace.TraceID) trace.SpanID
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
}

type randomIDGenerator struct {
sync.Mutex
randSource *rand.Rand
}

var _ IDGenerator = &randomIDGenerator{}

// NewSpanID returns a non-zero span ID from a randomly-chosen sequence.
func (gen *randomIDGenerator) NewSpanID(ctx context.Context, traceID trace.TraceID) trace.SpanID {
gen.Lock()
defer gen.Unlock()
sid := trace.SpanID{}
_, _ = gen.randSource.Read(sid[:])
return sid
}

// NewIDs returns a non-zero trace ID and a non-zero span ID from a
// randomly-chosen sequence.
func (gen *randomIDGenerator) NewIDs(ctx context.Context) (trace.TraceID, trace.SpanID) {
gen.Lock()
defer gen.Unlock()
tid := trace.TraceID{}
_, _ = gen.randSource.Read(tid[:])
sid := trace.SpanID{}
_, _ = gen.randSource.Read(sid[:])
return tid, sid
}

func defaultIDGenerator() IDGenerator {
gen := &randomIDGenerator{
// Use a fixed seed to make the tests deterministic.
randSource: rand.New(rand.NewSource(1)), //nolint:gosec
}
return gen
}
5 changes: 0 additions & 5 deletions pkg/diagnostics/utils/trace_utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,6 @@ func GetTraceSamplingRate(rate string) float64 {
return f
}

// TraceSampler returns Probability Sampler option.
func TraceSampler(samplingRate string) sdktrace.Sampler {
return sdktrace.ParentBased(sdktrace.TraceIDRatioBased(GetTraceSamplingRate(samplingRate)))
}

// IsTracingEnabled parses the given rate and returns false if sampling rate is explicitly set 0.
func IsTracingEnabled(rate string) bool {
return GetTraceSamplingRate(rate) != 0
Expand Down
3 changes: 2 additions & 1 deletion pkg/grpc/proxy/testserviceV1/test.v1.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit 70391ce

Please sign in to comment.