[connector/datadog] Add trace configs that mirror datadog exporter (open-telemetry#30787)

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
songy23 and mx-psi authored Jan 30, 2024
1 parent 2f55fec commit 9b4caa8
Showing 13 changed files with 1,027 additions and 26 deletions.
33 changes: 33 additions & 0 deletions .chloggen/dd-connector-configs.yaml
@@ -0,0 +1,33 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: datadogconnector

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add trace configs that mirror datadog exporter

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [30787]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  ignore_resources: disable certain traces based on their resource name
  span_name_remappings: map of datadog span names and preferred name to map to
  span_name_as_resource_name: use OTLP span name as datadog operation name
  compute_stats_by_span_kind: enables an additional stats computation check based on span kind
  peer_tags_aggregation: enables aggregation of peer related tags
  trace_buffer: specifies the buffer size for datadog trace payloads
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: []
60 changes: 60 additions & 0 deletions connector/datadogconnector/README.md
@@ -106,6 +106,66 @@ service:
Here we have two traces pipelines that ingest the same data, but only one of them is sampled. The sampled pipeline sends its data to the Datadog backend, where you can see the sampled subset of the traces. The unsampled pipeline sends its data to the metrics pipeline to be used in the APM stats, which gives the full picture of how much trace data the application emits.
## Configurations
```yaml
connectors:
  datadog/connector:
    traces:
      ## @param ignore_resources - list of strings - optional
      ## A blocklist of regular expressions can be provided to disable certain traces based on their resource name.
      ## All entries must be surrounded by double quotes and separated by commas.
      #
      # ignore_resources: ["(GET|POST) /healthcheck"]

      ## @param span_name_remappings - map of key/value pairs - optional
      ## A map of Datadog span operation name keys and preferred name values to update those names to. This can be used to
      ## automatically map Datadog Span Operation Names to an updated value, and is useful when a user wants to
      ## shorten or modify span names to something more user friendly in the case of instrumentation libraries with
      ## particularly verbose names.
      #
      # span_name_remappings:
      #   io.opentelemetry.javaagent.spring.client: spring.client
      #   instrumentation:express.server: express
      #   go.opentelemetry.io_contrib_instrumentation_net_http_otelhttp.client: http.client

      ## @param span_name_as_resource_name - use OpenTelemetry semantic convention for span naming - optional
      ## Option created to maintain similarity with the OpenTelemetry semantic conventions as discussed in the issue below.
      ## https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions
      ## https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/1909
      #
      # span_name_as_resource_name: true

      ## @param compute_stats_by_span_kind - enables APM stats computation based on `span.kind` - optional
      ## If set to true, enables an additional stats computation check on spans to see if they have an eligible `span.kind` (server, consumer, client, producer).
      ## If enabled, a span with an eligible `span.kind` will have stats computed. If disabled, only top-level and measured spans will have stats computed.
      ## NOTE: For stats computed from OTel traces, only top-level spans are considered when this option is off.
      #
      # compute_stats_by_span_kind: true

      ## @param peer_tags_aggregation - enables aggregation of peer related tags in Datadog connector - optional
      ## If set to true, enables aggregation of peer related tags (e.g., `peer.service`, `db.instance`, etc.) in Datadog connector.
      ## If disabled, aggregated trace stats will not include these tags as dimensions on trace metrics.
      ## For the best experience with peer tags, Datadog also recommends enabling `compute_stats_by_span_kind`.
      ## If you are using an OTel tracer, it's best to have both enabled because client/producer spans with relevant peer tags
      ## may not be marked by Datadog connector as top-level spans.
      ## If enabling both causes Datadog connector to consume too many resources, try disabling `compute_stats_by_span_kind` first.
      ## A high cardinality of peer tags or APM resources can also contribute to higher CPU and memory consumption.
      ## You can check for the cardinality of these fields by making trace search queries in the Datadog UI.
      ## The default list of peer tags can be found in https://github.com/DataDog/datadog-agent/blob/main/pkg/trace/stats/concentrator.go.
      #
      # peer_tags_aggregation: false

      ## @param trace_buffer - specifies the number of outgoing trace payloads to buffer before dropping - optional
      ## If unset, the default value is 1000.
      ## If you start seeing log messages like `Payload in channel full. Dropped 1 payload.` in the datadog exporter, consider
      ## setting a higher `trace_buffer` to avoid traces being dropped.
      #
      # trace_buffer: 1000
```
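
To make the first two options concrete, here is a minimal sketch in Go of what `ignore_resources` and `span_name_remappings` express. It is not the Datadog trace agent's implementation: the helper names `compileFilters`, `isIgnored`, and `remapSpanName` are hypothetical, and unanchored regular expression matching is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileFilters compiles every ignore_resources entry and stops at the
// first invalid pattern. Hypothetical helper, for illustration only.
func compileFilters(patterns []string) ([]*regexp.Regexp, error) {
	filters := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		filters = append(filters, re)
	}
	return filters, nil
}

// isIgnored reports whether any filter matches the resource name.
// Assumption: matching is unanchored (a match anywhere in the name counts).
func isIgnored(filters []*regexp.Regexp, resource string) bool {
	for _, re := range filters {
		if re.MatchString(resource) {
			return true
		}
	}
	return false
}

// remapSpanName returns the preferred name when one is configured and the
// original name otherwise.
func remapSpanName(remappings map[string]string, name string) string {
	if preferred, ok := remappings[name]; ok {
		return preferred
	}
	return name
}

func main() {
	filters, err := compileFilters([]string{"(GET|POST) /healthcheck"})
	if err != nil {
		panic(err)
	}
	fmt.Println(isIgnored(filters, "GET /healthcheck")) // true
	fmt.Println(isIgnored(filters, "GET /users"))       // false

	remappings := map[string]string{"instrumentation:express.server": "express"}
	fmt.Println(remapSpanName(remappings, "instrumentation:express.server")) // express
	fmt.Println(remapSpanName(remappings, "custom.span"))                    // custom.span
}
```

Names without a configured remapping pass through unchanged, so the map only needs entries for the verbose names you want to shorten.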

**NOTE**: `compute_stats_by_span_kind` and `peer_tags_aggregation` only work when the feature gate `connector.datadogconnector.performance` is enabled. See below for details on this feature gate.
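
The eligibility rule described for `compute_stats_by_span_kind` can be sketched as a small predicate. This is only an illustration of the documented behavior under simplifying assumptions: the `span` type and the `statsComputed` function are hypothetical and do not come from the Datadog agent code.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for a Datadog span.
type span struct {
	Kind     string // value of the `span.kind` tag
	TopLevel bool
	Measured bool
}

// eligibleKinds lists the span kinds named in the option's documentation.
var eligibleKinds = map[string]bool{
	"server":   true,
	"consumer": true,
	"client":   true,
	"producer": true,
}

// statsComputed reports whether stats would be computed for a span.
// Top-level and measured spans always qualify; other spans qualify only
// when the option is enabled and their kind is eligible.
func statsComputed(s span, computeStatsBySpanKind bool) bool {
	if s.TopLevel || s.Measured {
		return true
	}
	return computeStatsBySpanKind && eligibleKinds[s.Kind]
}

func main() {
	clientSpan := span{Kind: "client"}
	fmt.Println(statsComputed(clientSpan, false)) // false
	fmt.Println(statsComputed(clientSpan, true))  // true

	internalSpan := span{Kind: "internal"}
	fmt.Println(statsComputed(internalSpan, true)) // false
}
```

This shows why the option matters for `peer_tags_aggregation`: a client span that is not top-level contributes no stats, and therefore no peer tags, unless the span kind check is enabled.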

## Feature Gate for Performance

If you are experiencing high memory usage with the Datadog Connector, similar to [this issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/29755), use the feature gate `connector.datadogconnector.performance`. With the feature gate enabled, the Datadog Connector takes OTLP traces and produces an OTLP metric named `dd.internal.stats.payload`. This metric has an attribute `dd.internal.stats.payload` that contains the bytes of the StatsPayload. With the feature gate enabled, the Datadog Connector can only be used in conjunction with the Datadog Exporter. Please enable the feature gate only if needed for performance reasons and higher throughput. Enable it on all collectors (especially in gateway deployments) in the pipeline that sends data to Datadog. We plan to refactor this component in the future so that the signals produced are usable in any metrics pipeline.
90 changes: 90 additions & 0 deletions connector/datadogconnector/config.go
@@ -0,0 +1,90 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package datadogconnector // import "github.com/open-telemetry/opentelemetry-collector-contrib/connector/datadogconnector"

import (
	"fmt"
	"regexp"

	"go.opentelemetry.io/collector/component"
)

var _ component.Config = (*Config)(nil)

// Config defines configuration for the Datadog connector.
type Config struct {
	// Traces defines the Traces specific configuration
	Traces TracesConfig `mapstructure:"traces"`
}

// TracesConfig defines the traces specific configuration options
type TracesConfig struct {
	// ignored resources
	// A blocklist of regular expressions can be provided to disable certain traces based on their resource name
	// all entries must be surrounded by double quotes and separated by commas.
	// ignore_resources: ["(GET|POST) /healthcheck"]
	IgnoreResources []string `mapstructure:"ignore_resources"`

	// SpanNameRemappings is the map of datadog span names and preferred name to map to. This can be used to
	// automatically map Datadog Span Operation Names to an updated value. All entries should be key/value pairs.
	// span_name_remappings:
	//   io.opentelemetry.javaagent.spring.client: spring.client
	//   instrumentation:express.server: express
	//   go.opentelemetry.io_contrib_instrumentation_net_http_otelhttp.client: http.client
	SpanNameRemappings map[string]string `mapstructure:"span_name_remappings"`

	// If set to true the OpenTelemetry span name will be used in the Datadog resource name.
	// If set to false the resource name will be filled with the instrumentation library name + span kind.
	// The default value is `false`.
	SpanNameAsResourceName bool `mapstructure:"span_name_as_resource_name"`

	// If set to true, enables an additional stats computation check on spans to see if they have an eligible `span.kind` (server, consumer, client, producer).
	// If enabled, a span with an eligible `span.kind` will have stats computed. If disabled, only top-level and measured spans will have stats computed.
	// NOTE: For stats computed from OTel traces, only top-level spans are considered when this option is off.
	ComputeStatsBySpanKind bool `mapstructure:"compute_stats_by_span_kind"`

	// If set to true, enables aggregation of peer related tags (e.g., `peer.service`, `db.instance`, etc.) in the datadog connector.
	// If disabled, aggregated trace stats will not include these tags as dimensions on trace metrics.
	// For the best experience with peer tags, Datadog also recommends enabling `compute_stats_by_span_kind`.
	// If you are using an OTel tracer, it's best to have both enabled because client/producer spans with relevant peer tags
	// may not be marked by the datadog connector as top-level spans.
	// If enabling both causes the datadog connector to consume too many resources, try disabling `compute_stats_by_span_kind` first.
	// A high cardinality of peer tags or APM resources can also contribute to higher CPU and memory consumption.
	// You can check for the cardinality of these fields by making trace search queries in the Datadog UI.
	// The default list of peer tags can be found in https://github.com/DataDog/datadog-agent/blob/main/pkg/trace/stats/concentrator.go.
	PeerTagsAggregation bool `mapstructure:"peer_tags_aggregation"`

	// TraceBuffer specifies the number of Datadog Agent TracerPayloads to buffer before dropping.
	// The default value is 1000.
	TraceBuffer int `mapstructure:"trace_buffer"`
}

// Validate the configuration for errors. This is required by component.Config.
func (c *Config) Validate() error {
	if c.Traces.IgnoreResources != nil {
		for _, entry := range c.Traces.IgnoreResources {
			_, err := regexp.Compile(entry)
			if err != nil {
				return fmt.Errorf("%q is not valid resource filter regular expression", entry)
			}
		}
	}

	if c.Traces.SpanNameRemappings != nil {
		for key, value := range c.Traces.SpanNameRemappings {
			if value == "" {
				return fmt.Errorf("%q is not valid value for span name remapping", value)
			}
			if key == "" {
				return fmt.Errorf("%q is not valid key for span name remapping", key)
			}
		}
	}

	if c.Traces.TraceBuffer < 0 {
		return fmt.Errorf("Trace buffer must be non-negative")
	}

	return nil
}
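
The three checks in `Validate` can be exercised in isolation with a trimmed copy of the config. The sketch below is a standalone illustration, not part of the commit: `tracesConfig` and `validate` are hypothetical stand-ins that reuse the same error messages so the output is comparable with the expectations in `config_test.go`.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// tracesConfig is a trimmed, hypothetical copy of TracesConfig that keeps
// only the fields Validate inspects.
type tracesConfig struct {
	IgnoreResources    []string
	SpanNameRemappings map[string]string
	TraceBuffer        int
}

// validate mirrors the checks in Config.Validate: every ignore_resources
// entry must compile, remapping keys and values must be non-empty, and
// trace_buffer must not be negative.
func validate(c tracesConfig) error {
	for _, entry := range c.IgnoreResources {
		if _, err := regexp.Compile(entry); err != nil {
			return fmt.Errorf("%q is not valid resource filter regular expression", entry)
		}
	}
	for key, value := range c.SpanNameRemappings {
		if value == "" {
			return fmt.Errorf("%q is not valid value for span name remapping", value)
		}
		if key == "" {
			return fmt.Errorf("%q is not valid key for span name remapping", key)
		}
	}
	if c.TraceBuffer < 0 {
		return errors.New("Trace buffer must be non-negative")
	}
	return nil
}

func main() {
	configs := []tracesConfig{
		{IgnoreResources: []string{"(GET|POST) /healthcheck"}, TraceBuffer: 10},
		{IgnoreResources: []string{"[123"}},
		{SpanNameRemappings: map[string]string{"oldname": ""}},
		{TraceBuffer: -10},
	}
	for _, c := range configs {
		fmt.Println(validate(c))
	}
}
```

Note that a `trace_buffer` of `0` passes validation; only negative values are rejected.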
78 changes: 78 additions & 0 deletions connector/datadogconnector/config_test.go
@@ -0,0 +1,78 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package datadogconnector

import (
"testing"

"github.com/stretchr/testify/assert"
)

func TestValidate(t *testing.T) {

tests := []struct {
name string
cfg *Config
err string
}{
{
name: "span name remapping valid",
cfg: &Config{
Traces: TracesConfig{
SpanNameRemappings: map[string]string{"old.opentelemetryspan.name": "updated.name"},
},
},
},
{
name: "span name remapping empty val",
cfg: &Config{Traces: TracesConfig{
SpanNameRemappings: map[string]string{"oldname": ""},
}},
err: "\"\" is not valid value for span name remapping",
},
{
name: "span name remapping empty key",
cfg: &Config{Traces: TracesConfig{
SpanNameRemappings: map[string]string{"": "newname"},
}},
err: "\"\" is not valid key for span name remapping",
},
{
name: "ignore resources valid",
cfg: &Config{Traces: TracesConfig{
IgnoreResources: []string{"[123]"},
}},
},
{
name: "ignore resources missing bracket",
cfg: &Config{Traces: TracesConfig{
IgnoreResources: []string{"[123"},
}},
err: "\"[123\" is not valid resource filter regular expression",
},
{
name: "With trace_buffer",
cfg: &Config{Traces: TracesConfig{
TraceBuffer: 10,
}},
},
{
name: "neg trace_buffer",
cfg: &Config{Traces: TracesConfig{
TraceBuffer: -10,
}},
err: "Trace buffer must be non-negative",
},
}
for _, testInstance := range tests {
t.Run(testInstance.name, func(t *testing.T) {
err := testInstance.cfg.Validate()
if testInstance.err != "" {
assert.EqualError(t, err, testInstance.err)
} else {
assert.NoError(t, err)
}
})
}
}
18 changes: 16 additions & 2 deletions connector/datadogconnector/connector.go
@@ -8,6 +8,7 @@ import (
"fmt"

pb "github.com/DataDog/datadog-agent/pkg/proto/pbgo/trace"
traceconfig "github.com/DataDog/datadog-agent/pkg/trace/config"
"github.com/DataDog/opentelemetry-mapping-go/pkg/otlp/attributes"
"github.com/DataDog/opentelemetry-mapping-go/pkg/otlp/metrics"
"go.opentelemetry.io/collector/component"
@@ -45,7 +46,7 @@ type connectorImp struct {
var _ component.Component = (*connectorImp)(nil) // testing that the connectorImp properly implements the type Component interface

// function to create a new connector
-func newConnector(set component.TelemetrySettings, _ component.Config, metricsConsumer consumer.Metrics, tracesConsumer consumer.Traces) (*connectorImp, error) {
+func newConnector(set component.TelemetrySettings, cfg component.Config, metricsConsumer consumer.Metrics, tracesConsumer consumer.Traces) (*connectorImp, error) {
set.Logger.Info("Building datadog connector")

in := make(chan *pb.StatsPayload, 100)
@@ -62,7 +63,7 @@ func newConnector(set component.TelemetrySettings, _ component.Config, metricsCo
ctx := context.Background()
return &connectorImp{
logger: set.Logger,
-agent: datadog.NewAgent(ctx, in),
+agent: datadog.NewAgentWithConfig(ctx, getTraceAgentCfg(cfg.(*Config).Traces), in),
translator: trans,
in: in,
metricsConsumer: metricsConsumer,
@@ -71,6 +72,19 @@ func newConnector(set component.TelemetrySettings, _ component.Config, metricsCo
}, nil
}

func getTraceAgentCfg(cfg TracesConfig) *traceconfig.AgentConfig {
	acfg := traceconfig.New()
	acfg.OTLPReceiver.SpanNameRemappings = cfg.SpanNameRemappings
	acfg.OTLPReceiver.SpanNameAsResourceName = cfg.SpanNameAsResourceName
	acfg.Ignore["resource"] = cfg.IgnoreResources
	acfg.ComputeStatsBySpanKind = cfg.ComputeStatsBySpanKind
	acfg.PeerTagsAggregation = cfg.PeerTagsAggregation
	if v := cfg.TraceBuffer; v > 0 {
		acfg.TraceBuffer = v
	}
	return acfg
}

// Start implements the component.Component interface.
func (c *connectorImp) Start(_ context.Context, _ component.Host) error {
c.logger.Info("Starting datadogconnector")
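
The `TraceBuffer` handling in `getTraceAgentCfg` only overrides the agent's value when the user supplies a positive number, so an unset (zero) `trace_buffer` leaves the existing value in place. A minimal sketch of that pattern follows. The `agentConfig` type, `newAgentConfig`, and `applyTraceBuffer` are hypothetical stand-ins, and the starting value of 1000 is an assumption taken from the README's stated default rather than from `traceconfig.New()`.

```go
package main

import "fmt"

// agentConfig is a hypothetical stand-in for traceconfig.AgentConfig.
type agentConfig struct {
	TraceBuffer int
}

// defaultTraceBuffer is an assumed default, taken from the README text
// "If unset, the default value is 1000".
const defaultTraceBuffer = 1000

// newAgentConfig returns a config pre-populated with the assumed default.
func newAgentConfig() *agentConfig {
	return &agentConfig{TraceBuffer: defaultTraceBuffer}
}

// applyTraceBuffer overrides the buffer size only for positive values,
// treating zero as "not set by the user".
func applyTraceBuffer(acfg *agentConfig, userValue int) {
	if userValue > 0 {
		acfg.TraceBuffer = userValue
	}
}

func main() {
	unset := newAgentConfig()
	applyTraceBuffer(unset, 0)
	fmt.Println(unset.TraceBuffer) // 1000

	custom := newAgentConfig()
	applyTraceBuffer(custom, 5000)
	fmt.Println(custom.TraceBuffer) // 5000
}
```

Because negative values are rejected earlier by `Validate`, the `> 0` guard in practice distinguishes only between an unset value and an explicit positive one.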
66 changes: 66 additions & 0 deletions connector/datadogconnector/example_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package datadogconnector

import (
"testing"

"github.com/stretchr/testify/require"
"go.opentelemetry.io/collector/connector"
"go.opentelemetry.io/collector/exporter"
"go.opentelemetry.io/collector/exporter/debugexporter"
"go.opentelemetry.io/collector/otelcol"
"go.opentelemetry.io/collector/otelcol/otelcoltest"
"go.opentelemetry.io/collector/processor"
"go.opentelemetry.io/collector/processor/batchprocessor"
"go.opentelemetry.io/collector/receiver"
"go.opentelemetry.io/collector/receiver/otlpreceiver"

"github.com/open-telemetry/opentelemetry-collector-contrib/exporter/datadogexporter"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor"
)

func TestExamples(t *testing.T) {
t.Setenv("DD_API_KEY", "testvalue")
factories := newTestComponents(t)
const configFile = "./examples/config.yaml"
_, err := otelcoltest.LoadConfigAndValidate(configFile, factories)
require.NoError(t, err, "All yaml config must validate. Please ensure that all necessary component factories are added in newTestComponents()")
}

// newTestComponents returns the minimum amount of components necessary for
// running a collector with any of the examples/* yaml configuration files.
func newTestComponents(t *testing.T) otelcol.Factories {
var (
factories otelcol.Factories
err error
)
factories.Receivers, err = receiver.MakeFactoryMap(
[]receiver.Factory{
otlpreceiver.NewFactory(),
}...,
)
require.NoError(t, err)
factories.Processors, err = processor.MakeFactoryMap(
[]processor.Factory{
batchprocessor.NewFactory(),
tailsamplingprocessor.NewFactory(),
}...,
)
require.NoError(t, err)
factories.Connectors, err = connector.MakeFactoryMap(
[]connector.Factory{
NewFactory(),
}...,
)
require.NoError(t, err)
factories.Exporters, err = exporter.MakeFactoryMap(
[]exporter.Factory{
datadogexporter.NewFactory(),
debugexporter.NewFactory(),
}...,
)
require.NoError(t, err)
return factories
}