Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/trace/api/otlp: support APM peer.service in OTLP #16490

Merged
merged 10 commits into from
Apr 27, 2023

Conversation

songy23
Copy link
Member

@songy23 songy23 commented Apr 7, 2023

What does this PR do?

Implement OTLP support for APM peer.service in trace agent:

  • Apply the peer.service defaults in cases where peer.service is missing.
    • Set the peer.service provenance tag _dd.peer.service.source={source_attribute}
  • Stop the current behavior where the service={peer.service} if peer.service is non-empty
  • Ensure span.kind is set correctly as the agent relies on this attribute to decide whether or not to calculate stats.

Motivation

To be compliant with the APM peer.service RFC and implementation (#16103).

Additional Notes

Possible Drawbacks / Trade-offs

This will be a breaking change to users who depend on overriding Service with span attribute peer.service.

Describe how to test/QA your changes

  1. Send an OTLP span with attribute peer.service to Agent, then verify the DD span meta tag _dd.peer.service.source is set and DD Service isn't overridden with peer.service.
  2. Send an OTLP span with attribute net.peer.name or any other default key to Agent, then verify the DD span meta tag
    peer.service and _dd.peer.service.source are set.
  3. In any of the spans above, verify meta tag span.kind is set.

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be use during QA to make sure each team pay extra attention to the changed behavior. For any customer facing change use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature.
  • If applicable, the config template has been updated.

@songy23 songy23 added component/otlp PRs and issues related to OTLP ingest team/opentelemetry OpenTelemetry team labels Apr 7, 2023
@songy23 songy23 added this to the 7.45.0 milestone Apr 7, 2023
@songy23 songy23 requested a review from jdgumz April 7, 2023 21:04
@songy23 songy23 requested review from a team as code owners April 7, 2023 21:04
@gbbr gbbr changed the title [pkg/trace/api/otlp] Support APM peer.service in OTLP pkg/trace/api/otlp: support APM peer.service in OTLP Apr 10, 2023
pkg/trace/api/otlp.go Outdated Show resolved Hide resolved
releasenotes/notes/otlp-peer-service-15c0b4a105.yaml Outdated Show resolved Hide resolved
Copy link
Member Author

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdgumz Just noticed that the RFC proposed to set _dd.peer.service.source while #16103 sets peer.service. Could you confirm what is the expected behavior?

pkg/trace/api/otlp.go Outdated Show resolved Hide resolved
releasenotes/notes/otlp-peer-service-15c0b4a105.yaml Outdated Show resolved Hide resolved
pkg/trace/api/otlp.go Outdated Show resolved Hide resolved
@songy23 songy23 modified the milestones: 7.45.0, 7.46.0 Apr 14, 2023
pkg/trace/api/otlp.go Outdated Show resolved Hide resolved
pkg/trace/api/otlp.go Outdated Show resolved Hide resolved
@pr-commenter
Copy link

pr-commenter bot commented Apr 17, 2023

Bloop Bleep... Dogbot Here

Regression Detector Results

Run ID: 3c2d7410-68d2-44dd-aa2c-429638ec4d87
Baseline: b9b2212
Comparison: 7d4dc82
Total datadog-agent CPUs: 7

Explanation

A regression test is an integrated performance test for datadog- agent in a repeatable rig, with varying configuration for datadog- agent. What follows is a statistical summary of a brief datadog- agent run for each configuration across SHAs given above. The goal of these tests are to determine quickly if datadog-agent performance is changed and to what degree by a pull request.

The table below, if present, lists those experiments that have experienced a statistically significant change in mean optimization goal performance between baseline and comparison SHAs with 90.00% confidence OR have been detected as newly erratic. Negative values mean that baseline is faster, positive comparison. Results that do not exhibit more than a ±5.00% change in their mean optimization goal are discarded. An experiment is erratic if its coefficient of variation is greater than 0.1. The abbreviated table will be omitted if no interesting change is observed.

No interesting changes in experiment optimization goals with confidence ≥ 90.00% and |Δ mean %| ≥ 5.00%.

Fine details of change detection per experiment.
experiment goal Δ mean Δ mean % confidence baseline mean baseline stdev baseline stderr baseline outlier % baseline CoV comparison mean comparison stdev comparison stderr comparison outlier % comparison CoV erratic declared erratic
otel_to_otel_logs ingress throughput 14.1MiB/CPU-s 1.85 100.00% 761.7MiB/CPU-s 44.26MiB/CPU-s 584.71KiB/CPU-s 0.0 0.058102 775.8MiB/CPU-s 34.8MiB/CPU-s 459.78KiB/CPU-s 0.0 0.044857 False False
uds_dogstatsd_to_api ingress throughput 1.12KiB/CPU-s 1.47 92.65% 76.07KiB/CPU-s 34.28KiB/CPU-s 450.39B/CPU-s 0.0 0.450682 77.18KiB/CPU-s 34.42KiB/CPU-s 452.08B/CPU-s 0.0 0.445868 True False
tcp_syslog_to_blackhole ingress throughput -3.55KiB/CPU-s -0.05 44.17% 7.07MiB/CPU-s 306.33KiB/CPU-s 3.95KiB/CPU-s 0.0 0.042329 7.06MiB/CPU-s 357.17KiB/CPU-s 4.61KiB/CPU-s 0.0 0.049378 False False
file_to_blackhole egress throughput -2.94B/CPU-s -0.29 47.79% 1.0KiB/CPU-s 85.35B/CPU-s 2.93B/CPU-s 0.0 0.083124 1023.22B/CPU-s 102.59B/CPU-s 3.52B/CPU-s 0.0 0.100205 True False
tcp_dd_logs_filter_exclude ingress throughput -1.62MiB/CPU-s -1.02 100.00% 158.8MiB/CPU-s 8.71MiB/CPU-s 114.23KiB/CPU-s 0.0 0.054818 157.18MiB/CPU-s 11.33MiB/CPU-s 148.66KiB/CPU-s 0.0 0.072073 False False

@songy23
Copy link
Member Author

songy23 commented Apr 26, 2023

Finally was able to build a local agent (thanks to @dineshg13 !). Tested in staging that non top level OTLP spans also get trace metrics:

Screenshot 2023-04-26 at 3 55 27 PM

{
  "org_id":47678,
  "trace_id":"7817017947528822850",
  "span_id":"10739693701756824938",
  "parent_id":"15283232591715693939",
  "start":1682538833.124398,
  "end":1682538833.124401,
  "duration":0.000003291,
  "type":"http",
  "service":"java-test-app",
  "name":"TestSpan2",
  "resource":"TestSpan2",
  "resource_hash":"46bd36d900e12da6",
  "host_id":3013233129,
  "hostname":"COMP-YN6VVDJCY6",
  "env":"yang-otlp-test",
  "host_groups":[
    "env:yang-otlp-test"
  ],
  "meta":{
    "_dd.filter.type":"spans-sampling-processor",
    "_dd.agent_version":"7.45.0-rc.1+git.12.7d4dc82",
    "otel.status_code":"Unset",
    "env":"yang-otlp-test",
    "_dd.agent_rare_sampler.enabled":"false",
    "_dd.agent_hostname":"COMP-YN6VVDJCY6",
    "_dd.peer.service.source":"peer.service",
    "_dd.ingestion_reason":"otel",
    "_dd.p.dm":"-9",
    "span.kind":"client",
    "_dd.tracer_version":"otlp-",
    "_dd.filter.id":"w2wqZjVmSx22li16Kl5LDA",
    "otel.library.name":"io.opentelemetry.javaagent.spring",
    "peer.service":"peer-svc",
    "otel.library.version":"1.0.0",
    "_dd.hostname":"COMP-YN6VVDJCY6",
    "otel.trace_id":"ee5953f4ed09b50f6c7ba0682d4f0042",
    "deployment.environment":"yang-otlp-test"
  },
  "metrics":{
    "_dd.filter.kept":1,
    "_sampling_priority_v1":1,
    "_dd1.sr.esusr":1,
    "_dd.agent_errors_sampler.target_tps":10,
    "_top_level":1,
    "_dd.otlp_sr":1,
    "_dd.agent_priority_sampler.target_tps":10
  },
  "ingestion_reason":"otel",
  "children_ids":[
    
  ]
}

@songy23 songy23 merged commit 20bc179 into main Apr 27, 2023
@songy23 songy23 deleted the yang.song/otlp-peer-svc branch April 27, 2023 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/otlp PRs and issues related to OTLP ingest team/opentelemetry OpenTelemetry team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants