feat: `service.target.*` to improve backend granularity #2882

trentm · 2022-08-17T19:57:32Z

This adds the new span.context.service.target.{type,name} fields for improved granularity on backend services data in exit spans. This data is used for "Service Maps" and "Dependencies" in the Kibana APM app.

All instrumentations have been updated to set appropriate service target values. service.target.* is typically inferred automatically from other span data, so much of the instrumentation work was in adding other span fields, most commonly span.context.db.instance. The one current exception to the general inference algorithm is S3 spans, for which the spec'd service.target.name doesn't follow the general pattern.
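As a rough illustration of the inference described above, here is a minimal sketch. It is not the code in "lib/instrumentation/span.js": the function name `inferServiceTarget`, the plain-object span shape, and the subtype-before-type fallback are assumptions made for this example, and only the `db.instance` source for the name is covered.

```javascript
// Hypothetical, simplified sketch of service.target.* inference for exit spans.
const DEFAULT_SPAN_TYPE = 'custom'

function inferServiceTarget (span) {
  if (!span.isExit) {
    return null // service.target.* only applies to exit spans
  }
  const explicit = span.serviceTarget || {} // values set via setServiceTarget()
  const target = {}
  // Assumed fallback order: explicit type, then subtype, then type, then default.
  target.type = explicit.type || span.subtype || span.type || DEFAULT_SPAN_TYPE
  // Prefer an explicit name, else use context.db.instance if available.
  const db = span.context && span.context.db
  const name = explicit.name || (db && db.instance)
  if (name) {
    target.name = name
  }
  return target
}
```

With this shape, a mysql exit span that carries `context.db.instance` gets both a type and a name, while a span with no other data only gets a type.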

A new, public span.setServiceTarget(type, name) API has been added (this is a "SHOULD" in the spec).
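For manual instrumentation, the new API might be used along these lines. This is a sketch only: `describeBackend` is a hypothetical helper, and the commented usage assumes an agent version that supports the `exitSpan` option to `apm.startSpan()`.

```javascript
// Hypothetical helper for manually instrumenting a backend that the agent
// does not instrument itself. `span` is expected to be an exit span.
function describeBackend (span, type, name) {
  // Guard: startSpan() can return null, and older agents lack this method.
  if (span && typeof span.setServiceTarget === 'function') {
    span.setServiceTarget(type, name)
    return true
  }
  return false
}

// Possible usage with the agent (not run here):
//   const apm = require('elastic-apm-node')
//   const span = apm.startSpan('SELECT FROM customers', 'db', 'oracle', 'query', { exitSpan: true })
//   describeBackend(span, 'oracle', 'ORCL')
//   // ... perform the query ...
//   if (span) span.end()
```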

The new fields will replace the (now deprecated) span.context.destination.service.* fields. In the current stage of transition:

- destination.service.{type,name} are set to the empty string. They are no longer used, but the intake API v2 schema up to 7.13 requires them to be set.
- Setting destination.service.resource directly is discouraged. Typically it is inferred from service.target.* values when a span is ended. Again, the one exception is S3 spans.
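The resource inference can be pictured with the following sketch. It assumes the general pattern is "type/name" with a fallback to whichever part is present, and it models the S3 exception as an explicit override. The function name and parameters are hypothetical, and other special cases in the spec are omitted.

```javascript
// Hypothetical sketch: derive destination.service.resource from service.target.*.
// `override` stands in for an internally supplied resource (e.g. for S3 spans).
function inferDestinationResource (target, override) {
  if (override) {
    return override // special cases bypass the general pattern
  }
  if (!target) {
    return undefined
  }
  const { type, name } = target
  if (!name) return type
  if (!type) return name
  return `${type}/${name}`
}
```

For example, a target of `{ type: 'mysql', name: 'orders' }` yields "mysql/orders", while an S3 span can pass its bucket name as the override to avoid the "s3/" prefix.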

The not-public-but-available-because-JavaScript span.setDestinationContext has been deprecated (using it will process.emitWarning()) and replaced with an internal span._setDestinationContext().
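The deprecate-and-delegate pattern described here looks roughly like the sketch below. `ExampleSpan` and the warning code `EXAMPLE_DEP001` are placeholders for illustration, and the assumption that the deprecated method still forwards to the internal one is mine, not taken from the agent source.

```javascript
// Illustrative stand-in for the agent's Span; not the real class.
class ExampleSpan {
  constructor () {
    this._destination = null
  }

  // Internal variant: shallow-merges new fields into any existing context.
  _setDestinationContext (destCtx) {
    this._destination = Object.assign(this._destination || {}, destCtx)
  }

  // Deprecated variant: emit a warning, then delegate to the internal method.
  setDestinationContext (destCtx) {
    process.emitWarning(
      '<span>.setDestinationContext() is deprecated, use <span>.setServiceTarget()',
      'DeprecationWarning',
      'EXAMPLE_DEP001' // hypothetical code, not the agent's
    )
    this._setDestinationContext(destCtx)
  }
}
```

Warnings emitted this way can be observed with `process.on('warning', ...)`, and Node's `--no-deprecation` flag silences warnings of type DeprecationWarning.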

As part of this change, improvements have been made to some module instrumentations:

- redis and ioredis: span.type has changed from "cache" to "db" per spec (https://github.com/elastic/apm/blob/main/specs/agents/tracing-instrumentation-db.md#redis)
- mongodb: span.action used to be "query", now it will be the mongodb command name, e.g. "find", "insert".
- mongodb and mongodb-core: span.db.instance is now set to the database name (mongo needs span.db.instance support #1494)
- mysql and mysql2: span.db.{instance,user} are now populated.
- @elastic/elasticsearch: The cluster name is heuristically determined for Elastic Cloud deployments and used for db.instance.
- sqs: span.destination.{address,port} are now populated.
- pg: span.db.{instance,user} are now populated.
- cassandra-driver: the Cassandra keyspace is captured for service target data, if available.
- OpenTelemetry Bridge: OTel spans with kind PRODUCER and CLIENT are now handled as exit spans (e.g. span compression could apply).
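For the OpenTelemetry Bridge item, the kind-to-exit-span mapping amounts to a check like the one below. This is a sketch, not the bridge's code: `isExitSpanKind` is a hypothetical name, and the enum is redeclared locally (mirroring the numeric values of `SpanKind` in @opentelemetry/api) so the snippet has no dependencies.

```javascript
// Local copy of the OTel SpanKind values, so this sketch is self-contained.
const SpanKind = Object.freeze({
  INTERNAL: 0,
  SERVER: 1,
  CLIENT: 2,
  PRODUCER: 3,
  CONSUMER: 4
})

// CLIENT and PRODUCER spans represent outgoing calls, so treat them as exit spans.
function isExitSpanKind (kind) {
  return kind === SpanKind.CLIENT || kind === SpanKind.PRODUCER
}
```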

Closes: #2621
Closes: #2822
Closes: #1494
Closes: #1897
Closes: #2103
Obsoletes: #2458

Notes for reviewers

This changes a lot of files. The following might help break that down to reasonable chunks:

- The main part of the change is the general algorithm to set span.context.service.target.* in Span.end(), and then the algorithm to infer span.context.destination.service.resource from the service.target value, also in Span.end(). This is all in "lib/instrumentation/span.js".
- Those two algorithms are being clarified in this spec PR: "Backend granularity (service.target.*) clarifications" (apm#674). Mostly, that PR is putting pseudo-code in the specs to match what is implemented in the Java agent.
- Changes to specific instrumentations are listed above, but most of the changes are:
  - Because destination.service.resource is inferred automatically, instrumentations for all exit spans changed to no longer specify destination.service in the call to span.setDestinationContext().
  - Also, the algo to set service.target.* uses context.db.instance if available, so many of the instrumentations were updated to determine db.instance according to spec.
- "lib/instrumentation/span-compression.js" and "lib/instrumentation/dropped-span-stats.js" were updated per spec to use service.target.* fields.
- The tests. Most test file updates are just updating service.target.* and destination.service.* expectations. "Interesting" test files are:
  - "elasticsearch.test.js" tests the handling of the 'x-found-handling-cluster' header for our partial implementation of https://github.com/elastic/apm/blob/main/specs/agents/tracing-instrumentation-db.md#cluster-name
  - "test/instrumentation/modules/cassandra-driver/_utils.js" improves the Cassandra test utils to separate "keyspace" and "table" for cleaner testing of those. This is more important now that keyspace is used for db.instance if available.
  - "test/opentelemetry-bridge/otel-bridge-feature.test.js" is a painstaking manual update to the latest "otel_bridge.feature" gherkin spec from apm.git that we cannot use directly.
  - "test/service-resource-inference.test.js" is new and adds testing of the shared "json-specs/service_resource_inference.json" from apm.git.
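The 'x-found-handling-cluster' handling mentioned for "elasticsearch.test.js" can be sketched as below. The function name is hypothetical and this is not the instrumentation's code. It only shows reading the header defensively, including the case where the client provides no headers object at all.

```javascript
// Hypothetical sketch: derive a cluster name from Elasticsearch response headers.
// Returns null when the header (or the headers object itself) is missing.
function clusterNameFromHeaders (headers) {
  if (!headers) {
    return null // some client versions may report headers as null
  }
  const value = headers['x-found-handling-cluster']
  if (Array.isArray(value)) {
    return value[0] || null // repeated header: take the first value
  }
  return typeof value === 'string' && value.length > 0 ? value : null
}
```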

This adds support for the new "span.context.service.target.*" fields, and updates handling of the now deprecated "span.context.destination.service.*" fields to be calculated automatically.

Closes: #2621

----

Changes in this first commit:

- Add the initial algorithm for setting `span.context.service.target.*` for exit spans, and for inferring `span.context.destination.service.*` as well. Still discussing some details/clarifications at elastic/apm#674
- Add a public `Span#setServiceTarget(type, name)`. (Still need to add docs and index.d.ts entry.)
- Deprecate `Span#setDestinationContext()` and add a new `_setDestinationContext()` for internal usage. The new one should no longer receive `.service.*` because that's handled automatically in Span#end(). Still keeping the old one, even though it was always internal, because a user *could* have been using it, and I *could* have aided in that usage (https://gist.github.com/trentm/8c8fcecbec1a99ce0fbb415ef87ae2db) in answering a support question.
- Convert redis over to using the new `_setDestinationContext()`, as the first guinea pig.
- *Also* update redis instrumentation values to match spec (https://github.com/elastic/apm/blob/main/specs/agents/tracing-instrumentation-db.md#redis):
  - span.type: "db", was "cache"
  - span.action: "query", was null
  - add span.context.db

apmmachine · 2022-08-18T01:14:59Z

💚 Build Succeeded

Build stats

Start Time: 2022-09-14T18:47:42.993+0000
Duration: 28 min 32 sec

Test stats 🧪

Test	Results
Failed	0
Passed	270348
Skipped	0
Total	270348

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
run module tests for <modules> : Run TAV tests for one or more modules, where <modules> can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules
run benchmark tests : Run the benchmark test only.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

trentm · 2022-09-09T22:00:16Z

run module tests for @elastic/elasticsearch

…method

Ideally node's warning system would offer (and even default to) showing a given DeprecationWarning with a given code just *once*, to reduce clutter. It would also ideally let users easily filter out these warnings in code. Currently they are limited to using the 'warning' event *and* then need to use `node --no-warnings`. We can always revert this if it is too noisy, and make it a log.warn, say.
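The user-side filtering described here could look like the following sketch. The code `EXAMPLE_DEP001` is a placeholder, not a code the agent is known to emit. The listener only controls what this process logs itself. Node's default printing still has to be disabled separately with `node --no-warnings`.

```javascript
// Codes of warnings this application chooses not to report (placeholder value).
const IGNORED_WARNING_CODES = new Set(['EXAMPLE_DEP001'])

// Decide whether a warning object should be reported by our own listener.
function shouldReport (warning) {
  return !(warning && IGNORED_WARNING_CODES.has(warning.code))
}

// The 'warning' event fires for every process warning, even with --no-warnings.
process.on('warning', (warning) => {
  if (shouldReport(warning)) {
    console.warn(`${warning.name}: ${warning.message}`)
  }
})
```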

trentm · 2022-09-10T16:59:13Z

run module tests for ALL

trentm · 2022-09-11T20:56:02Z

run module tests for ALL

That didn't take that I could see. I've manually started https://apm-ci.elastic.co/job/apm-agent-nodejs/job/apm-agent-nodejs-mbp/job/PR-2882/30/ to do a full TAV run.

trentm · 2022-09-12T18:15:45Z

lib/constants.js

@@ -9,6 +9,9 @@
 */

 module.exports = {
+  // The default span or transaction `type`.
+  DEFAULT_SPAN_TYPE: 'custom',


REVIEW NOTE: The spec says the default span.type should be 'custom'. We had been doing that in Span serialization. However, when inferring span.context.service.target.type in Span.end(), the algorithm needs that span.type value. One way was to have a shared constant -- which I've done here. Another way would be to actually set span.type = 'custom' as a default.

astorm

This is pretty unwieldy and large. I don't think I can digest it for a full top-to-bottom review in a reasonable amount of time. I've taken a few passes and left some notes. The changes to both a large number of implementation files and a large number of tests at the same time make me a little uncertain of how to ensure we're not creating some small side effects that are going to cause behavior changes for users.

That said -- from a stability point of view the changes appear solid. The only place where I see potential for new null references is the changes to getDBDestination (where we can now return a null value) -- but our own use of this function is to set a span's _destination value, and we appear to have appropriate guard clauses on _destination to ensure no null references. I don't think these changes will cause any new crashes and should be OK to release as is, so I'm approving.

astorm · 2022-09-13T20:55:47Z

lib/instrumentation/span.js

+Span.prototype._setDestinationContext = function (destCtx) {
+  this._destination = Object.assign(this._destination || {}, destCtx)
+}


This is a longer term discussion that goes beyond this PR, but we don't seem to be super consistent on the semantic meaning of method names with a leading underscore.

This usually means the method is meant to be analogous to a protected or private method in a language with access modifiers.

However, we're calling _setDestinationContext as a public method in the instrumentation modules so in this instance we seem to be saying that _setDestinationContext is a public method but just not a public method that's part of our API.

If we're going to adopt the leading underscore as a convention, it would be good to get on the same page as to what it means and, more generally, how we think about enforcing API boundaries between our own modules.

I responded a little bit below (#2882 (comment)) as well.

Mostly this use of leading underscore to mean "internal to the agent" seemed in line with current common uses of the following in the instrumentation modules:

- agent._instrumentation
- agent._conf
- span._setOutcomeFromHttpStatusCode
- span._setOutcomeFromErrorCapture

However, documenting naming conventions would be good.
I'll bring up a separate discussion on conventions for this.

astorm · 2022-09-13T20:56:21Z

lib/instrumentation/span.js

+      }
+    }
+  }
+


Dropping 70+ lines into an already busy method seems like -- a lot. This new logic might be better off in its own method or private module helper function.

Good call. I'll move those out to a separate function or method.

astorm · 2022-09-13T20:56:49Z

lib/instrumentation/span.js

+ *    Internal APM agent code should use `_setDestinationContext()`.
+ */
+Span.prototype.setDestinationContext = function (destCtx) {
+  process.emitWarning(


I don't know how necessary this warning is since setDestinationContext was never documented as a part of our public API. This is another example of us not being clear about what is and isn't a public API.

By emitting this warning we seem to be saying this was a part of our public API.

However, by not documenting it we were saying it's not part of our public API.

By not having a CHANGELOG entry for this change we seem to be saying it was never a part of our public API.

If we want to start emitting these sorts of warnings when we deprecate something I don't object -- I'm just ringing the "if we're going to be this fiddly we should come up with some rules" bell, rather than making ad hoc decisions about how we're doing things on a PR-by-PR basis.

I don't know how necessary this warning is

My goal in adding this warning is to attempt to help the accidental user of this never-public method to (a) notice and (b) have time to migrate to the new, public .setServiceTarget().

While these <span>.set<Something>Context() methods have never been part of our public API, they sure look appealing for those attempting to do manual instrumentation:

- Those users might not pay attention to the TypeScript-y index.d.ts,
- those users might not rely solely on the API docs for what APIs they can use,
- there aren't any public APIs to use to set some of these context attributes on spans to access features like the "Service Map" and "Dependencies" in APM UI, and
- that these methods are not prefixed with an underscore hints that perhaps they are fair game to use.

On that last point, perhaps I am showing my Python heritage in using a leading underscore as a hint that an attribute is non-public. Though that leading-underscore practice is used frequently in Node.js-core code as well.

Also, I feel some duty to the customer to whom I offered this code to use for manual instrumentation of Oracle (which we don't support yet): https://gist.github.com/trentm/8c8fcecbec1a99ce0fbb415ef87ae2db
That code uses .setDbContext() and .setDestinationContext().

By not having a CHANGELOG entry for this change we seem to be saying it was never a part of our public API.

I could certainly add a message to the changelog about this. I'll do that. I had been intending to add the following to the commit message (which is already in the PR description) -- but in the changelog is good too:

The not-public-but-available-because-JavaScript `span.setDestinationContext` has been deprecated (using it will `process.emitWarning()`) and replaced with an internal `span._setDestinationContext()`.

By emitting this warning we seem to be saying this was a part of our public API.

I'm changing the warning message to the following to attempt to alleviate that concern slightly:

'<span>.setDestinationContext() was never a public API and will be removed, use <span>.setServiceTarget().',

"if we're going to be this fiddly we should come up with some rules" rather than making adhoc decisions

Sounds good. I'll bring up a discussion separately. Perhaps it can lead to developer guide content in DEVELOPMENT.md


trentm self-assigned this Aug 17, 2022

github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Aug 17, 2022

trentm added 4 commits August 17, 2022 16:07

implement dropped_span_stats updates for service.target.*; fix some tests

5664a3a

updates to span-compression handling to use service.target.*

6d20938

test fixes; drop usage of deprecated and unnecessary span.setDestinationContext in tests

3cc57e3

fix 'make check'

2bf427b

trentm added 22 commits August 18, 2022 08:44

ioredis instr updated

40b8979

dynamodb

e8113b8

s3 (module elastic/apm#674 Q7 discussion to be had)

139f9de

mongodb, mongodb-core

7cac52b

mysql

bbfb0c1

mysql2

d39a669

Merge branch 'main' into trentm/backend-granularity-sql

9cde43b

elasticsearch, @elastic/elasticsearch instrumentations

f5a3afb

http2, http instrumentation

ce260d2

fix this test for http changes for context.destination

63651b3

memcached

85ff6a3

tedious instrumentation

b08a7fa

sns instrumentation

421b011

sqs instr

66a9359

postgresql instr (also add context.db.user)

91ed6c6

fix no URL in globals in node 8

b4b3402

otel-bridge instrumentation updates

c0b42ce

undici instr updates

f3755bd

cassandra updates

e944eff

no longer need to explicitly set dest context: it'll be automatically set for exit spans

bb4547e

skip this test when contextManager=patch because ES instr has limitations that impact getting http response headers with this config

e952bd8

add testing of json-specs/service-resource-inference.test.js from apm.git

1c2931e

trentm mentioned this pull request Sep 9, 2022

add db.statement to 'mongodb' instrumentation #2916

Open

trentm added 3 commits September 9, 2022 12:22

clear out XXXs from mongodb instr

de8ce80

some earlier @elastic/elasticsearch@7 versions (e.g. 7.11) have diagResult.headers=null, cope with that

ee618f1

update tests for having dropped the 'span' arg to getDBDestination

d41aad2

trentm added 5 commits September 9, 2022 15:07

allow custom internal override for destination.service.resource and use it for the 's3' special case of no 's3/' prefix

a69cc97

picking away at XXXs; also fix 'make check'

4ced941

fix indentation broken by eslint-in-VSCode broken fixing

a10bccf

changelog entry; API docs and types for span.setServiceTarget()

a5e6ab0

trentm added 2 commits September 10, 2022 15:33

Merge branch 'main' into trentm/backend-granularity-sql

e6843b8

fixes to changelog after merge from main

6cd981c

trentm changed the title ~~feat: improve backend granularity for SQL-y databases~~ feat: service.target.* to improve backend granularity Sep 12, 2022

trentm added 2 commits September 12, 2022 09:30

fix for redis@2 instrumentation (missed updating this part)

94753e8

Merge branch 'main' into trentm/backend-granularity-sql

46d3975

trentm marked this pull request as ready for review September 12, 2022 18:36

trentm requested a review from astorm September 12, 2022 18:36

trentm commented Sep 12, 2022

astorm approved these changes Sep 13, 2022

changes from review feedback

593d109

trentm mentioned this pull request Sep 14, 2022

convention for marking internal API methods/attributes #2926

Closed

trentm merged commit 6826d9b into main Sep 14, 2022

trentm deleted the trentm/backend-granularity-sql branch September 14, 2022 20:59

trentm mentioned this pull request Sep 14, 2022

feat: exit spans, infer destination.service.resource #2458

Closed

10 tasks

trentm mentioned this pull request Jan 31, 2023

add span.db.link to postgres spans? #1495

Open

trentm mentioned this pull request Feb 27, 2023

Populate span.db.link/span.db.instance #1482

Closed

trentm mentioned this pull request Apr 27, 2023

[META 709] Add/simplify ES cluster name capture #3002

Closed
