runtime metrics config gets lost when writer is rebuilt

I've noticed a small runtime metrics issue. I am including a simple reproduction below and also a draft PR with a patch for this, just to demonstrate one way to resolve it.

The issue: In the `Datadog.configure` block, when configuring `runtime_metrics`, if a call to configure the tracer host or port (`c.tracer port: ...`) occurs _after_ the call to `c.runtime_metrics`, any runtime metrics configuration will get lost and the trace writer will use the default statsd client and configuration. What it appears is happening is that all that runtime metrics configuration is getting attached to the `writer` itself but, for reasons i'm not entirely clear on but appear to be pretty complex and an active area of development right now, that writer gets dropped and a new one gets built/rebuilt when `c.tracer` is called with certain configuration options.

There's a few ways users can be impacted by this, because it's not documented anywhere that the ordering of `c.runtime_metrics` within the `configure` block matters. One way is that if a user passes in their own statsd client with their own namespace, that namespace will get lost when the writer is dropped from `c.tracer` kicking off a rebuild.  Another is that if a user is passing in their own statsd client `c.runtime_metrics statsd: <statsd_client_used_elsewhere_in_app>` , then if that statsd client needs to point to a port different than 8125 or non-local host, then that config will get lost and runtime metrics as a feature won't work. Or, it'll technically work just fine, but just fire a bunch of udp packets off into space to a host/port that isn't listening for them. Needless to say since it's UDP it was quite difficult to debug this behavior.

In the PR I'm making i've included an, admittedly, pretty hacky patch. It just shoves the existing runtime metrics config into the `writer_options` when rebuilding the writer, if that writer's options don't contain any runtime metrics specific options already. I think it works fine though it could use a unit test, but I also realize that y'all are already working on areas that touch this, so didn't want to go too far down the rabbit hole here if this will just sort've solve itself from the internal refactoring/simplification of the trace writer. It's also possible there's a cleaner way to patch this, what I attached was just the easiest thing I could get working.

An alternative (or short term remediation step) to this would be to simply document the constraints of runtime_metrics configuration at present, which is `c.runtime_metrics` needs to be called last in the `Datadog.configure` block, but ideally the ordering in the configure block shouldn't really matter and it would appear this isn't intentional.

I've included a little sample reproduction minitest suite here: https://gist.github.com/ericmustin/1d1c228053078923e7f9b19c0add8ce7

Draft PR here: https://github.com/DataDog/dd-trace-rb/pull/968

Sorry for the opposite of succinct writing, thanks for taking the time, cheers!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime metrics config gets lost when writer is rebuilt #967

ericmustin
openedon Mar 8, 2020

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

runtime metrics config gets lost when writer is rebuilt #967

Description

ericmustinopenedon Mar 8, 2020

Metadata