Closed
Description
Forked from issue #95 so we can track it separately:
@watson wrote:
Problem: Can we standardise the way tracing information is sent from an instrumented module (called module-x below) to an APM agent?
Suggestions:
- Maybe we can use the V8 trace events API
  - Pros: No need for module-x to know about the APM agents
  - Cons:
    - Currently no way to bind the trace event to the current context
    - Might be expensive to cross the barrier into C-land
    - Might not be detailed enough
    - Overhead of running in production?
- Alternatively invent a new API in JavaScript land
  - This could either be in Node core or a userland module
  - If it’s a userland module, then module-x needs to detect if it’s present before it can send events to it. The APM agents need to do the same.
  - What if there’s multiple versions of this module installed?
- Research if other languages have something similar
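To make the userland-module questions above more concrete, here is a minimal sketch of one possible approach, not a proposal for the actual API. It assumes a process-wide registry keyed by `Symbol.for`, so module-x can detect the sink without requiring it, and so multiple installed copies of the module share one sink. The key `hypothetical.apm.event-sink` and the functions `getOrCreateSink` and `emitIfPresent` are made up for illustration.

```javascript
'use strict';

// Hypothetical registry key. `Symbol.for` returns the same symbol in every
// copy of the module, no matter how many versions are installed.
const SINK_KEY = Symbol.for('hypothetical.apm.event-sink');

// Called by the (hypothetical) userland module when it loads.
// The first copy to load creates the sink; later copies reuse it.
function getOrCreateSink() {
  if (!globalThis[SINK_KEY]) {
    const listeners = [];
    globalThis[SINK_KEY] = {
      version: 1, // lets newer copies detect an older sink
      subscribe(fn) { listeners.push(fn); },
      publish(event) { for (const fn of listeners) fn(event); },
    };
  }
  return globalThis[SINK_KEY];
}

// Called by module-x. It never requires the userland module directly;
// it only checks whether a sink has been registered.
function emitIfPresent(event) {
  const sink = globalThis[SINK_KEY];
  if (sink) sink.publish(event);
  return Boolean(sink);
}
```

With this shape, an APM agent would call `getOrCreateSink().subscribe(...)`, and module-x would call `emitIfPresent(...)`, which is a no-op when no agent is loaded. Whether a global registry is acceptable is itself an open design question.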
To frame the problem a bit more concisely: APM vendors currently have to monkey-patch libraries to produce diagnostic data. For example, to get specific details of a DB query, a vendor would monkey-patch the DB driver to capture any necessary params and stats.
This is problematic because:
- monkey-patching is brittle (it necessarily relies on internal implementation details),
- each APM vendor has to effectively do the same thing,
- multiple APM vendors loaded into the same process can trample on each other,
- ES modules will require a custom loader to support monkey-patching.
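For readers less familiar with the technique, here is a rough sketch of what such a patch tends to look like. The `driver` object is a hypothetical stand-in, not any real DB driver, and `patchQuery` is an illustrative helper. Note how the wrapper hard-codes the method name and the `(sql, params, callback)` signature, which is exactly the kind of implementation detail that makes this approach brittle.

```javascript
'use strict';

// Hypothetical stand-in for a DB driver. It calls back synchronously to keep
// the sketch short; real drivers are asynchronous and differ in signature.
const driver = {
  query(sql, params, callback) {
    callback(null, [{ id: 1 }]);
  },
};

// Replace target.query with a wrapper that records SQL, params, and duration.
// Returns a function that restores the original method.
function patchQuery(target, record) {
  const original = target.query;
  target.query = function patchedQuery(sql, params, callback) {
    const start = Date.now();
    return original.call(this, sql, params, function onResult(err, rows) {
      record({ sql, params, durationMs: Date.now() - start, failed: Boolean(err) });
      return callback.call(this, err, rows);
    });
  };
  return function unpatch() { target.query = original; };
}
```

If the driver renames `query`, adds a promise-based variant, or a second agent wraps the same method, this wrapper silently breaks or double-reports.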
We would like to get to a solution that has the following characteristics:
- de-facto standard APIs for producing & consuming messages (i.e., major APM vendors should buy into this).
- ability to "phase-in" use of the APIs by monkey-patching at first, and then pushing library maintainers to write to the "event sink" internally, thus eliminating the need for monkey-patching.
- an API that is backward compatible (ideally down to Node v4).
- ability to turn off event publication and then have "near-zero" overhead in any libraries that use the publication APIs.
- a solution that accounts for asynchronous context. That is, data events necessarily need to be correlated with asynchronous context. That said, we shouldn't conflate the problem of "monkey-patching for data events" (what we're trying to solve here) with "monkey-patching to track async continuations" (addressed by async-hooks).
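As a strawman for the characteristics above, here is a small sketch of what an "event sink" could look like. Everything in it is hypothetical (`createSink`, `publish`, `setContextProvider`, the event shape), and it does not claim compatibility with any particular Node version. It illustrates two of the points: the payload is passed as a function so it is only built when someone is subscribed, and async context comes from a pluggable provider rather than being tracked by the sink itself.

```javascript
'use strict';

// Hypothetical event sink; all names are illustrative, not an existing API.
function createSink() {
  let subscribers = [];
  // The consumer may plug in its own async-context lookup here.
  let getContext = () => undefined;

  return {
    // Cheap check libraries can use to skip work when nobody is listening.
    get enabled() { return subscribers.length > 0; },
    setContextProvider(fn) { getContext = fn; },
    // Returns a function that removes the subscriber again.
    subscribe(fn) {
      subscribers = subscribers.concat(fn);
      return () => { subscribers = subscribers.filter((s) => s !== fn); };
    },
    // `makePayload` is a function so the payload is only built when enabled.
    publish(name, makePayload) {
      if (subscribers.length === 0) return false;
      const event = { name, context: getContext(), payload: makePayload() };
      for (const fn of subscribers) fn(event);
      return true;
    },
  };
}
```

A library would call something like `sink.publish('db.query', () => ({ sql, params }))`. With no subscribers, the cost is one length check and one closure allocation, and the closure can be avoided too by checking `sink.enabled` first.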