
feat(instrumentation-pino): add log sending to Logs Bridge API #2249

Merged
merged 26 commits into open-telemetry:main
Jun 27, 2024

Conversation

trentm
Contributor

@trentm trentm commented May 30, 2024

This adds "log sending" -- sending log records to the Logs SDK -- for the pino logger.
(The same was done for Bunyan in #1713 and Winston in #1837.)

  • When a Pino logger is created, this wraps the logger instance's output stream with a pino.multistream that sends to the original stream and to the Logs SDK (class OTelPinoStream).
  • This adds disableLogSending and disableLogCorrelation options, the same as in instr-bunyan and instr-winston.

Limitations:

  • This does not yet add support for the logSeverity option. I prefer to do that in a separate change, because it might be involved with Pino loggers configured with custom levels.
  • This does not yet test log sending in the ESM tests. I'll add that separately, because it requires some updates to the contrib-test-utils package.

reviewer notes

The test file changes on this PR assume that #2247 gets merged first. Please review that PR first. After that is merged, then the test file changes here will be additive and more clear.
Update (2024-06-10): This is done now, so this PR is now ready for review.


trentm added 13 commits May 27, 2024 10:21
…hange behaviour for that logger

This removes functionality that was there before, so it could technically be a breaking change.
The motivation is to make the contract of the Pino instrumentation cleaner.
With this change, a Pino Logger instance will not be touched if
the PinoInstrumentation is disabled, i.e. we are as hands-off as
possible when disabled.

Before this change, even when disabled, the instrumentation would tweak
the pino Logger config to have a no-op mixin. If the instrumentation was
later enabled, then the mixin would become active (adding trace_id et al
fields in a span context).

The coming "log sending" to the Logs Bridge API will *not* support
this "work if instrumentation is re-enabled later", so I think it is
clearer if neither "log sending" nor "log correlation" support this.

We can back this out if we think it is important to support a possible
future feature of the SDK doing live enabling/disabling of
individual instrumentations.
…og-correlation *will* follow the live instr enable/disable state

codecov bot commented May 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.42%. Comparing base (dfb2dff) to head (1bb7260).
Report is 176 commits behind head on main.

Current head 1bb7260 differs from pull request most recent head 632e5c9

Please upload reports for the commit 632e5c9 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2249      +/-   ##
==========================================
- Coverage   90.97%   90.42%   -0.56%     
==========================================
  Files         146      148       +2     
  Lines        7492     7322     -170     
  Branches     1502     1518      +16     
==========================================
- Hits         6816     6621     -195     
- Misses        676      701      +25     
Files Coverage Δ
...emetry-instrumentation-pino/src/instrumentation.ts 100.00% <100.00%> (+4.00%) ⬆️
...etry-instrumentation-pino/src/log-sending-utils.ts 100.00% <100.00%> (ø)

... and 56 files with indirect coverage changes

trentm added 4 commits May 30, 2024 15:47
…the first half-second since process start

Effectively the issue is that this sometimes returns true:
node -e 'console.log(Math.round(Date.now() / 1e3) * 1e3 < performance.timeOrigin)'
@trentm trentm marked this pull request as ready for review May 31, 2024 00:12
@trentm trentm requested a review from a team May 31, 2024 00:12
import { logs, Logger, SeverityNumber } from '@opentelemetry/api-logs';
import { PACKAGE_NAME, PACKAGE_VERSION } from './version';
import { millisToHrTime } from '@opentelemetry/core';

Contributor Author


Reviewer note: From here down to function severityNumberFromPinoLevel is basically identical to the equivalent level/severity mapping in instrumentation-bunyan, because Bunyan and Pino share the same level names and values.
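For reference, Pino's default numeric levels (trace=10, debug=20, info=30, warn=40, error=50, fatal=60 — the same values Bunyan uses) map onto OTel severity numbers along these lines. This is a sketch of the mapping idea, not the exact code in this PR; the real implementation imports SeverityNumber from @opentelemetry/api-logs rather than hardcoding the constants:

```javascript
// Sketch: map Pino's default numeric levels to OTel SeverityNumber values.
// The numeric constants mirror SeverityNumber in @opentelemetry/api-logs.
const OTEL_SEV = { TRACE: 1, DEBUG: 5, INFO: 9, WARN: 13, ERROR: 17, FATAL: 21 };

function severityNumberFromPinoLevel(level) {
  if (level < 20) return OTEL_SEV.TRACE; // pino trace = 10
  if (level < 30) return OTEL_SEV.DEBUG; // pino debug = 20
  if (level < 40) return OTEL_SEV.INFO;  // pino info  = 30
  if (level < 50) return OTEL_SEV.WARN;  // pino warn  = 40
  if (level < 60) return OTEL_SEV.ERROR; // pino error = 50
  return OTEL_SEV.FATAL;                 // pino fatal = 60 and above
}
```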


## Semantic Conventions

This package does not currently generate any attributes from semantic conventions.

## Alternative log sending with `pino-opentelemetry-transport`
Contributor


I wonder if the instrumentation would be capable of knowing whether pino-opentelemetry-transport is used when creating a new logger, and could disable sending in the main thread. Or maybe it could warn the user if this transport is detected and suggest changing the disableLogSending configuration if it applies.

Contributor Author

@trentm trentm Jun 13, 2024


First, I don't believe there is a way for instrumentation-pino to know if pino-opentelemetry-transport is being used. Here is an example script:

const otel = require('@opentelemetry/api');
const pino = require('pino');

const transport = pino.transport({ target: 'pino-opentelemetry-transport' });
const logger = pino({level: 'debug'}, transport);
console.log('XXX The logger stream is:', logger[pino.symbols.streamSym]);

logger.debug({foo: 'bar'}, 'hi');

const tracer = otel.trace.getTracer('example');
tracer.startActiveSpan('manual-span', span => {
  logger.info('in a span');
  span.end();
});

If instrumentation-pino is active, the only thing it has to look at is that logger[pino.symbols.streamSym] value -- the destination stream of the Pino Logger. This is the thing that the instrumentation calls origStream and wraps in a multistream.

That thing is a ThreadStream (from https://github.com/pinojs/thread-stream/blob/main/index.js):

XXX The logger stream is: <ref *1> ThreadStream {
  _events: [Object: null prototype] {
    message: [Function (anonymous)],
    ready: [Function: onReady],
    close: [Function (anonymous)]
  },
  _eventsCount: 3,
  _maxListeners: undefined,
  worker: Worker {
    _events: [Object: null prototype] {
      newListener: [Function (anonymous)],
      removeListener: [Function (anonymous)],
      message: [Function: onWorkerMessage],
      exit: [Function: onWorkerExit]
    },
    _eventsCount: 4,
    _maxListeners: undefined,
    performance: { eventLoopUtilization: [Function: bound eventLoopUtilization] },
    stream: FakeWeakRef { _value: [Circular *1] },
    [Symbol(kCapture)]: false,
...

That ThreadStream provides no way to know which target package or Node.js file is being run in the Worker thread. In other words, instr-pino can know that a Pino transport is being used (i.e. something is running in a separate thread), but has no idea whether that something is pino-opentelemetry-transport.

I think it would be overkill to instrument pino.transport() to tag any returned transport that uses pino-opentelemetry-transport just to be able to tell.


Second, I'm not sure that we'd want instr-pino to automatically do anything even if it did know.

  • Automatically disabling log-sending if pino-opentelemetry-transport (POT) was detected would potentially be surprising behaviour. I think to be explicit we'd then want disableLogSending to be a tri-state config: true, false, "true-if-POT-detected".
  • Automatically warning if POT was detected? I suppose this would be fair. Most users would never want both active on a given Pino logger. (Eventually perhaps we'd want the ability for users to silence particular warnings from the OTel SDK that they don't care about -- something like Python's warnings filter https://docs.python.org/3/library/warnings.html#the-warnings-filter)

These concerns are why I opted to just document this, so that users are at least made aware of the options (assuming they get as far as reading this README). Perhaps a future "Troubleshooting" guide would merit a section on "I'm using Pino and getting double logs. Why?!".

Contributor


Those are valid points. Thanks for taking the time to give your reasoning :)

@trentm
Contributor Author

trentm commented Jun 19, 2024

@hectorhdzg Would you be interested in reviewing this as well? I'm not sure if Pino is as important to some of the work you are doing as Winston.

@hectorhdzg
Member

@trentm changes look good. My only question is related to pino-opentelemetry-transport usage; sorry if this was discussed somewhere else. Why not use this package and patch the logger construction similarly to Winston? Is this related to supporting older Pino versions?

trentm and others added 2 commits June 24, 2024 14:23
Co-authored-by: Hector Hernandez <39923391+hectorhdzg@users.noreply.github.com>
@trentm
Contributor Author

trentm commented Jun 24, 2024

my only question is related to pino-opentelemetry-transport usage; sorry if this was discussed somewhere else. Why not use this package and patch the logger construction similarly to Winston? Is this related to supporting older Pino versions?

@hectorhdzg It hasn't been discussed elsewhere, so it is a good question.

The decision to not just suggest users use pino-opentelemetry-transport (POT) for log sending was not about having older pino version support. The log-sending support implemented here requires pino@7 or later. The current POT requires pino@8 or later, so I guess there is a difference in Pino versions supported. However, that wasn't a reason for my choice. (Pino 7 isn't being maintained anymore -- the last release was in 2022 -- so there isn't much value in supporting it with a new feature.)

why not using this package and patching the logger construction similar to Winston?

Please correct me if I'm missing something, but I see this as two separate questions.

instrumentation no longer patches the logger constructor args

First, why I moved away from patching the logger constructor to instead creating the Logger instance and then tweaking it.

I found that coping with all the possible input combinations to pino(...) and patching them to a form that does what we need was difficult. https://getpino.io/#/docs/api?id=destination describes a lot of possible inputs:

destination (Number | String | Object | DestinationStream | SonicBoomOpts | WritableStream)

This is the Pino internal code for normalizing the given pino() args: https://github.com/pinojs/pino/blob/a41b4d3c409b31529c8bcb4bf3cd4e016da8a8de/lib/tools.js#L290-L340
It has been a few weeks, but IIRC, because that buildSafeSonicBoom is internal, I'm not sure it is possible to accurately patch the input arguments without ever changing the semantics. As a minor example, the current instrumentation-pino throws on pino(2) -- which, granted, is an invalid pino() argument, but the instrumentation shouldn't break usage:

% cat sadpino.js
const pino = require('pino')
const log = pino(2)
log.info('hi')

% node sadpino.js
{"level":30,"time":1719270084687,"pid":83775,"hostname":"peach.local","msg":"hi"}

% node -r @opentelemetry/auto-instrumentations-node/register sadpino.js
OTEL_TRACES_EXPORTER is empty. Using default otlp exporter.
OpenTelemetry automatic instrumentation started successfully
/Users/trentm/tmp/asdf.20240624T155854/node_modules/@opentelemetry/instrumentation-pino/build/src/instrumentation.js:122
            options.mixin = this._getMixinFunction();
                          ^

TypeError: Cannot create property 'mixin' on number '2'
    at PinoInstrumentation._combineOptions (/Users/trentm/tmp/asdf.20240624T155854/node_modules/@opentelemetry/instrumentation-pino/build/src/instrumentation.js:122:27)
    at /Users/trentm/tmp/asdf.20240624T155854/node_modules/@opentelemetry/instrumentation-pino/build/src/instrumentation.js:55:47
    at Object.<anonymous> (/Users/trentm/tmp/asdf.20240624T155854/sadpino.js:2:13)
    at Module._compile (node:internal/modules/cjs/loader:1256:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
    at Module.load (node:internal/modules/cjs/loader:1119:32)
    at Module._load (node:internal/modules/cjs/loader:960:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:86:12)
    at node:internal/main/run_main_module:23:47

Node.js v18.18.2

After the pino logger is created, it has a well-defined shape that can be patched cleanly.
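The post-construction approach can be sketched as follows. This is a simplified illustration, not the PR's code: the real instrumentation accesses the stream via pino.symbols.streamSym and combines streams with pino.multistream, while this sketch uses a stand-in symbol and a plain wrapper object so it runs without the pino package installed:

```javascript
// Stand-in for pino.symbols.streamSym (the well-known symbol under which a
// created Pino logger exposes its destination stream).
const streamSym = Symbol('pino.streamSym');

// Sketch: instead of patching pino(...) constructor arguments, take the
// already-created logger (which has a well-defined shape) and replace its
// destination stream with one that writes to both the original stream and
// an OTel bridge stream. Real code uses
// pino.multistream([{ stream: origStream }, { stream: otelStream }]).
function patchLogger(logger, otelStream) {
  const origStream = logger[streamSym];
  logger[streamSym] = {
    write(chunk) {
      origStream.write(chunk);
      otelStream.write(chunk);
    },
  };
  return logger;
}
```

Because the patch happens after pino() has normalized its arguments, none of the Number/String/Object/DestinationStream input variants need to be handled, which is the point of the argument above.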

why not just use 'pino-opentelemetry-transport'?

This could either mean that we get instrumentation-pino to automatically use
'pino-opentelemetry-transport' (POT), or just document to users that they should
use POT for log-sending with Pino.

On the "for using pino-opentelemetry-transport" side:

  • It is maintained by the Pino community. In addition to being associated with Pino, it also would theoretically mean less maintenance for OTel JS. Both are positives.
  • The design of using a worker thread means that in most situations (environments that aren't constrained to a single CPU), there will be less JSON.parse and OTLP serialization CPU work on the main thread.

My arguments for implementing it without POT are:

  • Architecturally, POT differs a lot from our other instrumentations. It runs code in a separate worker thread; AFAIK no other OTel instrumentations do that. It depends on a number of packages (https://gist.github.com/trentm/95c1821bf84335eac0ba8c9a5dbf948d) and starts its own OTel SDK components. I think the dependencies alone rule it out as something that we could include by default in auto-instrumentations-node.
  • Also on the architectural difference: There is a lot more complexity with POT that I worry would lead to more support effort. It starts its own OTel SDK components in the separate thread. The current usage of node -r @opentelemetry/auto-instrumentations-node/register ... will also start an SDK in any worker threads, because -r ... preload modules are passed to worker threads unless the code starting the worker does something special. Granted this is an issue orthogonal to Pino transport usage, that we should discuss separately. (E.g. Should the auto-instrumentations-node/register entry point have a guard to only run in the main thread?)
  • A lesser complexity: For POT to include traceId/spanId in LogRecords, it relies on the trace_id / span_id fields to be on the log record. The implementation in this PR does not. The fairly-recently added logKeys option to instrumentation-pino allows users to change the log fields to myTraceId or whatever. This will break having span context fields in LogRecords in POT.
  • This is perhaps more my personal opinion: I'm not convinced that most users of a logger need the more complex design of sending on a separate thread to free up CPU cycles on the main thread.
  • Assuming that we cannot include POT in instrumentation-pino by default, then the current implementation (not using POT) offers the benefit of being something that can be enabled by default -- for "zero-code" instrumentation. Granted we could get instrumentation-pino to "do log-sending only if require('POT') works" -- but I think that is subtle behaviour that is best avoided if possible.

Downsides of adding log-sending support to instrumentation-pino:

  • It adds some potential confusion for users wanting to use Pino and OTel. They may need to now understand both options and choose between them, or ensure they work together.

Sorry, that got long.

Member

@hectorhdzg hectorhdzg left a comment


LGTM, @trentm I appreciate your detailed response on my latest question.

@pichlermarc pichlermarc merged commit 055ef41 into open-telemetry:main Jun 27, 2024
17 checks passed
@dyladan dyladan mentioned this pull request Jun 27, 2024
@trentm trentm deleted the tm-pino-logs-bridge branch June 28, 2024 22:38