
Metric ExportResult other than SUCCESS causes UnhandledPromiseRejectionWarning: undefined #1561

Closed
aabmass opened this issue Oct 1, 2020 · 15 comments

@aabmass
Member

aabmass commented Oct 1, 2020

I was seeing UnhandledPromiseRejectionWarning: undefined warnings when using a metric exporter. Running node with --trace-warnings gives a little more information, but unfortunately not a real trace:

(node:2394191) UnhandledPromiseRejectionWarning: undefined
    at emitWarning (internal/process/promises.js:99:15)
    at emitPromiseRejectionWarnings (internal/process/promises.js:143:7)
    at process._tickCallback (internal/process/next_tick.js:69:34)
(node:2394191) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:2394191) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
    at emitDeprecationWarning (internal/process/promises.js:111:13)
    at emitWarning (internal/process/promises.js:104:3)
    at emitPromiseRejectionWarnings (internal/process/promises.js:143:7)
    at process._tickCallback (internal/process/next_tick.js:69:34)

Eventually, I found the warning to be coming from exporters returning a status of ExportResult.FAILED_* (anything other than SUCCESS). PushController just reject()s the promise here:

return new Promise((resolve, reject) => {
  this._exporter.export(
    this._meter.getBatcher().checkPointSet(),
    result => {
      if (result === ExportResult.SUCCESS) {
        resolve();
      } else {
        // @todo log error
        reject();
      }
    }
  );
});

This code runs in a setInterval(), so the promise is never awaited. The fix is to address the @todo and log instead of calling reject().
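For reference, a minimal sketch of what I mean, with the @todo addressed (the this._logger reference is illustrative; whatever logging hook the SDK exposes would do):

return new Promise(resolve => {
  this._exporter.export(
    this._meter.getBatcher().checkPointSet(),
    result => {
      if (result !== ExportResult.SUCCESS) {
        // log the failed result instead of rejecting a promise nobody awaits
        this._logger.error(`metric export failed with ${ExportResult[result]}`);
      }
      resolve();
    }
  );
});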

@aabmass aabmass added the bug Something isn't working label Oct 1, 2020
@vmarchaud
Member

This will be fixed once we merge #1514

@aabmass
Member Author

aabmass commented Oct 1, 2020

After looking at the PR, I'm not sure it will fix this. Without actually adding a catch() for the promise above, the only way to surface the error is a handler for unhandled rejections.

The reject() should also get a message or exception; right now all you get is undefined, which isn't very helpful. Also, the linked code does not actually retry ExportResult.FAILED_RETRYABLE results.

@vmarchaud
Member

@aabmass Not in its current state, I agree, but this promise should be caught and the error sent to the global error handler (which the PR I mentioned implements).
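Roughly something like this at the call site (a sketch only; _collect() and _interval stand in for whatever names the controller actually uses around its interval callback):

// sketch: catch the rejection where collection is kicked off on the
// interval and forward it to the global error handler from #1514
this._timer = setInterval(() => {
  this._collect().catch(err => globalErrorHandler(err));
}, this._interval);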

@mwear
Member

mwear commented Oct 2, 2020

Hello. Author of #1514 here. I just wanted to check in to see what the best way to solve this issue is. As you point out, @aabmass, there isn't really a handle on an error to log in the push controller. It seems like the error handling needs to come a level further up, like in the export method of the given exporter. Out of curiosity, are you using the collector exporter, the Prometheus exporter, or something else?

@aabmass
Member Author

aabmass commented Oct 2, 2020

@mwear I'm maintaining the Google Cloud Monitoring exporter. This comes up here in the exporter.

We sometimes want to retry, so we call the callback with ExportResult.FAILED_RETRYABLE, although I see retrying isn't implemented yet in the SDK. The error logging could happen further up, but I guess the SDK would still have to handle retry logic or do something else in the case of ExportResult.FAILED_*.
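For context, the relevant part of our export method is shaped roughly like this (paraphrased, not the exact exporter code; _sendTimeSeries and isRetryable are illustrative names):

export(metrics: MetricRecord[], resultCallback: (result: ExportResult) => void): void {
  this._sendTimeSeries(metrics)
    .then(() => resultCallback(ExportResult.SUCCESS))
    .catch(err => {
      // retryable server-side errors report FAILED_RETRYABLE,
      // everything else FAILED_NOT_RETRYABLE
      resultCallback(
        isRetryable(err)
          ? ExportResult.FAILED_RETRYABLE
          : ExportResult.FAILED_NOT_RETRYABLE
      );
    });
}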

@mwear
Member

mwear commented Oct 5, 2020

Thanks for the information @aabmass. The reason I ask is that in #1514 the recommendation is to add a catch and use the global error handler in the PushController here:

return new Promise((resolve, reject) => {
  this._exporter.export(
    this._meter.getBatcher().checkPointSet(),
    result => {
      if (result === ExportResult.SUCCESS) {
        resolve();
      } else {
        // @todo log error
        reject();
      }
    }
  );
});

The code path in the PushController rejects the promise based on the result of the resultCallback passed to the export method.

The collector exporter uses the globalErrorHandler one level above, in the export method, before calling the resultCallback.

export(items: ExportItem[], resultCallback: (result: ExportResult) => void) {
  if (this._isShutdown) {
    resultCallback(ExportResult.FAILED_NOT_RETRYABLE);
    return;
  }
  this._export(items)
    .then(() => {
      resultCallback(ExportResult.SUCCESS);
    })
    .catch((error: ExportServiceError) => {
      globalErrorHandler(error);
      if (error.code && error.code < 500) {
        resultCallback(ExportResult.FAILED_NOT_RETRYABLE);
      } else {
        resultCallback(ExportResult.FAILED_RETRYABLE);
      }
    });
}

Since your problem is due to the export result passed to the callback and not an unhandled error, I don't think a catch will help in the PushController. I'm not sure how I can help in #1514, but I am open to suggestions if you have any.

@vmarchaud
Member

@mwear Isn't it a problem with the spec, then? In the case of FAILED_NOT_RETRYABLE, I think the global error handler should be used, which implies having the error in the resultCallback; the spec currently allows only a "result" that can be either a success or a failure.

Note: the spec declares that any retry logic should be handled by the exporter, so the FAILED_RETRYABLE result isn't spec-compliant currently.
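To be spec compliant, the retry would have to live inside the exporter itself, something like this (just a sketch; _exportWithRetry is an illustrative name):

export(items: MetricRecord[], resultCallback: (result: ExportResult) => void): void {
  // retry internally, only report success or a terminal failure to the SDK
  this._exportWithRetry(items, /* maxAttempts */ 3)
    .then(() => resultCallback(ExportResult.SUCCESS))
    .catch(err => {
      globalErrorHandler(err);
      resultCallback(ExportResult.FAILED_NOT_RETRYABLE);
    });
}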

@mwear
Member

mwear commented Oct 5, 2020

It looks like the Go implementation has what we ultimately want: the export result has both a code and a wrapped error: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#go-spanexporter-interface. I think this is a bit of a refactor and too much for #1514, but something we can consider.
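In TypeScript that could look something like this (a sketch of the shape only, not something #1514 implements):

// an export result that carries both a code and the underlying error,
// similar to what the Go SDK exposes
export enum ExportResultCode {
  SUCCESS,
  FAILED,
}

export interface ExportResult {
  code: ExportResultCode;
  error?: Error;
}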

@aabmass
Member Author

aabmass commented Oct 6, 2020

@mwear for this issue specifically I think adding a catch() in PushController and logging a static string would be a big improvement. It took me quite a while to track down where UnhandledPromiseRejectionWarning: undefined was coming from (even with --trace-warnings). I can add specific logging to the Google exporter until a wrapped error is possible.

I also noticed #987, which could be part of the refactor you mentioned, since the callback API is a little strange 🙂

@johanneswuerbach
Contributor

I'm struggling a bit to understand why this issue isn't prioritised. Rolling out @opentelemetry/tracing 0.12.0 caused a bunch of process exits for us due to unhandled promise rejections from this issue.

While we use Node 14, we already exit on unhandled promise rejections, as Node 15 does by default (nodejs/TSC#916), so an unhandled rejection causes our processes to exit. Additionally, the issue is challenging to pin down because the rejection value is a plain number, so the generated stack traces aren't really valuable.

Example trace:

Error: 1
  File "/app/node_modules/raven/lib/client.js", line 409, col 15, in Raven.captureException
    err = new Error(err);
  File "/app/node_modules/REMOVED/built/lib/instrument-process.js", line 13, col 30, in sentryId
    sentryClient.captureException(error, {
  ?, in new Promise
  File "/app/node_modules/REMOVED/built/lib/instrument-process.js", line 12, col 36, in null.<anonymous>
    const sentryId = await new Promise((resolve, reject) => {
  File "/app/node_modules/REMOVED/built/lib/instrument-process.js", line 67, col 16, in process.<anonymous>
    return exitFatal(err, 'unhandledRejection');
  File "events.js", line 315, col 20, in process.emit
  File "domain.js", line 486, col 12, in process.EventEmitter.emit
  File "internal/process/promises.js", line 245, col 33, in processPromiseRejections
  File "internal/process/task_queues.js", line 94, col 32, in processTicksAndRejections

What would a fix here look like? Adding .catch() to each invocation of this._flush() as done here: https://github.com/open-telemetry/opentelemetry-js/blob/master/packages/opentelemetry-tracing/src/export/BatchSpanProcessor.ts#L121? Or just removing the reject (https://github.com/open-telemetry/opentelemetry-js/blob/master/packages/opentelemetry-tracing/src/export/BatchSpanProcessor.ts#L110) and relying on the global error handler?
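Something like this for the first option, maybe (sketch only; the timer callback stands in for wherever _flush() currently runs un-awaited):

// add a catch wherever this._flush() is invoked without being awaited
// (the timer callback, onEnd when the buffer is full, ...)
this._flush().catch(err => {
  globalErrorHandler(err);
});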

We are using the Jaeger exporter, which already logs the error itself: https://github.com/open-telemetry/opentelemetry-js/blob/master/packages/opentelemetry-exporter-jaeger/src/jaeger.ts#L139-L141

@vmarchaud
Member

@johanneswuerbach Well, this should have been fixed with #1514; the metric controller doesn't reject anymore: https://github.com/open-telemetry/opentelemetry-js/blob/master/packages/opentelemetry-metrics/src/export/Controller.ts#L57. Are you sure you are using the newest metrics SDK version?

@johanneswuerbach
Contributor

Sorry, we are hitting this issue in the tracing BatchSpanProcessor https://github.com/open-telemetry/opentelemetry-js/blob/master/packages/opentelemetry-tracing/src/export/BatchSpanProcessor.ts#L110, not in the metrics controller.

I somehow mixed this up, as the error and code in question look pretty similar. Should I create a new issue?

@vmarchaud
Member

@johanneswuerbach Yeah, I think a new issue would be the way to go.

@aabmass Since the metric exporter doesn't throw anymore, are you okay with closing this issue? We can totally open another one for the refactor to include the error alongside the result, as @mwear suggested.

@johanneswuerbach
Contributor

johanneswuerbach commented Oct 22, 2020

Sorry for the noise; I created #1617 for the unhandled rejection issue when exporting traces.

@aabmass
Member Author

aabmass commented Oct 22, 2020

@vmarchaud #1514 looks like it would have solved my issue 👍

@aabmass aabmass closed this as completed Oct 22, 2020