Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(instrumentation-redis-4): fix multi.exec() instrumentation for redis >=4.6.12 #1904

Merged
merged 6 commits into from
Jan 22, 2024

Conversation

trentm
Copy link
Contributor

@trentm trentm commented Jan 16, 2024

In redis@4.6.12 the behaviour of multi.exec() changed to throw a
MultiErrorReply if any of the commands errored out. The individual
command replies are available at 'err.replies', instead of as the
promise result. This adjusts the instrumentation to generate
spans as before: only setting SpanStatusCode.ERROR and calling
span.recordException for the individual commands that failed.

Fixes: #1874

…dis >=4.6.12

In redis@4.6.12 the behaviour of multi.exec() changed to *throw* a
MultiErrorReply if any of the commands errored out. The individual
command replies are available at 'err.replies', instead of as the
promise result. This adjusts the instrumentation to generate
spans as before: only setting SpanStatusCode.ERROR and calling
span.recordException for the individual commands that failed.

Fixes: open-telemetry#1874
@trentm trentm self-assigned this Jan 16, 2024
@trentm trentm requested a review from a team January 16, 2024 22:33
@github-actions github-actions bot requested a review from blumamir January 16, 2024 22:33
@trentm
Copy link
Contributor Author

trentm commented Jan 16, 2024

Some details, in case it helps review. The "multi command with error" test is doing this:

await client.multi()
  .set('key', 'value')   // this should be OK
  .incr('key')             // this should result in an Error reply
  .exec();

Before redis@4.6.12, the generated spans would be something like:

{
  traceId: '009bd5be404e9b0a0f98b49e8ce79b86',
  parentId: undefined,
  traceState: undefined,
  name: 'redis-SET',
  id: '1a34ecc9c5e32c34',
  kind: 2,
  timestamp: 1705439176141000,
  duration: 3764.741,
  attributes: {
    'db.system': 'redis',
    'db.statement': 'SET key [1 other arguments]'
  },
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: 'ff93117e5b750b8da53f9c03b39a749b',
  parentId: undefined,
  traceState: undefined,
  name: 'redis-INCR',
  id: '22ba354362000f98',
  kind: 2,
  timestamp: 1705439176142000,
  duration: 3710.384,
  attributes: { 'db.system': 'redis', 'db.statement': 'INCR key' },
  status: { code: 2, message: 'ERR value is not an integer or out of range' },
  events: [
    {
      name: 'exception',
      attributes: {
        'exception.type': 'Error',
        'exception.message': 'ERR value is not an integer or out of range'
      },
      time: [ 1705439176, 145682693 ],
      droppedAttributesCount: 0
    }
  ],
  links: []
}

Note that the first span status is unset (i.e. it is okay). With redis@4.6.12 and before the changes in this PR, the generated spans would be:

{
  traceId: '819de161bcbd0936d67499d691abd1f6',
  parentId: undefined,
  traceState: undefined,
  name: 'redis-SET',
  id: '970df662a607df93',
  kind: 2,
  timestamp: 1705439639719000,
  duration: 2616.091,
  attributes: {
    'db.system': 'redis',
    'db.statement': 'SET key [1 other arguments]'
  },
  status: {
    code: 2,
    message: '1 commands failed, see .replies and .errorIndexes for more information'
  },
  events: [
    {
      name: 'exception',
      attributes: {
        'exception.type': 'Error',
        'exception.message': '1 commands failed, see .replies and .errorIndexes for more information'
      },
      time: [ 1705439639, 721568784 ],
      droppedAttributesCount: 0
    }
  ],
  links: []
}
{
  traceId: 'c762a1e106ca7723cc322a1bbf0072eb',
  parentId: undefined,
  traceState: undefined,
  name: 'redis-INCR',
  id: '67f43040500d934c',
  kind: 2,
  timestamp: 1705439639720000,
  duration: 2736.574,
  attributes: { 'db.system': 'redis', 'db.statement': 'INCR key' },
  status: {
    code: 2,
    message: '1 commands failed, see .replies and .errorIndexes for more information'
  },
  events: [
    {
      name: 'exception',
      attributes: {
        'exception.type': 'Error',
        'exception.message': '1 commands failed, see .replies and .errorIndexes for more information'
      },
      time: [ 1705439639, 722721946 ],
      droppedAttributesCount: 0
    }
  ],
  links: []
}

The MultiErrorReply thrown from multi.exec() would get applied to both spans, even though the first one was fine. A dumped MultiErrorReply looks like this:

[MultiErrorReply: 1 commands failed, see .replies and .errorIndexes for more information] {
  replies: [ 'OK', [ErrorReply: ERR value is not an integer or out of range] ],
  errorIndexes: [ 1 ]
}

Copy link

codecov bot commented Jan 16, 2024

Codecov Report

Merging #1904 (eaf6187) into main (f65f2f1) will increase coverage by 4.54%.
The diff coverage is n/a.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1904      +/-   ##
==========================================
+ Coverage   91.51%   96.06%   +4.54%     
==========================================
  Files         145       14     -131     
  Lines        7427      914    -6513     
  Branches     1486      199    -1287     
==========================================
- Hits         6797      878    -5919     
+ Misses        630       36     -594     

see 131 files with indirect coverage changes

@trentm
Copy link
Contributor Author

trentm commented Jan 16, 2024

will decrease coverage by 4.57%.

The decrease in coverage is because it doesn't consider the if-block in the regular test. That block is tested in TAV tests, but codecov doesn't include results from those runs, I gather.

Copy link
Member

@dyladan dyladan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a few nits but fine if you don't agree and would rather keep what you have

return redisRes;
})
.catch((err: Error) => {
const openSpans = this[OTEL_OPEN_SPANS];
if (!openSpans) {
return plugin._diag.error(
plugin._diag.error(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: One thing to note with this change here (and effectively as well with the move of the return plugin._diag.error(...) calls above into _endSpansWithRedisReplies) is that this execPatch will now no longer return with the return value of _diag.error() in this edge case. I think that possibility was a bug before, albeit a bug that was probably never hit. It would have changed the behaviour of the user's redisClient.multi()...exec().

@trentm trentm merged commit fce7d3b into open-telemetry:main Jan 22, 2024
17 checks passed
@trentm trentm deleted the tm-fix-redis-4-tav branch January 22, 2024 21:41
@dyladan dyladan mentioned this pull request Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

instrumentation-redis-4 tests fail with redis@4.6.12
4 participants