-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix duplicate retry log messages #1108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
JAVA-4910
? format("Retrying the operation due to the error \"%s\"; attempt #%d", exception, oneBasedAttempt) | ||
: format("Retrying the operation '%s' due to the error \"%s\"; attempt #%d", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt "Retrying the operation ..." is grammatically correct in a situation when we don't specify what operation is being retried, but having the same longer prefix for two messages seems better.
.attach(AttachmentKeys.commandDescriptionSupplier(), command::getFirstKey, false) | ||
.attach(AttachmentKeys.command(), command, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why the command description for writes was auto-cleared, but given that the command was not auto-cleared, auto-clearing the description did not make sense.
.attach(AttachmentKeys.commandDescriptionSupplier(), command::getFirstKey, false) | ||
.attach(AttachmentKeys.command(), command, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to #1108 (comment).
Are you saying that this is the result of the refactoring, or this it's the result of the required retry algorithm, regardless of the implementation strategy. I think you mean the former, but want to confirm before proceeding. |
@jyemin, I added examples and more explanations to the description of the PR. |
It would be more useful if the driver emits the retry log message with information about the command. Given that currently the second log message does have the command available, that at least seems like it's possible, even if it results in a more complicated implementation. What do you think? |
There are two issues with such an approach in addition to returning the sprinkled logging calls all over the place, which is why I did not choose this path:
The second item is on its own worse than what we have with the current PR, but combined with the first issue, it's way worse. If we want our cake and eat it too, we should get back to the situation when we emit two log messages per a retry attempt, but we change the wording in the messages to something like:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The code that logs retries is now added by the
decorate*WithRetries
methods. Sometimes, this may result in us not logging even the short description of the command that is being retried, because currently we know the command (we create it) only after checking out a connection and checking if the server supports retries. So if the first attempt fails at a point before we create the command (e.g., because of theMongoConnectionPoolClearedException
), we don't doretryState.attach(AttachmentKeys.commandDescriptionSupplier(), command::getFirstKey, false)
, and the retrying attempt does not have the information about the command that it retries. This situation will be improved when Log operation ID when retrying is done.Log messages examples
Before the PR
For each retry attempt we emitted two corresponding log messages. It looked either like
or like
Note how in the second example the first message does not provide a description (
aggregate
) of operation that was being retried. This is because when a retry attempt starts, it may have no information on what command is bing retried, if the first attempt failed too soon to create the command and leave its description for the retry attempt to use.The first of the two messages in each example is from the start of a retry attempt, the second message is from a point at which the retry attempt is guaranteed to know what is being retried (this point is somewhere in the middle of a retry attempt).
After the PR
For each retry attempt we emit one corresponding log message. It looks either like
or, if the information on the retrying command is not yet available
The message comes from the start of a retry attempt.
So now we don't duplicate retry messages, and don't log them from the middle of a retry attempt, but as a result, we may sometimes not log the description of the retried command. Logging operation ID will allow users to correlate this messages with previous log messages to still see what is being retried, which is why I filed Log operation ID when retrying. Alternatively, we may refactor our code to propagate the description of the retried command from the higher API level (the one that is closer to the user), but I don't think we should do that.
JAVA-4910