-
Notifications
You must be signed in to change notification settings - Fork 180
RUST-1069 Defer implicit session check out until after connection check out #811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUST-1069 Defer implicit session check out until after connection check out #811
Conversation
} | ||
} | ||
}; | ||
return Ok(details); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be returned directly in the Ok
-handling case on 377, but this way it's clear that the loop unconditionally terminates unless there's an explicit continue
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few questions. As I reviewed this I noticed some areas in this file that could use refactoring, but I don't think they're relevant to this PR so I filed RUST-1575 to address them in the future.
Ok(server) => server, | ||
Err(_) => { | ||
return Err(first_error); | ||
let retryability = self.get_retryability(&conn, &op, &session)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any chance this could fail during a retry? If so, I think we should return the original error rather than the one that occurs in this call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only error this can return is an internal one indicating an impossible/invalid connection state, which I think would be valuable to expose if it does happen :)
src/client/executor.rs
Outdated
return Err(err); | ||
// If the current transaction has been committed/aborted and it is not being | ||
// re-committed/re-aborted, reset the transaction's state to TransactionState::None. | ||
if let Some(ref mut session) = session { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be moved outside of the loop? I don't think these conditions would change on a retry: the session and operation should each stay the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, done.
if session.is_none() { | ||
implicit_session = self.start_implicit_session(&op).await?; | ||
session = implicit_session.as_mut(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC here an implicit session will be started only on the first attempt and reused in subsequent retries. Is this the right behavior? I'm not sure if we should be dropping/re-creating the implicit session after checking out a connection each time around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is correct - the goal of the upstream change was to minimize the number of sessions created serverside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the retryable writes spec actually requires that the retry attempt use the same session as the initial operation, so this is definitely right, and I assume there are some spec tests that would start failing if we didn't preserve this behavior.
per spec:
When retrying a write command, drivers MUST resend the command with the same transaction ID.
where "transaction ID" is defined as:
the top-level lsid and txnNumber fields.
and per the sessions spec:
The retryable writes spec requires that both the original and retry attempt use the same server session. The server will block the retry attempt until the initial attempt completes at which point the retry attempt will continue executing.
For retryable reads that use an implicit session, drivers could choose to use a new server session for the retry attempt however this would lose the information that these two reads are related.
let selection_criteria = session | ||
.as_ref() | ||
.and_then(|s| s.transaction.pinned_mongos()) | ||
.or_else(|| op.selection_criteria()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto above, does this need to be done in the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not - this (potentially) borrows from op
, so moving it outside the loop would block the call to op.update_for_retry()
(which requires a &mut).
not sure if we want another review on this -- I'll leave it up to you, Abraham, if you want to wait until Kaitlin is back |
No rush. and this is core enough that more eyes will be good :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one comment that I don't need to re-review for, LGTM otherwise.
max_lsids = std::cmp::max(max_lsids, unique.len()); | ||
lsids.clear(); | ||
} | ||
assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for future readers, I think it would be helpful to leave a comment here briefly summarizing why we used 2 rather than 1 as the prose test says
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, done.
if session.is_none() { | ||
implicit_session = self.start_implicit_session(&op).await?; | ||
session = implicit_session.as_mut(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the retryable writes spec actually requires that the retry attempt use the same session as the initial operation, so this is definitely right, and I assume there are some spec tests that would start failing if we didn't preserve this behavior.
per spec:
When retrying a write command, drivers MUST resend the command with the same transaction ID.
where "transaction ID" is defined as:
the top-level lsid and txnNumber fields.
and per the sessions spec:
The retryable writes spec requires that both the original and retry attempt use the same server session. The server will block the retry attempt until the initial attempt completes at which point the retry attempt will continue executing.
For retryable reads that use an implicit session, drivers could choose to use a new server session for the retry attempt however this would lose the information that these two reads are related.
RUST-1069
Previously, retries were handled by a distinct
execute_retry
method; this was an issue for this change because there are two retry cases (connection check out failure and operation failure), and a retry should create an implicit session in the first case but not the second. This PR changes it to have a loop insideexecute_operation_with_retry
; this makes the handling there a bit more complicated, but removes theexecute_retry
method entirely, and also collapsesExecutionOutput
intoExecutionDetails
, so I think it's a net wash on complexity. Having the loop will also make executing multiple retries easier if/when we get to CSOT.