JAVA-5950 Update Transactions Convenient API with exponential backoff on retries #1852
base: backpressure
Pull request overview
This PR implements exponential backoff with jitter for transaction retries in MongoDB's withTransaction convenience API. The implementation adds a configurable backoff mechanism that applies delays between retry attempts when transient transaction errors occur, following the MongoDB specification with a growth factor of 1.5 for transactions.
Key Changes
- Introduces an `ExponentialBackoff` utility class with factory methods for transaction retries (5ms base, 500ms max, 1.5x growth) and command retries (100ms base, 10s max, 2.0x growth)
- Integrates backoff logic into `ClientSessionImpl.withTransaction()` to delay between retry attempts
- Adjusts test configuration to verify backoff behavior with multiple retry attempts
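The transaction-retry numbers above can be made concrete with a small sketch. It is only an illustration of capped exponential backoff with multiplicative jitter, assuming the delay is `jitter * min(max, base * growth^(attempt - 1))` with a 1-based attempt number. The class `BackoffSketch` and its methods are hypothetical names and are not the driver's `ExponentialBackoff` API.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.DoubleSupplier;

/** Illustrative only: NOT the driver's ExponentialBackoff class. */
final class BackoffSketch {
    // Transaction-retry parameters quoted in the PR overview.
    static final double BASE_MS = 5;
    static final double MAX_MS = 500;
    static final double GROWTH = 1.5;

    private BackoffSketch() {
    }

    /** Largest possible delay for a 1-based attempt: min(MAX_MS, BASE_MS * GROWTH^(attempt - 1)). */
    static double maxBackoffMs(final int attemptNumber) {
        if (attemptNumber < 1) {
            throw new IllegalArgumentException("attemptNumber must be >= 1");
        }
        return Math.min(MAX_MS, BASE_MS * Math.pow(GROWTH, attemptNumber - 1));
    }

    /** Scales the cap by a jitter factor in [0, 1] (assumed formula). */
    static long backoffMs(final int attemptNumber, final DoubleSupplier jitter) {
        return Math.round(jitter.getAsDouble() * maxBackoffMs(attemptNumber));
    }

    /** Variant with jitter drawn uniformly from [0, 1). */
    static long backoffMs(final int attemptNumber) {
        return backoffMs(attemptNumber, () -> ThreadLocalRandom.current().nextDouble());
    }
}
```

Under these assumptions the cap grows as 5 ms, 7.5 ms, 11.25 ms, and so on, and is first limited to 500 ms on attempt 13. Passing the jitter as a supplier lets a test pin it to 0 (no delay) or 1 (full delay).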
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| driver-core/src/main/com/mongodb/internal/ExponentialBackoff.java | New utility class implementing exponential backoff with jitter using ThreadLocalRandom |
| driver-sync/src/main/com/mongodb/client/internal/ClientSessionImpl.java | Adds backoff delay before transaction retries and uses CSOT timeout when available |
| driver-core/src/test/unit/com/mongodb/internal/ExponentialBackoffTest.java | Comprehensive unit tests validating backoff calculations, growth factors, and maximum caps |
| driver-sync/src/test/functional/com/mongodb/client/WithTransactionProseTest.java | New functional test verifying exponential backoff behavior and adjusted existing test configuration |
    AtomicInteger retryCount = new AtomicInteger(0);

    session.withTransaction(() -> {
        retryCount.incrementAndGet(); // Count the attempt before the operation that might fail
Copilot
AI
Dec 9, 2025
The test verifies the retry count but does not validate that exponential backoff delays are actually applied. Consider measuring elapsed time and asserting minimum expected delays to ensure backoff is functioning correctly. For example, with 3 retries at delays of ~5ms, ~7.5ms, and ~11.25ms, the total elapsed time should be at least the sum of minimum expected delays.
ExponentialBackoffTest covers these unit tests already.
@nhachicha Please take note of mongodb/specifications#1868
stIncMale
left a comment
- I haven't reviewed `ExponentialBackoffTest`, because it depends on `ExponentialBackoff`, where I left many suggestions.
- I haven't reviewed `ClientSessionImpl`, because it has to implement the new specification change DRIVERS-1934: clarify drivers back off before all transaction retries (#1868).
- The last reviewed commit is 90ec4d5.
@stIncMale @nhachicha Flagging one more relevant spec test adjustment here: mongodb/specifications#1876
@dariakp Thank you for the heads up, I updated the PR description.
…Impl.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…s exceeded (ex operationContext.getTimeoutContext().getReadTimeoutMS())
…tionProseTest.java Co-authored-by: Valentin Kovalenko <valentin.male.kovalenko@gmail.com>
Co-authored-by: Valentin Kovalenko <valentin.male.kovalenko@gmail.com>
5c2145c to 36ecbf9
stIncMale
left a comment
This is a partial review, where I reviewed only ExponentialBackoff.java.
The last reviewed commit is 36ecbf9.
stIncMale
left a comment
This is a continuation of #1852 (review).
This is a partial review, where I reviewed only ClientSessionImpl.java in addition to ExponentialBackoff.java.
The last reviewed commit is 36ecbf9.
…mmediate retry without backoff.
The last reviewed commit is c97073e.
The test changes have not been re-reviewed. I'll re-review them soon:
- ExponentialBackoffTest
- AbstractClientSideOperationsTimeoutProseTest
- WithTransactionProseTest.testRetryBackoffIsEnforced/testExponentialBackoffOnTransientError
stIncMale
left a comment
The last reviewed commit is f0973a1.
I reviewed everything except for AbstractClientSideOperationsTimeoutProseTest.java.
    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    public class ExponentialBackoffTest {
    - public class ExponentialBackoffTest {
    + class ExponentialBackoffTest {
    assertTrue(backoff >= 0 && backoff <= Math.round(expectedMaxValues[attemptNumber - 1]),
            String.format("Attempt %d: backoff should be 0-%d ms, got: %d", attemptNumber,
                    Math.round(expectedMaxValues[attemptNumber - 1]), backoff));
- Let's introduce the `expectedBackoff` variable to avoid repeating `Math.round(expectedMaxValues[attemptNumber - 1])`; this also makes the assertion more readable.
- I am proposing to replace `0-%d ms` with `between 0 ms and %d ms`. Seeing something like "should be 0-5 ms" is a bit weird. Either plain English (what I suggested) or proper math notation seems better.
    - assertTrue(backoff >= 0 && backoff <= Math.round(expectedMaxValues[attemptNumber - 1]),
    -         String.format("Attempt %d: backoff should be 0-%d ms, got: %d", attemptNumber,
    -                 Math.round(expectedMaxValues[attemptNumber - 1]), backoff));
    + long expectedBackoff = Math.round(expectedMaxValues[attemptNumber - 1]);
    + assertTrue(backoff >= 0 && backoff <= expectedBackoff,
    +         String.format("Attempt %d: backoff should be between 0 ms and %d ms, got: %d", attemptNumber, expectedBackoff, backoff));
    @Test
    void testTransactionRetryBackoffRespectsMaximum() {
        // Even at high attempt numbers, backoff should never exceed 500ms
        for (int attemptNumber = 1; attemptNumber < 26; attemptNumber++) {
            long backoff = ExponentialBackoff.calculateTransactionBackoffMs(attemptNumber);
            assertTrue(backoff >= 0 && backoff <= 500,
                    String.format("Attempt %d: backoff should be capped at 500 ms, got: %d ms", attemptNumber, backoff));
        }
    }
Let's make ExponentialBackoff.TRANSACTION_MAX_MS package-access and annotate with @VisibleForTesting(otherwise = PRIVATE). Then use TRANSACTION_MAX_MS here in both the comment (as ExponentialBackoff.TRANSACTION_MAX_MS) and the code & the format argument. This way a reader does not have to look where the magic number "500" comes from, and if it ever changes, the test won't need to be updated.
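A minimal sketch of what this suggestion could look like, under stated assumptions: `VisibleForTestingSketch` is a hypothetical stand-in for the driver's `@VisibleForTesting` annotation (declared here only so the example compiles on its own), `BackoffWithVisibleMax` is a simplified placeholder rather than the real `ExponentialBackoff`, and the check uses a plain `AssertionError` instead of JUnit.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical stand-in for the driver's @VisibleForTesting annotation. */
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.FIELD)
@interface VisibleForTestingSketch {
}

/** Simplified placeholder class; only the visibility of the constant matters here. */
final class BackoffWithVisibleMax {
    @VisibleForTestingSketch
    static final long TRANSACTION_MAX_MS = 500; // package-access instead of private

    private BackoffWithVisibleMax() {
    }

    /** Placeholder calculation: the full (jitter = 1) delay, capped at TRANSACTION_MAX_MS. */
    static long calculateBackoffMs(final int attemptNumber) {
        return Math.min(TRANSACTION_MAX_MS, Math.round(5 * Math.pow(1.5, attemptNumber - 1)));
    }
}

final class BackoffWithVisibleMaxCheck {
    private BackoffWithVisibleMaxCheck() {
    }

    /** Reads the cap from the constant, so the literal 500 is not repeated in the test. */
    static void backoffRespectsMaximum() {
        for (int attemptNumber = 1; attemptNumber < 26; attemptNumber++) {
            long backoff = BackoffWithVisibleMax.calculateBackoffMs(attemptNumber);
            if (backoff < 0 || backoff > BackoffWithVisibleMax.TRANSACTION_MAX_MS) {
                throw new AssertionError(String.format(
                        "Attempt %d: backoff should be capped at %d ms, got: %d ms",
                        attemptNumber, BackoffWithVisibleMax.TRANSACTION_MAX_MS, backoff));
            }
        }
    }
}
```

If the cap ever changes, only the constant needs updating; the loop and its failure message follow automatically.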
    @Test
    void testTransactionRetryBackoff() {
        // Test that the backoff sequence follows the expected pattern with growth factor 1.5
We duplicate the value of TRANSACTION_GROWTH here, but not the other parameters TRANSACTION_BASE_MS and TRANSACTION_MAX_MS. We should either refer to all of them (without duplicating their values), or to none. I propose not to mention the parameters at all, because we simply test the calculateTransactionBackoffMs method, and the parameters are implied by the method.
The same suggestion applies to the Expected backoffs with jitter=1.0 and growth factor 1.5 comment in testCustomJitter.
    public class ExponentialBackoffTest {

        @Test
        void testTransactionRetryBackoff() {
The method we test here is called calculateTransactionBackoffMs.
- It does not have "retry" in the name.
- Shouldn't the test name be `testCalculateTransactionBackoffMs`?
The same applies to the testTransactionRetryBackoffRespectsMaximum test name.
    private boolean canRunTests() {
    /**
     * See
     * <a href="https://github.com/mongodb/specifications/blob/master/source/transactions-convenient-api/tests/README.md#retry-backoff-is-enforceds">Convenient API Prose Tests</a>.
    -  * <a href="https://github.com/mongodb/specifications/blob/master/source/transactions-convenient-api/tests/README.md#retry-backoff-is-enforceds">Convenient API Prose Tests</a>.
    +  * <a href="https://github.com/mongodb/specifications/blob/master/source/transactions-convenient-api/tests/README.md#retry-backoff-is-enforced">Retry Backoff is Enforced</a>.
    failPointAdminDb.runCommand(Document.parse("{'configureFailPoint': 'failCommand', 'mode': {'times': 13}, " + "'data': {'failCommands': ['commitTransaction'], 'errorCode': 251}}"));

    long withBackoffTime;
    try (ClientSession session = client.startSession()) {
        long startNanos = System.nanoTime();
        session.withTransaction(() -> {
            collection.insertOne(session, Document.parse("{ _id : 'backoff-test-full-jitter' }"));
            return null;
        });
        long endNanos = System.nanoTime();
        withBackoffTime = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    } finally {
        ExponentialBackoff.clearTestJitterSupplier();
        failPointAdminDb.runCommand(Document.parse("{'configureFailPoint': 'failCommand', 'mode': 'off'}"));
    }
This comment is still relevant. The code pieces that duplicate each other are https://github.com/nhachicha/mongo-java-driver/blob/f0973a1d537c91911d2be7c44b9960c1b2ddf9dd/driver-sync/src/test/functional/com/mongodb/client/WithTransactionProseTest.java#L222-L237 and https://github.com/nhachicha/mongo-java-driver/blob/f0973a1d537c91911d2be7c44b9960c1b2ddf9dd/driver-sync/src/test/functional/com/mongodb/client/WithTransactionProseTest.java#L239-L253.
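One way the two duplicated timing blocks could be folded into a single helper is sketched below. This is an illustration only, not the change that was eventually made in the PR: `TimedRunSketch` and `timeMillis` are hypothetical names, and the three `Runnable` hooks stand in for installing the jitter supplier and fail point, calling `withTransaction`, and undoing the setup.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative helper; the Runnable hooks are hypothetical placeholders. */
final class TimedRunSketch {
    private TimedRunSketch() {
    }

    /**
     * Runs setUp, measures how long body takes, and always runs tearDown,
     * even when setUp or body throws.
     *
     * @return elapsed time of body in milliseconds
     */
    static long timeMillis(final Runnable setUp, final Runnable body, final Runnable tearDown) {
        try {
            setUp.run();
            long startNanos = System.nanoTime();
            body.run();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        } finally {
            tearDown.run();
        }
    }
}
```

With such a helper, the run with jitter 0 and the run with jitter 1 would differ only in the `setUp` argument, so the measurement and cleanup logic would exist in one place.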
    // Test 2: Run with jitter = 1 (full backoff)
    ExponentialBackoff.setTestJitterSupplier(() -> 1.0);

    failPointAdminDb.runCommand(Document.parse("{'configureFailPoint': 'failCommand', 'mode': {'times': 13}, " + "'data': {'failCommands': ['commitTransaction'], 'errorCode': 251}}"));
This will go away on its own when #1852 (comment) is done.
    long expectedWithBackoffTime = noBackoffTime + 2200; // 2.2 seconds as per spec
    long actualDifference = Math.abs(withBackoffTime - expectedWithBackoffTime);

    assertTrue(actualDifference < 1000, String.format("Expected withBackoffTime to be ~%dms (noBackoffTime %dms + 2200ms), " + "but got %dms. Difference: %dms (tolerance: 1000ms per spec)", expectedWithBackoffTime, noBackoffTime, withBackoffTime, actualDifference));
This comment is still relevant. The code it applies to now is https://github.com/nhachicha/mongo-java-driver/blob/f0973a1d537c91911d2be7c44b9960c1b2ddf9dd/driver-sync/src/test/functional/com/mongodb/client/WithTransactionProseTest.java#L255-L260.
    long actualDifference = Math.abs(withBackoffTime - expectedWithBackoffTime);

    assertTrue(actualDifference < 1000, String.format("Expected withBackoffTime to be ~% dms (noBackoffTime %d ms + 1800 ms), but"
            + " got %d ms. Difference: %d ms (tolerance: 1000 ms per spec)", expectedWithBackoffTime, noBackoffTime, withBackoffTime,
We should use 500 ms instead of 1000 ms: https://github.com/mongodb/specifications/blob/9e2acc6c9f77977cdbc0705cd8e236ed4d1b6551/source/transactions-convenient-api/tests/README.md?plain=1#L102.
stIncMale
left a comment
The last reviewed commit is f0973a1.
    commandListener = new TestCommandListener();

    if (!isAsync()) {
        // setting jitter to 0 to make test using withTransaction deterministic (i.e retries immediately) otherwise we might get
test -> tests?
    if (!isAsync()) {
        ExponentialBackoff.clearTestJitterSupplier();
    }
This method looks sketchy. Let's write it such that every step is guaranteed regardless of the success of the previous steps (try-with-resources comes in really handy here): stIncMale@a27a453.
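The idea of guaranteeing every cleanup step can be sketched with try-with-resources as follows. This is not the code from the linked commit; `CleanupSketch`, `Step`, and `runAll` are hypothetical names, and the three steps stand in for whatever the real tear-down method needs to do (for example clearing the jitter supplier and closing clients).

```java
/** Illustrative only: runs independent cleanup steps so that one failure does not skip the rest. */
final class CleanupSketch {
    private CleanupSketch() {
    }

    /** A cleanup step; narrows AutoCloseable.close() so that it declares no checked exception. */
    interface Step extends AutoCloseable {
        @Override
        void close();
    }

    /**
     * Runs first, then second, then third. Resources are closed in reverse
     * declaration order, hence the reversed declarations below. If a step
     * throws, the remaining steps still run; the first exception is rethrown
     * and later ones are attached to it as suppressed exceptions.
     */
    static void runAll(final Step first, final Step second, final Step third) {
        try (Step t = third; Step s = second; Step f = first) {
            // Intentionally empty: all the work happens when the resources are closed.
        }
    }
}
```

Compared with sequential calls, a failure in an early step no longer leaves later state (such as a test-only jitter supplier) installed for subsequent tests.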
Relevant specification changes:
JAVA-5950, JAVA-6046