Skip to content

Commit 953c322

Browse files
daschlMichael Nitschinger
authored andcommitted
SPY-176: Enhance redistribution logic and avoid possible deadlocks.
Motivation ---------- There have been issues reported that redistribution of operations does not work as expected, especially with authentication scenarios. This has been tracked down and the following changes have been made: Modifications ------------- - With the old redistribute logic, it could happen that subsequent ops in the retry queue got accidentally deleted. With the copy first, this cannot happen anymore. - On redistribute, if the handling node is still not set, just clone the operation to avoid NPEs. A op without a node set can happen if it is enqueued to retry because the target node is not yet authenticated. - Do not try to add operations to a node which is not yet authen ticated. This can lead to costly locks with redistributions since they are run from the IO thread. Without the change, it can happen that the IO thread waits for an auth latch, but is also responsible for telling listeners when auth has completed, therefore locking everything up until the auth latch wait runs out of time. Result ------ Much better resilience and performance with redistributions, especially if authentication takes longer than expected and from scenarios where the operations get redistributed/moved around from within the IO thread. Change-Id: Icbc5f9e4f568ea885500e8d2baedfa989c8ef801 Reviewed-on: http://review.couchbase.org/38669 Reviewed-by: Matt Ingenthron <matt@couchbase.com> Reviewed-by: Michael Nitschinger <michael.nitschinger@couchbase.com> Tested-by: Michael Nitschinger <michael.nitschinger@couchbase.com>
1 parent ca3c8d4 commit 953c322

File tree

1 file changed

+7
-2
lines changed

1 file changed

+7
-2
lines changed

src/main/java/net/spy/memcached/MemcachedConnection.java

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -473,8 +473,9 @@ private void handleOperationalTasks() throws IOException {
473473
}
474474

475475
if (!retryOps.isEmpty()) {
476-
redistributeOperations(new ArrayList<Operation>(retryOps));
476+
ArrayList<Operation> operations = new ArrayList<Operation>(retryOps);
477477
retryOps.clear();
478+
redistributeOperations(operations);
478479
}
479480

480481
handleShutdownQueue();
@@ -1042,7 +1043,7 @@ public void redistributeOperation(Operation op) {
10421043

10431044
// The operation gets redistributed but has never been actually written,
10441045
// it we just straight re-add it without cloning.
1045-
if (op.getState() == OperationState.WRITE_QUEUED) {
1046+
if (op.getState() == OperationState.WRITE_QUEUED && op.getHandlingNode() != null) {
10461047
addOperation(op.getHandlingNode(), op);
10471048
return;
10481049
}
@@ -1257,6 +1258,10 @@ public void insertOperation(final MemcachedNode node, final Operation o) {
12571258
* @param o the operation to add.
12581259
*/
12591260
protected void addOperation(final MemcachedNode node, final Operation o) {
1261+
if (!node.isAuthenticated()) {
1262+
retryOperation(o);
1263+
return;
1264+
}
12601265
o.setHandlingNode(node);
12611266
o.initialize();
12621267
node.addOp(o);

0 commit comments

Comments
 (0)