React more promptly to task cancellation while waiting for the cluster to unblock #128737

nielsbauman · 2025-06-02T08:57:46Z

Instead of waiting for the next run of the ClusterStateObserver (which might be arbitrarily far in the future, but bound by the timeout if one is set), we notify the listener immediately that the task has been cancelled. While doing so, we ensure we invoke the listener only once.

Fixes #117971
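As a rough illustration of the "invoke the listener only once" guarantee described above, here is a minimal sketch using only the JDK. It is not the code from this PR: the class name `CancellationRaceSketch`, the method `raceOnce`, and the thread names are hypothetical, and the listener is modelled as a simple counter. Two threads stand in for the cancellation callback and the cluster state observer; whichever wins the `compareAndSet` notifies the listener and the other backs off.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from Elasticsearch.
final class CancellationRaceSketch {

    // Returns how many times the "listener" ran when the cancellation callback
    // and the cluster-state callback race each other on two threads.
    static int raceOnce() {
        AtomicBoolean done = new AtomicBoolean(false);
        AtomicInteger listenerCalls = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);

        Runnable contender = () -> {
            try {
                start.await(); // release both threads at the same moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Only the first caller flips the flag and notifies the listener.
            if (done.compareAndSet(false, true)) {
                listenerCalls.incrementAndGet();
            }
        };

        Thread cancellation = new Thread(contender, "cancellation");
        Thread observer = new Thread(contender, "cluster-state-observer");
        cancellation.start();
        observer.start();
        start.countDown();
        try {
            cancellation.join();
            observer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return listenerCalls.get();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            if (raceOnce() != 1) {
                throw new AssertionError("listener was not invoked exactly once");
            }
        }
        System.out.println("listener invoked exactly once in every run");
    }
}
```

The point of the sketch is that the cancellation path does not need to wait for the observer to run again: it can notify right away, and the shared flag keeps the later callback from notifying a second time.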

elasticsearchmachine · 2025-06-02T08:58:13Z

Hi @nielsbauman, I've created a changelog YAML for you.

elasticsearchmachine · 2025-06-02T08:58:15Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

DaveCTurner

Looks good, just a naming/comment nit.

DaveCTurner · 2025-06-02T09:05:09Z

...r/src/main/java/org/elasticsearch/action/support/local/TransportLocalClusterStateAction.java

@@ -104,20 +106,40 @@ private void waitForClusterUnblock(
            logger,
            clusterService.threadPool().getThreadContext()
        );
+        // We track whether we already notified the listener of cancellation, to avoid invoking the listener twice.
+        final var notifiedCancellation = new AtomicBoolean(false);


Naming is a bit odd, this gets set to true even if we haven't notified the listener of cancellation. Maybe waitComplete?

I was going to suggest using org.elasticsearch.action.ActionListener#notifyOnce but it's more subtle than that: we want to suppress the cancellation listener before calling innerDoExecute. I think that deserves a comment.

nielsbauman · 2025-06-02

I renamed the variable to notifiedListener. Even though we don't notify the listener directly in onNewClusterState, we do in innerDoExecute. I also updated the comment. Let me know what you think.

DaveCTurner · 2025-06-03

Yeah I think that's still going to be confusing to some future reader. The flag doesn't indicate we've completed the listener, it indicates that the wait for an appropriate cluster state in TransportLocalClusterStateAction is over and we've started to execute the action, so I think waitComplete would be a better name.

nielsbauman · 2025-06-03

Alright, renamed to waitComplete in 3c6aafa.
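The distinction discussed in this thread (the flag marks the end of the wait, not the completion of the listener) can be sketched roughly as follows. This is a simplified, single-threaded model and not the actual TransportLocalClusterStateAction code: the class `WaitCompleteSketch`, the `events` list, and `pendingCompletion` are hypothetical stand-ins, and the real method signatures differ. The action's response is deliberately deferred to show that the flag flips before the listener is completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the pattern; names are illustrative only.
final class WaitCompleteSketch {
    // True once the wait is over, either through cancellation or because
    // a suitable cluster state arrived and the action started executing.
    final AtomicBoolean waitComplete = new AtomicBoolean(false);
    final List<String> events = new ArrayList<>();
    Runnable pendingCompletion; // models action work that completes the listener later

    // Cancellation callback: notify immediately, but only while still waiting.
    void onCancelled() {
        if (waitComplete.compareAndSet(false, true)) {
            events.add("listener: cancelled");
        }
    }

    // Observer callback: suppress the cancellation path *before* starting the action.
    void onNewClusterState() {
        if (waitComplete.compareAndSet(false, true) == false) {
            return; // the listener was already told about the cancellation
        }
        events.add("action started");
        pendingCompletion = () -> events.add("listener: response");
    }

    public static void main(String[] args) {
        WaitCompleteSketch sketch = new WaitCompleteSketch();
        sketch.onNewClusterState();     // the wait ends and the action starts
        sketch.onCancelled();           // too late for the wait logic: ignored here
        sketch.pendingCompletion.run(); // the action completes the listener itself
        System.out.println(sketch.events); // prints [action started, listener: response]
    }
}
```

A wrapper that only deduplicates listener completions would not give this ordering, because in the sketch the flag is already set while the listener is still pending. How the running action itself reacts to a late cancellation is outside the scope of this model.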

DaveCTurner

LGTM, but also consider extracting a utility for this pattern to simplify the compareAndSet things for future readers. ... (false, true) == false is pretty hard to parse. Something like this perhaps?

diff --git a/libs/core/src/main/java/org/elasticsearch/core/Predicates.java b/libs/core/src/main/java/org/elasticsearch/core/Predicates.java
index bd8c1517323..88c4f138967 100644
--- a/libs/core/src/main/java/org/elasticsearch/core/Predicates.java
+++ b/libs/core/src/main/java/org/elasticsearch/core/Predicates.java
@@ -9,6 +9,8 @@

 package org.elasticsearch.core;

+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.function.BooleanSupplier;
 import java.util.function.Predicate;

 /**
@@ -90,4 +92,22 @@ public enum Predicates {
     public static <T> Predicate<T> never() {
         return (Predicate<T>) NEVER;
     }
+
+    private static class OnceTrue extends AtomicBoolean implements BooleanSupplier {
+        OnceTrue() {
+            super(true);
+        }
+
+        @Override
+        public boolean getAsBoolean() {
+            return getAndSet(false);
+        }
+    }
+
+    /**
+     * @return a {@link BooleanSupplier} which supplies {@code true} the first time it is called, and {@code false} subsequently.
+     */
+    public static BooleanSupplier once() {
+        return new OnceTrue();
+    }
 }
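For readers comparing the two styles, the following sketch shows how a helper like the one in the diff above might read at a call site. To keep it self-contained it carries a local copy of the suggested `OnceTrue` class inside a hypothetical `OnceSketch` wrapper rather than importing `org.elasticsearch.core.Predicates`; the `main` method and variable names are assumptions for illustration only.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical wrapper class; the helper is copied from the suggested diff
// so that the example compiles without Elasticsearch on the classpath.
final class OnceSketch {

    private static final class OnceTrue extends AtomicBoolean implements BooleanSupplier {
        OnceTrue() {
            super(true);
        }

        @Override
        public boolean getAsBoolean() {
            // Atomically hand out the single "true" and leave "false" behind.
            return getAndSet(false);
        }
    }

    static BooleanSupplier once() {
        return new OnceTrue();
    }

    public static void main(String[] args) {
        // Before: the guard reads as a double negative.
        AtomicBoolean waitCompleteFlag = new AtomicBoolean(false);
        boolean alreadyComplete = waitCompleteFlag.compareAndSet(false, true) == false;
        System.out.println("compareAndSet style, already complete: " + alreadyComplete); // false

        // After: the supplier answers "am I the first caller?" directly.
        BooleanSupplier firstCaller = once();
        System.out.println("first call:  " + firstCaller.getAsBoolean()); // true
        System.out.println("second call: " + firstCaller.getAsBoolean()); // false
    }
}
```

Both forms rely on a single atomic read-modify-write, so the once-only guarantee is the same; the difference is in how the condition reads at the call site.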

nielsbauman · 2025-06-03T06:55:15Z

Great suggestion, thanks! Applied in 0925808.

DaveCTurner

LGTM still

nielsbauman requested a review from DaveCTurner June 2, 2025 08:57

nielsbauman added the >enhancement, :Distributed Coordination/Task Management, Team:Distributed Coordination, and v9.1.0 labels Jun 2, 2025

Update docs/changelog/128737.yaml

30af288

DaveCTurner reviewed Jun 2, 2025

nielsbauman added 3 commits June 2, 2025 13:09

Rename variable and update comment

352a93b

Rename variable

3c6aafa

Merge branch 'main' into task-cancellation

3896224

DaveCTurner approved these changes Jun 3, 2025

Extract utility

0925808

nielsbauman requested a review from a team as a code owner June 3, 2025 06:55

nielsbauman enabled auto-merge (squash) June 3, 2025 06:55

DaveCTurner approved these changes Jun 3, 2025

nielsbauman merged commit f988611 into elastic:main Jun 3, 2025
17 of 18 checks passed

nielsbauman deleted the task-cancellation branch June 3, 2025 08:00
