-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce compute listener #110400
Introduce compute listener #110400
Conversation
fb4657b
to
b560b5a
Compare
Hi @dnhatn, I've created a changelog YAML for you. |
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 on collecting the listener behavior into a named class.
Do you think it's appropriate to pull that class into it's own top level file? Do you think it's appropriate to make a test case specifically for it?
I know this has been quite the distraction and I'm proposing more distractions, but the listener behavior is pretty complex and a test might help other people to hold it in their head.
import java.util.concurrent.atomic.AtomicReference; | ||
|
||
/** | ||
* Collects exceptions occurred in compute, specifically ignores trivial exceptions such as {@link TaskCancelledException}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment isn't quite right I think - if there aren't any exceptions captured it'll capture the first "trivial exception". And it'll replace that with the first non-trivial exception it sees.
if (ExceptionsHelper.unwrapCause(first) != ExceptionsHelper.unwrapCause(e)) { | ||
if (exceptionPermits.tryAcquire()) { | ||
first.addSuppressed(e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getAndUpdate
is documented as requiring a side-effect-free function because it'll be retried sometimes. I think this is safe, but it's not really side effect free. Maybe it'd be more readable if we used an atomicarray or something. Either way, I spent a while reading this code and thinking about it, but I don't think the performance is all that critical here because it's deep into exception path.
@nik9000 Thank you for the feedback. I have made the ComputeLister its own class and added tests for it. I also updated the javadoc and implementation of FailureCollector to use concurrent queues. This should be ready for review again. Could you please take another look? |
Thanks Nik! |
Currently, if a child request fails, we automatically trigger cancellation for ES|QL requests. This can result in TaskCancelledException being collected by the RefCountingListener first, which then returns that exception to the caller. For example, if we encounter a CircuitBreakingException (429), we might incorrectly return a TaskCancelledException (400) instead. This change introduces the ComputeListener, a variant of RefCountingListener, which selects the most appropriate exception to return to the caller. I also integrated the following features into ComputeListener to simplify ComputeService: - Automatic cancellation of sub-tasks on failure. - Collection of profiles from sub-tasks. - Collection of response headers from sub-tasks.
Currently, if a child request fails, we automatically trigger cancellation for ES|QL requests. This can result in TaskCancelledException being collected by the RefCountingListener first, which then returns that exception to the caller. For example, if we encounter a CircuitBreakingException (429), we might incorrectly return a TaskCancelledException (400) instead. This change introduces the ComputeListener, a variant of RefCountingListener, which selects the most appropriate exception to return to the caller. I also integrated the following features into ComputeListener to simplify ComputeService: - Automatic cancellation of sub-tasks on failure. - Collection of profiles from sub-tasks. - Collection of response headers from sub-tasks.
Currently, if a child request fails, we automatically trigger cancellation for ES|QL requests. This can result in TaskCancelledException being collected by the RefCountingListener first, which then returns that exception to the caller. For example, if we encounter a CircuitBreakingException (429), we might incorrectly return a TaskCancelledException (400) instead. This change introduces the ComputeListener, a variant of RefCountingListener, which selects the most appropriate exception to return to the caller. I also integrated the following features into ComputeListener to simplify ComputeService: - Automatic cancellation of sub-tasks on failure. - Collection of profiles from sub-tasks. - Collection of response headers from sub-tasks.
Currently, if a child request fails, we automatically trigger cancellation for ES|QL requests. This can result in TaskCancelledException being collected by the RefCountingListener first, which then returns that exception to the caller. For example, if we encounter a CircuitBreakingException (429), we might incorrectly return a TaskCancelledException (400) instead. This change introduces the ComputeListener, a variant of RefCountingListener, which selects the most appropriate exception to return to the caller. I also integrated the following features into ComputeListener to simplify ComputeService: - Automatic cancellation of sub-tasks on failure. - Collection of profiles from sub-tasks. - Collection of response headers from sub-tasks.
ComputeResponse from old nodes may have a null value instead of an empty list for profiles. Relates elastic#110400 Closes elastic#110591
ComputeResponse from old nodes may have a null value instead of an empty list for profiles. Relates elastic#110400 Closes elastic#110591
Currently, if a child request fails, we automatically trigger cancellation for ES|QL requests. This can result in TaskCancelledException being collected by the RefCountingListener first, which then returns that exception to the caller. For example, if we encounter a CircuitBreakingException (429), we might incorrectly return a TaskCancelledException (400) instead.
This change introduces the ComputeListener, a variant of RefCountingListener, which ignores unimportant exceptions and selects the most appropriate exception to return to the caller. I also integrated the following features into ComputeListener to simplify ComputeService: