-
Notifications
You must be signed in to change notification settings - Fork 266
feat: add retryable property for Error #1383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@@ -134,6 +134,8 @@ pub struct Error { | |||
|
|||
source: Option<anyhow::Error>, | |||
backtrace: Backtrace, | |||
|
|||
retryable: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite confusing. Whether it's retryable should be determined by consumer by ErrorKind
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether it's retryable should be determined by consumer by
ErrorKind
.
My current idea is to introduce an ErrorStatus
:
temporary
: This error is temporary, and users can retry the operation if they want.permanent
: This error is permanent, so users can't retry it. Even if they do, it's highly likely that it won't work.
In this way, we convey the status, which is only known to us and is orthogonal of ErrorKind
instead of the final decision, retryable
.
For example, Unexpected
error can be permanent
and PreconditionFailed
error can be temporary
.
This is also very useful in cases like our REST catalog client. We can offer built-in retry support and convert a temporary
error into a permanent
one after several retry attempts.
I also think it would be a good idea to use a temporary: bool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite confusing. Whether it's retryable should be determined by consumer by ErrorKind.
temporary: bool
+1. I think the basic idea here is to let retryable or temporary property orthogonal of ErrorKind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, Unexpected error can be permanent and PreconditionFailed error can be temporary.
I don't quite understand why PreconditionFailed
can be temporrary. For example, PreconditionFailed
could be invalid input from user, and it should be permanent
.
I think the basic idea here is to let retryable or temporary property orthogonal of ErrorKind.
I'm still confusing about this part. Let's think about the use case, usually the caller of a function determines whether they need to retry, depending on the return error. For example, the user could decide to retry strategy when they face network timeout failure.
If we add the status
or temporary
field, it's the function telling its caller whether they should retry. My concern is that adding such a field a general purpose Error
could be misleading or confusing. If we add this field into a concrete ErrorKind
variant such as CommitConflict(bool)
, it makes more sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add the status or temporary field, it's the function telling its caller whether they should retry. My concern is that adding such a field a general purpose Error could be misleading or confusing.
This example makes sense to me.🤔 There are two kinds of retry process: 1. commit conflict, 2. other(maybe network timeout as @liurenjie1024 says). It seems that in the SDK, at least for now, we only try to retry the process when we meet CommitConflict. (In Java, exceptions only retry when they meet CommitFailedException). For other errors, maybe it should be determined by the user. And we can determine whether need to let retry orthogonal of ErrorKind until we have multiple kinds of errors that need to be retried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand why
PreconditionFailed
can be temporrary.
In cases like concurrent commit. Users can retry until this operation succeed or give up while retry times reached.
Take S3 as an example:
https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
ConditionalRequestConflict
A conflicting operation occurred. If using PutObject you can retry the request. If using multipart upload you should initiate another CreateMultipartUpload request and re-upload each part.
If we add the
status
ortemporary
field, it's the function telling its caller whether they should retry.
I think that's exactly why I want to use "temporary" here. We're telling users that this error is temporary, allowing them to decide how to proceed based on their own context.
A temporary error doesn't necessarily mean it should be retried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In cases like concurrent commit. Users can retry until this operation succeed or give up while retry times reached.
Shouldn't this return by the catalog client as a CommitConflict?
Which issue does this PR close?
This PR is the first part of #964.
What changes are included in this PR?
Before we support retryable process, this PR add retryable property for Error.
Are these changes tested?