-
Notifications
You must be signed in to change notification settings - Fork 472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds request retry attributes to HTTPRoute API Type #184
Conversation
@danehans: GitHub didn't allow me to request PR reviews from the following users: Miciah. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: danehans The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
api/v1alpha1/httproute_types.go
Outdated
// Support: Core | ||
// | ||
// +optional | ||
Timeout *metav1.Duration `json:"timeout,omitempty" protobuf:"bytes,3,opt,name=timeout"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this is read timeout, should we rename this to ReadTimeout?
We should be explicit here if this is TTFB (time to first byte) or not. It seems most proxies have this meaning.
Do we also need Connect Timeout for time to define timeout for the TCP handshake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL at my latest changes and let me know if you feel this is a good starting point for connection timeouts and retries.
api/v1alpha1/httproute_types.go
Outdated
// +optional | ||
Timeout *metav1.Duration `json:"timeout,omitempty" protobuf:"bytes,3,opt,name=timeout"` | ||
|
||
// RetryPolicy describes a policy for resending a failed request to a targetRef. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What defines a request as failed?
Will it be an L4 failure or L7 failure (idempotent HTTP methods)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the docs. A connection is considered failed when the connection is either refused (5xx) or times out.t
@hbagdi I just pushed |
@danehans: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I think this may overlap with @bowei's Service level config proposal. |
@@ -264,6 +264,59 @@ type ForwardToTarget struct { | |||
// | |||
// +optional | |||
TargetPort *TargetPort `json:"targetPort" protobuf:"bytes,2,opt,name=targetPort"` | |||
|
|||
// TimeoutPolicy describes a policy for configuring connection timeouts to targetRef. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how much we want to cater to specific implementations but I don't think Envoy supports this, specifically the timeout and retry config is at the route level, so you cannot set it differently for each weighted cluster destination.
Note that Knative has deprecated managing connection timeouts: https://github.com/knative/networking/blob/4b21f11ccfa7fbd2ab5c0a978e78e79f4fe728d8/pkg/apis/networking/v1alpha1/ingress_types.go#L240 |
// | ||
// Support: Core | ||
// | ||
Response metav1.Duration `json:"response" protobuf:"bytes,1,opt,name=response"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an inactivity timeout or a deadline to complete the transfer? The field's description is suggests it is a deadline (for example, if the response is large, and it takes too long, a proxy may terminate the transfer even while data are being transferred), in which case it should be named as follows:
Response metav1.Duration `json:"response" protobuf:"bytes,1,opt,name=response"` | |
ResponseDeadlineSeconds metav1.Duration `json:"response" protobuf:"bytes,1,opt,name=responseDeadlineSeconds"` |
On the other hand, if this is an inactivity timeout (for example, if the server doesn't send any data within this amount of time, a proxy may terminate the connection), then it should be named as follows:
Response metav1.Duration `json:"response" protobuf:"bytes,1,opt,name=response"` | |
ResponseTimeoutSeconds metav1.Duration `json:"response" protobuf:"bytes,1,opt,name=responseTimeoutSeconds"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, maybe "Seconds" should be omitted for metav1.Duration
, which is unmarshalled using time.ParseDuration
and thus accepts time-unit suffixes (https://github.com/kubernetes/apimachinery/blob/2456ebdaba229616fab2161a615148884b46644b/pkg/apis/meta/v1/duration.go#L31-L45). Oddly, I cannot find any APIs in k8s.io/api that use metav1.Duration
. Do you have examples of Kubernetes APIs that do use metav1.Duration
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, https://github.com/kubernetes/cloud-provider/blob/master/config/types.go has some examples of fields that use metav1.Duration
.
So I think the field should be named either responseDeadline
or responseTimeout
.
// | ||
// Support: Core | ||
// | ||
Idle metav1.Duration `json:"idle" protobuf:"bytes,2,opt,name=idle"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idle metav1.Duration `json:"idle" protobuf:"bytes,2,opt,name=idle"` | |
IdleTimeout metav1.Duration `json:"idle" protobuf:"bytes,2,opt,name=idleTimeout"` |
// | ||
// Support: Core | ||
// | ||
Interval metav1.Duration `json:"interval" protobuf:"bytes,2,opt,name=interval"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interval metav1.Duration `json:"interval" protobuf:"bytes,2,opt,name=interval"` | |
retryPeriod metav1.Duration `json:"interval" protobuf:"bytes,2,opt,name=retryPeriod"` |
@danehans: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@@ -658,6 +672,24 @@ message LocalObjectReference { | |||
optional string name = 3; | |||
} | |||
|
|||
// RetryPolicy describes a policy for retrying a connection until the | |||
// connection is either refused or times out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Envoy has more fine-grained policy setting.
And Knative (net-contour) leverages it - https://github.com/knative-sandbox/net-contour/blob/42ebe377c7c9d1e62fe4265ff3ef0cee285e8be3/pkg/reconciler/contour/resources/httpproxy.go#L80-L95
So it would be great if we could have some optional field to customize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we would like to know the default retry policy.
From 4/14/21 community meeting:
|
I think that with GEP #713 we've determined that this is a better fit for policy attachment than being built into the route. I'm going to close this in favor of that approach but anyone can feel free to reopen if needed. /close |
@robscott: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Extends
ForwardToTarget
of typeHTTPRoute
to include per-target request timeouts and retries.Partially fixes #58.
/assign @bowei
/cc @Miciah @robscott