Clarification and/or configurability of what it means for a server to be dead/down #7290

While working with @djcarlin and @bryancall to understand an issue that came up when deploying the dead-server no-retry feature added in PR #7142, we were surprised by exactly when servers are marked dead.

When transactions are retryable, things worked pretty much as expected. If the handshake fails, or the request is sent but the origin fails to return data and the request is retryable, the address is tried the number of times specified in proxy.config.http.connect_attempts_rr_retries (at least once PR #7288 is applied) before the IP address is marked down and the next IP address is tried.

However, if the transaction fails after the header has been sent and it is not retryable (e.g. a POST request), the IP address is marked down immediately, and the retry count in proxy.config.http.connect_attempts_rr_retries is ignored. If the origin only times out now and again on larger requests, taking it down immediately seems bad, particularly with the new feature that avoids retries against a down server during the dead period. On the other hand, if the server is consistently failing to respond to POST requests, it should be marked down.
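
To make the two cases above concrete, here is a minimal Python model of the behavior as I understand it. This is not ATS code: `AddressState` and `record_failure` are hypothetical names, `rr_retries` stands in for proxy.config.http.connect_attempts_rr_retries, and the exact threshold comparison is my assumption.

```python
from dataclasses import dataclass


@dataclass
class AddressState:
    """Per-IP failure bookkeeping (hypothetical model, not ATS internals)."""
    failures: int = 0
    down: bool = False


def record_failure(state: AddressState, retryable: bool, rr_retries: int) -> AddressState:
    """Apply one failed origin transaction to an address.

    rr_retries stands in for proxy.config.http.connect_attempts_rr_retries.
    """
    if state.down:
        return state
    if not retryable:
        # Non-retryable failure after the header was sent (e.g. a POST):
        # the address is marked down at once and the retry count is ignored.
        state.down = True
        return state
    # Retryable failure: count it, and mark the address down only once the
    # configured number of attempts has been used up (assumed threshold).
    state.failures += 1
    if state.failures >= rr_retries:
        state.down = True
    return state
```

With `rr_retries=3`, a GET that keeps failing takes three attempts to mark the address down, while a single timed-out POST marks it down on the first failure.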

The criteria for this down decision probably need to be configurable, since some origins need different criteria than others. Some should only be marked down if the initial handshake fails. Others should be marked down, but only if no data was returned. Or maybe you want to mark things down only for specific origin connection failures.
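
One possible shape for such a setting, sketched in Python purely for illustration: each origin gets a policy naming the failure kinds that count toward marking it down. The `Failure` enum, the two policy names, and `counts_toward_down` are all hypothetical and do not correspond to any existing ATS configuration.

```python
from enum import Enum, auto


class Failure(Enum):
    """Kinds of origin failure (hypothetical classification)."""
    HANDSHAKE = auto()         # connect or TLS handshake failed
    NO_RESPONSE = auto()       # request sent, no data came back
    PARTIAL_RESPONSE = auto()  # some data came back before the failure


# Hypothetical per-origin policies matching the cases described above.
HANDSHAKE_ONLY = frozenset({Failure.HANDSHAKE})
NO_DATA = frozenset({Failure.HANDSHAKE, Failure.NO_RESPONSE})


def counts_toward_down(failure: Failure, policy: frozenset) -> bool:
    """Return True if this failure should count against the address."""
    return failure in policy
```

Under `HANDSHAKE_ONLY`, a POST that times out after the header is sent would not count against the address at all; under `NO_DATA` it would, as long as nothing came back from the origin.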
