Handle EnvoyProxy Image version upgrades #1712
Comments
I am interested in picking this up :)
thanks @cnvergence !
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
@arkodg coming back to this after a while, could you please point me to where I should start? As for the E2E, I assume I should add a new scenario to the e2e test suite :)
hey, I know @chauhanshubham was looking into a similar test for control plane upgrades, which would invariably also upgrade envoy proxy. Should we just collapse those two e2e tests into one, where we perform an upgrade with a last known EG minor version, and ensure that
I'm concerned that a hitless in-place upgrade of envoy is not trivial. A graceful termination of envoy may require:
It's also important to avoid race conditions where a new instance of envoy is receiving traffic before it was configured (e.g. due to order of component restart, failures in new control plane version, etc.). Some prior art:
I executed a naive test:
The upgrade caused some client-facing failures during the test:
It's probably possible to tune some of the parameters mentioned in my previous comment to achieve a hitless upgrade under certain test conditions (RPS, connection reuse, HTTP version, ...). But I'm not sure that we can claim to have a hitless upgrade in general based on such a test. So, I propose that for the GA scope, we focus on an upgrade test that ensures requests converge to successful execution after the upgrade. A limited hitless upgrade test can be a stretch goal. In the future, we can explore:
WDYT?
hey @guydc I was hoping we could have some test for hitless upgrade in v1.0, with caveats that can hopefully be removed over time post GA
this should be fixed with #2633, keeping this open so that it can be validated with an e2e
fixed with #2862
No description provided.