Switch the deployment strategy based on external condition (PV type) #15168
Comments
@smarterclayton is it reasonable to emit a warning (event/condition/etc) saying that rolling with RWO will fail to roll? I don't think we should decide the strategy for the user automatically based on "external" inputs (like PVC type). Also we can maybe fail the rollout before we actually create the deployer pod when we know in advance the rollout will fail (rolling + rwo). |
Agreed with @mfojtik - we already do a lot of magic with triggers in the spec. I thought |
@Kargakis i corrected myself ;P |
I don't think the console is showing a special warning for this today, but it sounds like something to consider if we know it's always going to fail. |
The problem from the perspective of the Overview is that we don't get PVC details at all today. PVCs are relatively stable, might be something we could just list, or slow poll. @spadgett probably other things we could be showing relative to PVCs used by deployments, like this deployment config references PVCs that are not bound? |
@jwforres as far as i remember when the RWO volumes are bound to a DC with rolling strategy we fail but the error is hidden in events and it is not really clear ;-) (you get some nasty storage error)... :-) "Looks like you have RWO volume with rolling strategy, do you want to change it?" |
I don't know that a warning is necessary - it's totally valid to do this
for a deployment. In fact, this is the correct way on openshift today to
do a DB at scale 1 on AWS or gce. So warning is a bit much. *But*, it's
probably something we should "inform" them of if they have scale > 1, and
they might be better off with recreate for scale 1 (the advantage of
rolling is that the new pod will complete the pull prior to the old pod
going down)
…On Thu, Jul 13, 2017 at 10:15 AM, Michal Fojtik ***@***.***> wrote:
Maybe time for:
[image: gsmarena_001]
<https://user-images.githubusercontent.com/44136/28170692-6ede73da-67e6-11e7-980d-8669c925065c.jpg>
:-)
|
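The distinction drawn above (nothing to say for Recreate, an informational note for Rolling with RWO at scale > 1, a softer hint at scale 1) could be sketched roughly as follows. This is only an illustration: the function name, its parameters, and the message wording are hypothetical and are not part of the console or of any OpenShift API.

```python
from typing import List, Optional

RWO = "ReadWriteOnce"


def rwo_rollout_advice(strategy: str, replicas: int,
                       pvc_access_modes: List[List[str]]) -> Optional[str]:
    """Return an informational message, or None when there is nothing to say.

    pvc_access_modes holds the access modes of every PVC the deployment
    references (hypothetical input, e.g. gathered by listing PVCs).
    """
    # A claim counts as RWO-only when ReadWriteOnce is its only access mode.
    uses_rwo_only = any(set(modes) == {RWO} for modes in pvc_access_modes)
    if strategy != "Rolling" or not uses_rwo_only:
        return None
    if replicas > 1:
        return "Rolling with an RWO volume at scale > 1: pods may not all be able to mount it."
    return "Rolling with an RWO volume at scale 1: Recreate may be a better fit."


print(rwo_rollout_advice("Rolling", 3, [["ReadWriteOnce"]]))
```

The function returns a message rather than raising, since the comment above argues this combination is valid and should be reported as information, not as an error.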
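@smarterclayton if I have RWO PV and at the same time Rolling, the deployment always gets stuck, even with replicas=1. E.g. in Online we use the Recreate strategy by default for persistent DBs, so I went to Online Starter and took screenshots after switching from Recreate to Rolling. Amazon EBS and GCE based PVs only allow RWO mode, so if you set Rolling on a database deployment with a PV backed by these technologies you will never be able to trigger a new deployment. |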
Something else is wrong, that's not how the system should behave. Rolling
deployment marks the old pod as deleted, which allows the cluster to detach
the volume. You're likely hitting a bug you should be reporting to @bchilds
…On Thu, Jul 13, 2017 at 11:00 AM, Marek Jelen ***@***.***> wrote:
@smarterclayton <https://github.com/smarterclayton> if I have RWO PV and
at the same time Rolling, the deployment gets always stuck, even with
replicas=1. E.g. in online we do for persistent DBs with Recreate
strategy by default, and so I went to Online Starter and took these
screenshots after switching from Recreate to Rolling.
[image: screen shot 2017-07-13 at 16 49 23]
<https://user-images.githubusercontent.com/156068/28172385-b446e7f4-67eb-11e7-8c7a-afd99d40480f.png>
[image: screen shot 2017-07-13 at 16 49 38]
<https://user-images.githubusercontent.com/156068/28172393-b93d721e-67eb-11e7-8fdb-824e978e7e67.png>
[image: screen shot 2017-07-13 at 16 55 48]
<https://user-images.githubusercontent.com/156068/28172557-27145ad2-67ec-11e7-96e9-b36f3069ae55.png>
[image: screen shot 2017-07-13 at 16 56 02]
<https://user-images.githubusercontent.com/156068/28172561-2b0fc0e0-67ec-11e7-9df5-91f11855d6fe.png>
[image: screen shot 2017-07-13 at 16 59 46]
<https://user-images.githubusercontent.com/156068/28172720-a7f12db0-67ec-11e7-9b6a-95360fff7ddb.png>
Amazon EBS and GCE based PVs only allow RWO mode and so if you set Rolling
on a database deployment with a PV with these technologies you will never
be able to trigger new deployment.
|
@smarterclayton that is interesting :) During rolling strategy there have to be two pods (for replicas=1). These two pods are with high probability running on two different machines, an RWO volume can be attached to only one pod, and the underlying tech can usually be attached to only one machine. If I trigger a redeploy and it behaves as you describe, I will lose the PV from the original pod; however, the application in that pod is not aware of that and can still write into the PV, which should be there but is not, since per your description it is detached from the pod. If
|
@smarterclayton when is the old pod marked as deleted? AFAIU until the new version is live and ready we cannot mark the old pod as deleted (and detach the persistent storage), as it will still receive traffic, since the endpoint will be listed in the service. Once the new pod is ready, the old pod is marked as terminating and the endpoint is removed from the service, but we still cannot detach the storage since we need to wait for the graceful shutdown, else we could be introducing a lot of application errors. And I hope we're not. |
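The ordering argued in the two comments above can be written down as a toy model. It assumes that with Rolling at replicas=1 the old pod keeps its volume until the new pod is ready, and that an RWO volume can be attached to only one node at a time. It models the reasoning in this thread, not the actual deployment controller; the function and node names are made up.

```python
from typing import Optional


def simulate_rollout(strategy: str, new_pod_node: str,
                     old_pod_node: str = "node-a") -> str:
    """Toy model of one rollout at replicas=1 with a single RWO volume."""
    # The volume starts attached to the node running the old pod.
    attached_to: Optional[str] = old_pod_node
    if strategy == "Recreate":
        # Recreate removes the old pod first, so the volume is released
        # before the new pod is scheduled.
        attached_to = None
    # The new pod can mount the volume only if it is free or already
    # attached to the node the new pod landed on.
    if attached_to is None or attached_to == new_pod_node:
        return "complete"
    # Assumed Rolling behaviour: the old pod is kept until the new one is
    # ready, and the new one cannot become ready without the volume.
    return "stuck"


print(simulate_rollout("Rolling", "node-b"))
print(simulate_rollout("Recreate", "node-b"))
```

Under these assumptions Rolling only completes when both pods happen to land on the same node, which matches the "gets stuck" reports above. If the old pod were marked deleted earlier, as suggested further up the thread, the model would not apply.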
@smarterclayton can you please follow up on the issue? thanks |
It's unlikely that we will automate any sort of spec mutation to handle this case. |
@Kargakis yes |
@Kargakis @mfojtik could the warning also be shown directly in Plus would like to get some clarification on what @smarterclayton says regarding the behaviour of RWO volumes, that is still confusing to me and I am not the only one who thinks the behaviour is supposed to be different than what @smarterclayton says. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
/lifecycle frozen |
Rolling strategy is not useful for deployments with RWO PVs.
Version
Steps To Reproduce
Rolling strategy
Current Result
When new deployment gets triggered, the deployment gets stuck.
Expected Result
The deployment strategy could be switched to Recreate to save the user from the need to figure out the problem and then changing the strategy manually.
Additional Information
N/A
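For the manual change mentioned under Expected Result, a minimal sketch of the patch body that switches a deployment config to Recreate. It assumes the strategy lives at `spec.strategy.type`; the helper name is hypothetical, and whether leftover rolling parameters also need clearing is not covered here.

```python
import json


def recreate_strategy_patch() -> str:
    """Build a JSON patch body that sets the strategy type to Recreate."""
    # Assumption: the strategy type is stored at spec.strategy.type.
    patch = {"spec": {"strategy": {"type": "Recreate"}}}
    return json.dumps(patch)


print(recreate_strategy_patch())
```

The resulting string could then be passed to something like `oc patch dc/<name> -p '<body>'`, with `<name>` replaced by the deployment config in question.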