@@ -585,8 +585,10 @@ enhancement:
585585
586586This feature involves changes to the kubelet and the APIServer, but they are not strongly coupled.
587587
588- an n-3 kubelet will not able to fail the mis-scheduled pods. User can still manually delete the pods. Otherwise it should be fine.
589- an new kubelt can also work with old APIServer. Although this should not happen.
588+ An n-3 kubelet will not be able to fail the mis-scheduled pods. The mis-scheduled pods will be stuck in ContainerCreating status.
589+ If the kubelet is upgraded afterwards, it will properly fail those pods.
590+ Users can also manually delete the pods if they don't want to upgrade the kubelet soon.
591+ If the user does not actually update the PV nodeAffinity, there will be no such mis-scheduled pods and everything should be fine.
590592
591593kube-scheduler is not directly affected.
592594It just reads the latest PV nodeAffinity for scheduling decisions, regardless of whether it's being updated or not.
@@ -651,9 +653,10 @@ PV `spec.nodeAffinity` becomes mutable.
651653If a pod is scheduled to a node that is incompatible with the PV's nodeAffinity, the pod will fail.
652654Previously, it would be stuck at `ContainerCreating` status.
653655
654- This should be rare, since we don't allow PV nodeAffinity to be updated,
656+ This should be rare before enabling this feature, since we don't allow PV nodeAffinity to be updated,
655657nor can a CSI driver change the topology reported from NodeGetInfo.
656658So this is only possible if the user edited the node labels manually, or is running an incompatible scheduler.
659+ Existing workflows are unlikely to be affected by this behavior change.
657660
658661###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
659662
@@ -690,7 +693,7 @@ You can take a look at one potential example of such test in:
690693https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
691694-->
692695
693- Yes. unit test will verify the validation and kubelet behavior when the feature gate is enabled or disabled.
696+ Unit tests will be added to verify the validation and kubelet behavior when the feature gate is enabled or disabled.
694697
695698### Rollout, Upgrade and Rollback Planning
696699
@@ -766,7 +769,12 @@ and operation of this feature.
766769Recall that end users cannot usually observe component logs or access metrics.
767770-->
768771
769- See a previously Pending or ContainerCreating Pod now properly Running.
772+ If a Pod was previously stuck due to an out-of-date PV nodeAffinity,
773+ the user can now update the PV to correct the nodeAffinity and see the Pod eventually enter the Running state.
774+ For Pods stuck in ContainerCreating because the storage provider is unable to attach the volume to the scheduled node,
775+ the Pod will be rejected by the kubelet and re-created on the correct node.
776+ For Pods stuck in Pending because no suitable node is available,
777+ the scheduler will retry scheduling the Pod according to the updated nodeAffinity.
770778
771779###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
772780