Commit e315d6c

Address comments
1 parent d5181dc commit e315d6c

keps/sig-storage/5381-mutable-pv-affinity/README.md

Lines changed: 13 additions & 5 deletions
@@ -585,8 +585,10 @@ enhancement:

 This feature involves changes to the kubelet, and APIServer. But they are not strongly coupled.

-an n-3 kubelet will not able to fail the mis-scheduled pods. User can still manually delete the pods. Otherwise it should be fine.
-an new kubelt can also work with old APIServer. Although this should not happen.
+An n-3 kubelet will not able to fail the mis-scheduled pods. The mis-scheduled pods will stuck at ContainerCreating status.
+If the kubelet is upgraded afterwards, it will properly fail those pods.
+User can also manually delete the pods if they don't want to upgrade kubelet soon.
+If user does not actually update the PV nodeAffinity, there will be no such mis-scheduled pods and everything should be fine.

 kube-scheduler is not directly affected.
 It just read the latest PV nodeAffinity for scheduling decision regardless of whether it's being updated or not.
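
The hunk above turns on what a "mis-scheduled" pod is and what each kubelet version does with it. The following is a minimal Python sketch of that reasoning, not kubelet or scheduler code. The function names, the dict layout (which mirrors the shape of a PV's `spec.nodeAffinity`), and the zone values are assumptions made for illustration. It handles only the `In`, `NotIn`, `Exists`, and `DoesNotExist` operators and ignores `matchFields`.

```python
from typing import Dict, Optional


def expression_matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not handled in this sketch: {op}")


def node_matches_pv(labels: Dict[str, str], node_affinity: Optional[dict]) -> bool:
    """Terms are ORed; expressions inside one term are ANDed."""
    if node_affinity is None or node_affinity.get("required") is None:
        return True  # no constraint on the PV
    terms = node_affinity["required"].get("nodeSelectorTerms", [])
    return any(
        term.get("matchExpressions")  # a term without expressions matches nothing here
        and all(expression_matches(labels, e) for e in term["matchExpressions"])
        for term in terms
    )


def expected_pod_outcome(kubelet_supports_feature: bool, node_matches: bool) -> str:
    """Outcome described in the text above for a pod that uses the PV."""
    if node_matches:
        return "unaffected"  # no mis-scheduling, nothing changes
    if kubelet_supports_feature:
        return "failed by kubelet"  # an upgraded kubelet fails the pod
    return "stuck in ContainerCreating"  # an older (e.g. n-3) kubelet cannot fail it


# Hypothetical PV whose nodeAffinity was updated to zone-a.
pv_affinity = {
    "required": {
        "nodeSelectorTerms": [
            {"matchExpressions": [
                {"key": "topology.kubernetes.io/zone", "operator": "In", "values": ["zone-a"]}
            ]}
        ]
    }
}
node_labels = {"topology.kubernetes.io/zone": "zone-b"}
matches = node_matches_pv(node_labels, pv_affinity)
print(expected_pod_outcome(kubelet_supports_feature=False, node_matches=matches))
```

With an old kubelet and a node in a different zone, the sketch reports the pod as stuck, which is the case where the text suggests upgrading the kubelet or deleting the pod manually.
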
@@ -651,9 +653,10 @@ PV `spec.nodeAffinity` becomes mutable.
 If a pod being scheduled to a node that is incompatible with the PV's nodeAffinity, the pod will fail.
 Previously, it will be stuck at `ContainerCreating` status.

-This should be rare, since we don't allow PV nodeAffinity to be updated,
+This should be rare before enabling this feature, since we don't allow PV nodeAffinity to be updated,
 nor CSI driver can change the topology reported from NodeGetInfo.
 So this is only possible if the user edited the node labels manually, or is running an incompatible scheduler.
+Existing workflow will unlikely be affected by this behavior change.

 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

@@ -690,7 +693,7 @@ You can take a look at one potential example of such test in:
 https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
 -->

-Yes. unit test will verify the validation and kubelet behavior when the feature gate is enabled or disabled.
+Will add unit test to verify the validation and kubelet behavior when the feature gate is enabled or disabled.

 ### Rollout, Upgrade and Rollback Planning

@@ -766,7 +769,12 @@ and operation of this feature.
 Recall that end users cannot usually observe component logs or access metrics.
 -->

-See a previously Pending or ContainerCreating Pod now properly Running.
+If a Pod is previously stuck due to out-of-date PV nodeAffinity,
+now user can update the PV to correct the nodeAffinity, and see the Pod entering Running state eventually.
+For Pods stuck in ContainerCreating due to storage provider unable to attach the volume to the scheduled node,
+The Pod will be rejected by kubelet and re-created at the correct node.
+For Pods stuck in Pending due to no suitable node available,
+scheduler will retry to schedule the Pod according to the updated nodeAffinity.

 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
