Keep a freshly vended spoke from deadlocking on IPs, arch, and Karpenter #84
Merged
A freshly vended dev spoke (2× m7g.large bootstrap nodes) syncing the full addon catalog deadlocked: pods stuck ContainerCreating with "cilium-cni: no IPs available", and Karpenter, stranded on a saturated bootstrap node with no DNS, couldn't reach the EC2 API to launch the nodes that would have relieved the pressure.
Three independent root causes, each fixed here:
Cilium ENI IP cap. ENI mode hands out single secondary IPs, capping m7g.large at ~35 pods, below what the catalog needs on the bootstrap nodes before Karpenter scales out. Enable awsEnablePrefixDelegation so each ENI carries /28 prefixes instead, lifting the cap ~4x (~110) and removing the IP-exhaustion that also starved CoreDNS and Karpenter.
NodePool architecture. Both the default and sandbox NodePools required kubernetes.io/arch In [amd64], but Graviton/arm64 is the org default: the bootstrap nodes are m7g and the agent/sandbox images are arm64. An amd64 node provisioned here would exec-format-crash the arm64 pods scheduled onto it. Pin both pools to arm64.
Karpenter priority. Karpenter is the only thing that can relieve a saturated cluster, so it must never be the pod that gets evicted or stranded. Give the controller priorityClassName system-cluster-critical so the scheduler preempts lower-priority pods to keep it running.
Prefix-delegation removes the IP-exhaustion root cause directly; a separate smoke/full addon profile (skip the heavy optional catalog on dev spokes entirely) is tracked as a follow-up.
Claude-Session: https://claude.ai/code/session_01R6rXpE1FZAVS14zanDdgb7
CI Results
All checks passed.
Symptom
A freshly vended dev spoke (2× m7g.large bootstrap nodes) syncing the full addon catalog deadlocked:
Pods stuck in ContainerCreating: "cilium-cni: no IPs available"
Three root causes, each fixed
Cilium ENI IP cap. ENI mode hands out single secondary IPs, capping m7g.large at ~35 pods, below what the catalog needs on the bootstrap nodes before Karpenter scales out. eni.awsEnablePrefixDelegation: true gives each ENI /28 prefixes instead, lifting the cap ~4x (~110) and removing the IP-exhaustion that also starved CoreDNS and Karpenter.
NodePool architecture. Both the default and sandbox NodePools required kubernetes.io/arch In [amd64], but Graviton/arm64 is the org default: the bootstrap nodes are m7g and the agent/sandbox images are arm64. An amd64 node provisioned here would exec-format-crash the arm64 pods scheduled onto it. Both pools now pin arm64.
Karpenter priority. Karpenter is the only thing that can relieve a saturated cluster, so it must never be the pod that gets evicted or stranded. The controller now carries priorityClassName: system-cluster-critical.
Scope
Prefix-delegation removes the IP-exhaustion root cause directly. A separate smoke/full addon profile (skip the heavy optional catalog on dev spokes entirely — and the appset split + landing-zone label it needs) is tracked in #83.
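As a rough illustration of the prefix-delegation change, the Helm values fragment below shows where eni.awsEnablePrefixDelegation sits in a Cilium ENI-mode configuration. Only that key comes from this PR; the surrounding ipam and eni.enabled settings and the file layout are assumptions about how the spoke configures Cilium, not copied from the diff.

```yaml
# Sketch of a Cilium Helm values fragment (layout assumed, not taken from the diff).
ipam:
  mode: eni                        # assumption: the spoke already runs Cilium in ENI IPAM mode
eni:
  enabled: true                    # assumption: existing setting
  awsEnablePrefixDelegation: true  # the change named in this PR: allocate /28 prefixes per ENI slot
```

A /28 prefix spans 16 addresses, so an ENI slot that previously held one secondary IP can back up to 16 pod IPs. The ~110 figure quoted above is this PR's own estimate for m7g.large.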
task validate green (yaml lint + all kustomize overlays build; both NodePools resolve arm64).
Closes #82
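For reference, the arm64 pin that the validation step checks could look like the NodePool manifest below. It is a sketch: the apiVersion, the pool name, and the nodeClassRef target are assumptions for illustration. Only the kubernetes.io/arch requirement reflects what this PR describes.

```yaml
# Sketch of a Karpenter NodePool pinned to arm64 (names and API version assumed).
apiVersion: karpenter.sh/v1        # assumption: the repo may use a different Karpenter API version
kind: NodePool
metadata:
  name: default                    # the PR names "default" and "sandbox" pools
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # hypothetical EC2NodeClass name
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]        # previously ["amd64"], per the PR description
```

The same requirement would apply to the sandbox pool, so Karpenter only launches Graviton nodes that can run the arm64 agent and sandbox images.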
https://claude.ai/code/session_01R6rXpE1FZAVS14zanDdgb7
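A minimal sketch of the Karpenter priority change, written as a Deployment patch. The Deployment name and namespace are hypothetical, and this PR may set the value through the chart's Helm values instead. The only part taken from the PR is priorityClassName: system-cluster-critical.

```yaml
# Sketch of a strategic-merge patch for the Karpenter controller (name/namespace assumed).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karpenter                  # hypothetical Deployment name
  namespace: kube-system           # hypothetical namespace
spec:
  template:
    spec:
      # Built-in PriorityClass; lets the scheduler preempt lower-priority pods
      # so the controller can still be placed on a saturated cluster.
      priorityClassName: system-cluster-critical
```

system-cluster-critical is a built-in Kubernetes PriorityClass, so no extra PriorityClass manifest is needed alongside the patch.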