+++
title = "AWS Placement Group Node Feature Discovery"
+++

The AWS placement group NFD (Node Feature Discovery) customization automatically discovers and labels nodes with their placement group information, enabling workload scheduling based on placement group characteristics.

This customization will be available when the
[provider-specific cluster configuration patch]({{< ref "..">}}) is included in the `ClusterClass`.

## What is Placement Group NFD?

Placement Group NFD automatically discovers the placement group information for each node and creates node labels that can be used for workload scheduling. This enables:

- **Workload Affinity**: Schedule pods on nodes within the same placement group for low latency
- **Fault Isolation**: Schedule critical workloads on nodes in different placement groups
- **Resource Optimization**: Use placement group labels for advanced scheduling strategies

## How it Works

The NFD customization:

1. **Deploys a Discovery Script**: Automatically installs a script on each node that queries AWS metadata
2. **Queries AWS Metadata**: Uses EC2 instance metadata to discover placement group information
3. **Creates Node Labels**: Generates Kubernetes node labels with placement group details
4. **Updates Continuously**: Refreshes labels as nodes are added or moved

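The exact script deployed by the customization is not reproduced here, but conceptually it resembles the following minimal sketch. The IMDS endpoints are real EC2 metadata paths and the feature-file location matches the one referenced under Troubleshooting below; everything else (error handling, scheduling of reruns) is illustrative:

```bash
#!/bin/bash
# Illustrative sketch only -- not the exact script the customization installs.
# Queries EC2 instance metadata (IMDSv2) and writes an NFD feature file.

# Request a short-lived IMDSv2 session token.
TOKEN=$(curl -sf -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")

# Placement group name (404/empty if the instance is not in a placement group).
PG=$(curl -sf -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  "http://169.254.169.254/latest/meta-data/placement/group-name")

# Partition number (only present for partition placement groups).
PARTITION=$(curl -sf -H "X-aws-ec2-metadata-token: ${TOKEN}" \
  "http://169.254.169.254/latest/meta-data/placement/partition-number")

# NFD's local source reads key=value pairs from files under features.d/
# and prefixes the keys with feature.node.kubernetes.io/ when labeling.
FEATURE_FILE=/etc/kubernetes/node-feature-discovery/features.d/placementgroup
[ -n "${PG}" ] && echo "aws-placement-group=${PG}" > "${FEATURE_FILE}"
[ -n "${PARTITION}" ] && echo "partition=${PARTITION}" >> "${FEATURE_FILE}"
```
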
## Generated Node Labels

The NFD customization creates the following node labels:

| Label | Description | Example |
|-------|-------------|---------|
| `feature.node.kubernetes.io/aws-placement-group` | The name of the placement group | `my-cluster-pg` |
| `feature.node.kubernetes.io/partition` | The partition number (for partition placement groups) | `0`, `1`, `2` |

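For a quick overview, these labels can be surfaced as columns with `kubectl get nodes -L`:

```bash
# Show the placement group labels as columns in the node listing.
kubectl get nodes \
  -L feature.node.kubernetes.io/aws-placement-group \
  -L feature.node.kubernetes.io/partition
```
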
## Configuration

The placement group NFD customization is automatically enabled when a placement group is configured. No additional configuration is required.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <NAME>
spec:
  topology:
    variables:
      - name: clusterConfig
        value:
          controlPlane:
            aws:
              placementGroup:
                name: "control-plane-pg"
      - name: workerConfig
        value:
          aws:
            placementGroup:
              name: "worker-pg"
```

## Usage Examples

### Workload Affinity

Schedule pods on nodes within the same placement group for low latency:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-performance-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: high-performance-app
  template:
    metadata:
      labels:
        app: high-performance-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: feature.node.kubernetes.io/aws-placement-group
                operator: In
                values: ["worker-pg"]
      containers:
      - name: app
        image: my-app:latest
```
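
With `requiredDuringSchedulingIgnoredDuringExecution`, the placement group label is a hard constraint: pods remain `Pending` if no node in `worker-pg` has capacity. If co-location is desirable but not mandatory, use `preferredDuringSchedulingIgnoredDuringExecution` instead.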

### Fault Isolation

Distribute critical workloads across different placement groups:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-app
spec:
  # Required anti-affinity over the placement group topology means each
  # replica must land in a distinct placement group, so the cluster needs
  # at least as many placement groups as replicas.
  replicas: 6
  selector:
    matchLabels:
      app: critical-app
  template:
    metadata:
      labels:
        app: critical-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values: ["critical-app"]
            topologyKey: feature.node.kubernetes.io/aws-placement-group
      containers:
      - name: app
        image: critical-app:latest
```

### Partition-Aware Scheduling

For partition placement groups, schedule workloads on specific partitions:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: distributed-database
spec:
  serviceName: distributed-database  # headless Service required by StatefulSets
  replicas: 3
  selector:
    matchLabels:
      app: distributed-database
  template:
    metadata:
      labels:
        app: distributed-database
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: feature.node.kubernetes.io/partition
                operator: In
                values: ["0", "1", "2"]
      containers:
      - name: database
        image: my-database:latest
```

## Verification

You can verify that the NFD labels are working by checking the node labels:

```bash
# Check all nodes and their placement group labels
kubectl get nodes --show-labels | grep placement-group

# Check specific node labels
kubectl describe node <node-name> | grep placement-group

# Check partition labels
kubectl get nodes --show-labels | grep partition
```

## Troubleshooting

### Check NFD Script Status

Verify that the discovery script is running:

```bash
# Check if the script exists on nodes
kubectl debug node/<node-name> -it --image=busybox -- chroot /host ls -la /etc/kubernetes/node-feature-discovery/source.d/

# Check the feature file the script generated
kubectl debug node/<node-name> -it --image=busybox -- chroot /host cat /etc/kubernetes/node-feature-discovery/features.d/placementgroup
```
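
If the feature file exists but the labels do not appear, the NFD worker logs can help. The namespace and label selector below assume a default NFD deployment and may differ in your cluster:

```bash
# Tail NFD worker logs (namespace and selector are assumptions for a
# default NFD install; adjust to match your deployment).
kubectl logs -n node-feature-discovery \
  -l app.kubernetes.io/name=node-feature-discovery --tail=50
```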

## Integration with Other Features

Placement Group NFD works seamlessly with:

- **Pod Affinity/Anti-Affinity**: Use placement group labels for advanced scheduling
- **Topology Spread Constraints**: Distribute workloads across placement groups, as shown in the sketch below
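
For example, a topology spread constraint can balance replicas evenly across placement groups. This is a minimal sketch; the `spread-app` name and image are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spread-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: spread-app
  template:
    metadata:
      labels:
        app: spread-app
    spec:
      topologySpreadConstraints:
      # Keep the replica count per placement group within maxSkew of
      # each other; refuse to schedule rather than violate the spread.
      - maxSkew: 1
        topologyKey: feature.node.kubernetes.io/aws-placement-group
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: spread-app
      containers:
      - name: app
        image: spread-app:latest
```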

## Security Considerations

- The discovery script queries AWS instance metadata (IMDSv2)
- No additional IAM permissions are required beyond standard node permissions
- Labels are automatically managed and do not require manual intervention
- The script runs with appropriate permissions and security context