Commit b2883d1

Review comments, modified criteria.
1 parent 8407130 commit b2883d1

1 file changed: +5 -5 lines changed

keps/sig-network/20190424-NodeLocalDNS-beta-proposal.md

Lines changed: 5 additions & 5 deletions
@@ -72,7 +72,7 @@ Here are some failure modes for the node-cache pod:
 3) OOMKilled - node-local-dns gets OOMKilled due to its own memory usage or some other component using up all memory resources on the node. There is a chance this will cause other disruptions on the node in addition to DNS downtime though.
 4) Upgrades to node-local-dns daemonset - There will be DNS downtime when node-local-dns pods shut down, until the new pods are up and running.
 
-We are proposing a solution that will help in all these cases.
+We are proposing a solution that will help in all these cases. For beta, we will start providing enablement for HA, full implementation will be a GA criterion.
 
 The proposal here is to use an additional listen IP for node-local-dns pod. The node-local-dns pod listens on the 169.254.20.10 IP address today. We will extend node-local-dns to listen on the kube-dns service IP as well. Requests to kube-dns service IP will be handled by node-local-dns pod when it is up. If it is unavailable, the requests will go to kube-dns endpoints instead. The determination of whether node-local-dns service is available will be done by an external component - This could be a new daemonset or new functionality in an existing daemonset that manages networking.
 
@@ -139,7 +139,7 @@ The benefits of this approach are:
 
 3) Disabling node-local-dns does not require any kubelet change either.
 
-We still need some component to dynamically determine when to use node-local-dns and when to flip to kube-dns endpoints. This logic can be separated out into an independent container/pod whose function is to query for dns records on 169.254.20.10:53 and follow some threshold to either install or remove the NOTRACK rules. This can be a new Daemonset or combined into an existing Daemonset that is in HostNetwork mode and manages iptables rules in some way - for instance a CNI Daemonset.
+We still need some component to dynamically determine when to use node-local-dns and when to flip to kube-dns endpoints. This logic can be separated out into an independent container/pod whose function is to query for dns records on 169.254.20.10:53 and follow some threshold to either install or remove the NOTRACK rules. This can be a new Daemonset or combined into an existing Daemonset that is in HostNetwork mode and manages iptables rules in some way - for instance a CNI Daemonset. This component will handle adding all iptables rules needed for node-local-dns.
 
 The caveat of this approach is that it only works in the iptables implementation of kube-proxy.
 Another observation is that the upstream dns server IP used by node-local-dns will differ from one setup to another since it is a dynamically allocated service IP. This doesn't appear to be a major concern.
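
Reviewer sketch (not part of the commit): the hunk above describes a component that probes 169.254.20.10:53 and follows "some threshold" to install or remove the NOTRACK rules. The threshold logic might look roughly like the Python below. The class name, the `replay` helper, and both threshold values are hypothetical, since the KEP does not specify them. The DNS probe itself and the iptables calls are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class FailoverTracker:
    """Decides when to flip between node-local-dns and kube-dns endpoints.

    Both thresholds are assumed values for illustration; the KEP only says
    the component follows "some threshold".
    """
    fail_threshold: int = 3     # consecutive failed probes before removing NOTRACK rules
    success_threshold: int = 2  # consecutive good probes before reinstalling them
    using_local: bool = True    # True while NOTRACK rules are assumed installed
    _fails: int = 0
    _successes: int = 0

    def observe(self, probe_ok: bool) -> str:
        """Record one probe result and return 'install', 'remove', or 'none'."""
        if probe_ok:
            self._successes += 1
            self._fails = 0
            if not self.using_local and self._successes >= self.success_threshold:
                self.using_local = True
                return "install"
        else:
            self._fails += 1
            self._successes = 0
            if self.using_local and self._fails >= self.fail_threshold:
                self.using_local = False
                return "remove"
        return "none"


def replay(results):
    """Feed a sequence of probe outcomes to a fresh tracker and collect its decisions."""
    tracker = FailoverTracker()
    return [tracker.observe(ok) for ok in results]


if __name__ == "__main__":
    # Three failures in a row trigger removal; two later successes trigger reinstall.
    print(replay([False, False, False, True, True]))
```

Requiring consecutive results in both directions keeps a single dropped probe from flapping the rules. Whether the real component should use counts, time windows, or something else is left open by the proposal.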
@@ -151,22 +151,22 @@ Another observation is that the upstream dns server IP used by node-local-dns wi
 - [coredns-performance-nodecache](https://k8s-testgrid.appspot.com/sig-network-gce#gce-coredns-performance-nodecache)
 - [kube-dns-nodecache](https://k8s-testgrid.appspot.com/sig-network-gce#gci-gce-kube-dns-nodecache)
 
-* An additional automated test to verify the failover behavior will be a nice-to-have for beta.
 
 ### Graduation Criteria
 
-In order to graduate to beta,we need:
+In order to graduate to beta, we need:
 
-* HA implementation for NodeLocal DNSCache that covers upgrade scenario.
 * Lock down the node-local-dns configmap so that Corefile cannot be modified directly.
 
+* Enablement of HA for NodeLocal DNSCache. With this support, the iptables rules management can be seperated out to a different component.
 
 ### Alternatives
 
 One suggestion for HA that has come up is to list multiple nameservers in the client pods' /etc/resolv.conf - both node-local-dns IP as well as kube-dns service IP.
 This is not recommended because the behavior is inconsistent depending on the client library. glibc 2.16+ and musl implementations send queries in parallel to both nameservers, so if we use both kube-dns IP as well as the link-local IP used by NodeLocal DNSCache, we could make the DNS query explosion problem worse. More queries means more conntrack entries and more DNATs.
 This workaround could be viable for client implementations that do round-robin.
 
+Running 2 daemonsets of node-local-dns using the same listenIP - 169.254.20.10 via SO_REUSEPORT option. Upgrades will be done one daemonset at a time.
 
 ## Implementation History
 
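
Reviewer sketch (not part of the commit): the alternative added in the last hunk relies on SO_REUSEPORT letting two processes bind the same listen address. A minimal Python illustration of that socket behavior on Linux follows. It uses loopback and an ephemeral UDP port instead of 169.254.20.10:53 so that it runs unprivileged. The helper names are hypothetical, and this says nothing about how node-local-dns itself would be configured.

```python
import socket


def bind_reuseport(addr: str, port: int) -> socket.socket:
    """Open a UDP socket with SO_REUSEPORT set before binding."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Every socket sharing the address must set the option before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((addr, port))
    return sock


def two_listeners_share_port() -> bool:
    """Bind two sockets to the same address and port; report whether both succeeded."""
    first = bind_reuseport("127.0.0.1", 0)  # port 0: let the kernel pick a free port
    try:
        port = first.getsockname()[1]
        second = bind_reuseport("127.0.0.1", port)
        try:
            return first.getsockname() == second.getsockname()
        finally:
            second.close()
    finally:
        first.close()


if __name__ == "__main__":
    print(two_listeners_share_port())
```

With both sockets open, the kernel distributes incoming datagrams between them, which is what would allow one daemonset to keep answering while the other is being upgraded.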
