
Minikube to host communication not working on Fedora 37 #15573

Closed · mnk opened this issue Jan 2, 2023 · 23 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@mnk commented Jan 2, 2023

What Happened?

There seems to be a difference in minikube iptables rules when comparing a fully updated Fedora 36 and Fedora 37 system.
On Fedora 36:

$ sudo iptables -t nat -S|grep -e '--to-destination 127.0.0.11'
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:39397
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:34196

On Fedora 37:

$ sudo iptables -t nat -S|grep -e '--to-destination 127.0.0.11'
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -j DNAT --to-destination 127.0.0.11:46739
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -j DNAT --to-destination 127.0.0.11:37392

The missing --dport 53 match on the destination NAT rules breaks all non-DNS communication between the host and minikube.
What might be causing this difference?
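
A quick way to check whether a given machine is affected (just an illustrative check, not part of the repro above) is to look for the port match on the node's DNAT rules:

$ minikube ssh -- sudo iptables -t nat -S DOCKER_OUTPUT | grep -e '--to-destination 127.0.0.11'
# healthy:  -A DOCKER_OUTPUT ... -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:<port>
# affected: -A DOCKER_OUTPUT ... -p tcp -j DNAT --to-destination 127.0.0.11:<port>
# Without the --dport 53 match, every TCP/UDP packet sent from the node to 192.168.49.1
# is redirected to Docker's embedded DNS resolver, which is why only DNS keeps working.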

Attach the log file

log.txt

Operating System

Redhat/Fedora

Driver

Docker

@mnk (Author) commented Jan 4, 2023

Full output of minikube ssh sudo iptables-save:
iptables-save-f37.txt
iptables-save-f36.txt

@aojea (Member) commented Jan 4, 2023

hmm, do we know if it was docker that dropped the port?

I've seen a recent report in kubernetes about some weird iptables rules being mutated on CentOS

@prezha (Contributor) commented Jan 5, 2023

@mnk thanks for reporting this issue

there are a couple of improvements i'm currently working on in draft pr #15463

could you please try with https://storage.googleapis.com/minikube-builds/15463/minikube-linux-amd64 and let us know if it works for you

if not, could you pull that pr and run:

make integration -e TEST_ARGS="-minikube-start-args='--driver=docker --container-runtime=docker --alsologtostderr -v=7' -test.run TestNetworkPlugins --cleanup=true"

then share the whole output you get

i've tried to replicate your setup (ie, fresh fedora 37 install [in kvm] + docker) and all the tests above passed for me, so i'm curious to know if it would work for you

@mnk (Author) commented Jan 6, 2023

@prezha , I tried the minikube build you linked to, but I still get the same result - no --dport 53 condition on the DNAT rule. Do I need to specify a base image or just do a minikube-linux-amd64 start?

Do you get the --dport 53 condition when testing with your branch?

My use-case can be tested by starting sshd on the host and then doing minikube ssh ssh $(id -un)@192.168.49.1. This works fine on Fedora 36 but not on Fedora 37.
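
Spelled out (assuming sshd on the host is managed by systemd), the test is just:

# on the host
$ sudo systemctl start sshd
$ minikube ssh ssh $(id -un)@192.168.49.1   # succeeds on Fedora 36, fails on Fedora 37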

@BenTheElder (Member) commented Jan 6, 2023

@aojea as mentioned on the PR, I also see no --dport currently, on gLinux (~debian):

$ docker run -d --entrypoint=sleep --network=kind --privileged --name=aaa kindest/node:v1.25.3 infinity
07cc1460e1bf62a936e33775efdda0fbce577634eb06b07dcdd267bd855f9248

$ docker exec --privileged aaa iptables-save
# Generated by iptables-save v1.8.7 on Wed Jan  4 22:07:38 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -j DNAT --to-destination 127.0.0.11:44029
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -j DNAT --to-destination 127.0.0.11:33690
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -j SNAT --to-source :53
COMMIT

(NOTE: this skips the entrypoint logic etc.; the point is to debug purely what docker is doing on its own with the embedded DNS rules)

@BenTheElder (Member) commented:

Docker might actually have dropped the dport themselves at some point, given that they typically don't expect other traffic on 127.0.0.11, but in that case I'd argue this is a bug on their end and we should fix it there.

I did a bit of digging and haven't turned up anything though, and in #15578 (comment) it appears that the docker and containerd versions are identical but the iptables versions are different.

@prezha (Contributor) commented Jan 6, 2023

@mnk i was wrong - haven't read your initial problem statement carefully, so was jumping to a conclusion; sorry about that!

i've looked at it a bit and i think that the problem is in using iptables-nft (which i think is default for fedora37) instead of iptables-legacy (i also remember reading in one of the kubernetes issues recently that nft is not [yet] supported, but i don't have a reference at hand atm)

i suggest you try with iptables-legacy instead - here's what i did:

$ sudo dnf install iptables-legacy

$ sudo update-alternatives --config iptables => select '/usr/sbin/iptables-legacy'

$ iptables --version
iptables v1.8.8 (legacy)

$ sudo reboot

$ minikube start
...

$ minikube ssh -- sudo iptables-save
# Generated by iptables-save v1.8.4 on Fri Jan  6 22:37:48 2023
*nat
...
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:41653
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:53598
...

$ minikube ssh ssh $(id -un)@192.168.49.1
The authenticity of host '192.168.49.1 (192.168.49.1)' can't be established.
ECDSA key fingerprint is SHA256:Y8jQ23KJ8Oy+H2e9eXDpttirqcg42g7HVtg4ZFjVgHM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.49.1' (ECDSA) to the list of known hosts.
prezha@192.168.49.1's password: 
Web console: https://localhost:9090/ or https://192.168.122.198:9090/

Last login: Fri Jan  6 22:35:54 2023 from 192.168.122.1

@prezha (Contributor) commented Jan 6, 2023

btw, i've also compared the docker versions that are installed on a fresh ubuntu 20.04.5 (where iptables-legacy is the default and this is working) and fedora 37 (where iptables-nft is the default and this apparently is not working), and they use identical versions/commits:

ubuntu 20.04.5:

prezha@minikube-test:~$ docker version
Client: Docker Engine - Community
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:28:08 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:25:58 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

fedora37:

[prezha@localhost ~]$ docker version
Client: Docker Engine - Community
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:28:45 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:26:25 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

@prezha (Contributor) commented Jan 6, 2023

looks like there were some efforts to support automatic iptables "mode" detection (kubernetes-sigs/iptables-wrappers#3), but it seems that it's not working correctly in this case, and this comment from Tim Hockin is a bit old but perhaps still relevant

@prezha (Contributor) commented Jan 6, 2023

looks like minikube user(s) reported an identical problem earlier: #14631 (comment), where the original problem also refers to iptables-nft

so, if this switch to iptables-legacy is a working solution for @mnk, perhaps we add detection in minikube so if we see iptables-nft, we warn the user that "things might not work as expected"
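
a minimal sketch of what such a check could look like (hypothetical, not actual minikube code), relying on iptables --version reporting the backend in parentheses:

# hypothetical pre-flight check: warn when the host iptables binary is the nf_tables variant
$ if iptables --version | grep -q nf_tables; then
    echo "iptables is in nf_tables mode; minikube networking might not work as expected" >&2
  fi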

@BenTheElder (Member) commented:

looks like there were some efforts to support automatic iptables "mode" detection (kubernetes-sigs/iptables-wrappers#3), but it seems that it's not working correctly in this case and this comment from Tim Hockin is a bit old but perhaps still relevant

That's not quite related. That change in Kubernetes is about automatic mode detection in the kube-proxy image, in a "normal" environment, by looking for rules set up by kubelet on the host. In our case we instead detect based on the rules docker injects for the embedded DNS resolver, using either the host's iptables legacy or nf_tables backend.
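
(For context: that detection is roughly of the following shape; this is a paraphrased sketch, not the literal entrypoint code. The idea is to pick whichever backend already contains Docker's embedded-DNS rules.)

$ num_legacy=$(iptables-legacy-save -t nat 2>/dev/null | grep -c DOCKER_OUTPUT)
$ num_nft=$(iptables-nft-save -t nat 2>/dev/null | grep -c DOCKER_OUTPUT)
$ if [ "${num_legacy}" -gt "${num_nft}" ]; then echo "legacy"; else echo "nf_tables"; fi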

That tweet predates the detection logic entirely, which is itself a workaround for the problem of there not being a stable interface and distros switching between the two binaries / backends.

The upstream KIND entrypoint has the same detection logic as kube-proxy (prior to the trick looking at kubelet generated rules specifically).

My host really is using nf_tables 1.8.8 and I don't see --dport using either backend in the "node". So the issue is not mismatched nf_tables vs legacy; the problem is 1.8.7 vs 1.8.8 nf_tables.

And now that I've typed that, kubernetes/kubernetes#112477 is paging back into memory 🙃

$ minikube ssh -- sudo iptables-save
# Generated by iptables-save v1.8.4 on Fri Jan  6 22:37:48 2023

1.8.4 is really old, so that's a different problem for minikube specifically.

In both cases, 1.8.8 on the host is currently a problem. kube-proxy in Kubernetes 1.26 is also on 1.8.7 so updating to 1.8.8 in kind/minikube is probably not sufficient yet.

Downgrading to 1.8.7 on the host is one workaround. Switching to legacy mode is another. Both work around the mismatched versions rather than legacy vs nf_tables detection.

@BenTheElder (Member) commented Jan 7, 2023

This is also just one incompatibility between 1.8.8 and 1.8.7; we're going to have more problems when we do upgrade kubernetes/kind/... to > 1.8.7 (see the above issue with --mark)

@prezha (Contributor) commented Jan 7, 2023

thanks for sharing the details @BenTheElder !
that's an interesting conversation between Dan (kubernetes) and Phil (netfilter)
so, iptables-nft v1.8.8 introduced a breaking change, and there are no plans to "fix" that, and the workaround atm is to:

  • stick with v1.8.7 (that's also used in kube-proxy) - both nft and legacy mode should work, or
  • use iptables (even v1.8.8) in legacy mode (that some linux distros still keep as default)

looks like the full transition from legacy to nft is going to be fun and not so quick

@mnk (Author) commented Jan 8, 2023

Yes, as @BenTheElder mentions, this indeed seems to be another case of incompatibility between 1.8.8 and 1.8.7. Without involving minikube, the problem can be seen by:

$ docker network create --driver bridge test-net
$ docker run -it --privileged --network test-net fedora:37 bash
$ dnf install iptables-nft nftables
$ iptables-nft-save 
# Generated by iptables-nft-save v1.8.8 (nf_tables) on Sat Jan  7 17:44:56 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:41759
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:43231
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 41759 -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 43231 -j SNAT --to-source :53
COMMIT
$ exit
$ docker run -it --privileged --network test-net fedora:36 bash
$ dnf install iptables-nft nftables
$ iptables-nft-save 
# Generated by iptables-nft-save v1.8.7 on Sat Jan  7 18:31:05 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -j DNAT --to-destination 127.0.0.11:38683
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -j DNAT --to-destination 127.0.0.11:57275
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -j SNAT --to-source :53
COMMIT

Both 1.8.8 and 1.8.7 do, however, produce similar output from nft list ruleset:

table ip nat {
	chain DOCKER_OUTPUT {
		ip daddr 127.0.0.11 tcp dport 53 counter packets 0 bytes 0 dnat to 127.0.0.11:38683
		ip daddr 127.0.0.11 udp dport 53 counter packets 35 bytes 2730 dnat to 127.0.0.11:57275
	}
...

I guess that means that the docker rules are fine until minikube starts patching them with iptables-save | iptables-restore?

Would it be possible for minikube/kind to just remove the docker rules and then create the needed rules from scratch?
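
Purely as an illustration of what "recreate from scratch" could look like inside the node (hypothetical, not what minikube/kind actually do): flush Docker's chain and re-add port-53-only rules, keeping the resolver ports the existing rules point at (46739/37392 in the Fedora 37 output above; a real implementation would have to parse them out of the existing rules first):

$ sudo iptables -t nat -F DOCKER_OUTPUT
$ sudo iptables -t nat -A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:46739
$ sudo iptables -t nat -A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:37392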

@aojea (Member) commented Jan 8, 2023

This works fine (on my VM) 😄 kubernetes-sigs/kind#3059

@BenTheElder (Member) commented:

I guess that means that the docker rules are fine until minikube starts patching them with iptables-save | iptables-restore?
Would it be possible for minikube/kind to just remove the docker rules and then create the needed rules from scratch?

So KIND doesn't use save+restore itself, but kube-proxy in Kubernetes works this way for good reasons (reconciling a large set of rules) and there's a third version of iptables in the kube-proxy image. I suspect the same for minikube.

On your host with minikube it's 1.8.8 nf_tables, 1.8.4 ?, and then 1.8.7 ? (kube-proxy in 1.25). In KIND it will be 1.8.8 nf_tables (host), 1.8.7 nf_tables (node), 1.8.7 nf_tables (kube-proxy). There's actually a fourth for CNI ... but that's generally kube-proxy matching more or less.

Discussing with @aojea and kubernetes-sigs/kind#3059 how we might work around this.

One thought is multi-version selecting the binaries and attempting to detect what the host is using.

Another is the workaround in kubernetes-sigs/kind#3059 combined with making sure at least kube-proxy + kindnetd + kind node match. Which is a trickier proposition for additional CNIs minikube may support.

@BenTheElder (Member) commented Apr 7, 2023

So KIND doesn't use save+restore itself, but kube-proxy in Kubernetes works this way for good reasons (reconciling a large set of rules) and there's a third version of iptables in the kube-proxy image. I suspect the same for minikube.

Closing the loop: that's completely backwards :-)

kind does iptables-save | sed | iptables-restore to modify the docker dns rules, and that part is included in the kicbase image.

kube-proxy avoids save | mutate | restore
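
Roughly, that rewrite is a pipeline of the following shape (paraphrased from memory, so treat the exact sed expressions as an approximation rather than the shipped entrypoint code): point the docker embedded-DNS rules at the network gateway so pods can use it as their DNS server, and mirror the OUTPUT rule into PREROUTING for forwarded traffic.

$ iptables-save \
    | sed -e 's/-d 127.0.0.11/-d 192.168.49.1/g' \
          -e 's/-A OUTPUT \(.*\) -j DOCKER_OUTPUT/&\n-A PREROUTING \1 -j DOCKER_OUTPUT/' \
          -e 's/--to-source :53/--to-source 192.168.49.1:53/' \
    | iptables-restore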

@BenTheElder (Member) commented:

kubernetes-sigs/kind#3054 tracks the KIND fix which wound up compromising on ~#15578

@linux019 commented Jun 5, 2023

#14631 (comment)
this one may help

@k8s-triage-robot commented:

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) on Jan 21, 2024
@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 20, 2024
@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) on Mar 21, 2024
@k8s-ci-robot (Contributor) commented:

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
