
k3s is not working with IPv6 or dualstack setting #3578

Closed
frankui opened this issue Jul 6, 2021 · 7 comments

frankui commented Jul 6, 2021

Environmental Info:
K3s Version:

k3s version v1.21.2+k3s1 (5a67e8d)
go version go1.16.4

Node(s) CPU architecture, OS, and Version:

Linux ipv6invv.local 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

1 server

IPv6 single-stack k3s system service configuration:

mystic@ipv6invv:~/k3s_install$ cat /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=servicelb,traefik --disable-agent --disable-network-policy --cluster-cidr fd00:db8:0:0:0:0:1::/112 --service-cidr fd00:db8:0:0:0:0:2::/112 --node-ip 2001:7788:9999::254
mystic@ipv6invv:~/k3s_install$

Dual-stack k3s system service configuration:

mystic@ipv6invv:~/k3s_install$ cat /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=servicelb,traefik --disable-agent --disable-network-policy --cluster-cidr 10.42.0.0/16,fd00:db8:0:0:0:0:1::/112 --service-cidr 10.43.0.0/16,fd00:db8:0:0:0:0:2::/112 --node-ip 2001:7788:9999::254
mystic@ipv6invv:~/k3s_install$

Describe the bug:

For IPv6 single-stack, the k3s service fails to become active, with the log below (cannot configure IPv4 cluster-cidr: no IPv4 CIDRs found):

...
Jul 06 18:03:39 ipv6invv.local systemd[1]: Starting Lightweight Kubernetes...
-- Subject: A start job for unit k3s.service has begun execution
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A start job for unit k3s.service has begun execution.
--
-- The job identifier is 466684.
Jul 06 18:03:39 ipv6invv.local sh[526668]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Jul 06 18:03:39 ipv6invv.local sh[526669]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Jul 06 18:03:39 ipv6invv.local k3s[526672]: time="2021-07-06T18:03:39.317067518+08:00" level=fatal msg="cannot configure IPv4 cluster-cidr: no IPv4 CIDRs found"
Jul 06 18:03:39 ipv6invv.local systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
...

For dual-stack, the k3s systemd service becomes active, but the process crashes with the log below (Controller: Invalid --cluster-cidr, mask size of cluster CIDR must be less than or equal to --node-cidr-mask-size configured for CIDR family):

...
Jul 06 18:08:09 ipv6invv.local k3s[527571]: F0706 18:08:09.958990  527571 node_ipam_controller.go:110] Controller: Invalid --cluster-cidr, mask size of cluster CIDR must be less than or equal to --node-cidr-mask-size configured for CIDR family
Jul 06 18:08:10 ipv6invv.local k3s[527571]: goroutine 4608 [running]:
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.stacks(0xc000128001, 0xc00f208680, 0xc8, 0x191)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:1021 +0xb9
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.(*loggingT).output(0x7c10040, 0xc000000003, 0x0, 0x0, 0xc0079c6000, 0x641a308, 0x17, 0x6e, 0x40fb00)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:970 +0x191
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.(*loggingT).printDepth(0x7c10040, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0x1, 0xc007251a40, 0x1, 0x1)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:733 +0x16f
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.(*loggingT).print(...)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:715
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/klog/v2.Fatal(...)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/v2/klog.go:1489
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/controller/nodeipam.NewNodeIpamController(0x53dee10, 0xc000de3a10, 0x0, 0x0, 0x5493698, 0xc009151a20, 0xc007237910, 0x2, 0x2, 0x0, ...)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/controller/nodeipam/node_ipam_controller.go:110 +0x37b
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app.startNodeIpamController(0x5446100, 0xc010c40600, 0x5482da8, 0xc008cb44b0, 0x53e0f08, 0xc001404220, 0x0, 0x0, 0x0, 0x0, ...)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/core.go:178 +0x3a5
Jul 06 18:08:10 ipv6invv.local k3s[527571]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app.StartControllers(0x5446100, 0xc010c40600, 0x5482da8, 0xc008cb44b0, 0x53e0f08, 0xc001404220, 0x0, 0x0, 0x0, 0x0, ...)
Jul 06 18:08:10 ipv6invv.local k3s[527571]:         /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:560 +0x3d4
...

Steps To Reproduce:

  • Installed K3s: install k3s offline with the IPv6 single-stack or dual-stack configuration shown above.

Expected behavior:

K3s installation succeeds, the k3s service is active, and the node becomes Ready.

Actual behavior:

The k3s systemd service fails to become active with the IPv6 single-stack configuration.
With the dual-stack configuration the service becomes active, but the k3s process crashes and the cluster does not work at all.

mystic@ipv6invv:~/k3s_install$ ps -ef | grep k3s
mystic    529324  505286  0 18:19 pts/3    00:00:00 grep --color=auto k3s
mystic@ipv6invv:~/k3s_install$
mystic@ipv6invv:~/k3s_install$ sudo kubectl get namespaces
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
mystic@ipv6invv:~/k3s_install$ 

Additional context / logs:

frankui changed the title from "k3s server is not working with IPv6 or dualstack setting" to "k3s is not working with IPv6 or dualstack setting" Jul 6, 2021

brandond commented Jul 6, 2021

Flannel does not support IPv6, either single- or dual-stack. If you want to use IPv6, you must disable both flannel and the network policy controller. See: #3212

brandond closed this as completed Jul 6, 2021

frankui commented Jul 7, 2021

Flannel does not support IPv6, either single- or dual-stack. If you want to use IPv6, you must disable both flannel and the network policy controller. See: #3212

@brandond I have already added the parameters below (--flannel-backend=none, --disable-network-policy) when starting k3s; otherwise the log reminds me that flannel and the network policy controller need to be disabled. The issues posted here occur with those parameters already configured in the k3s systemd service. Any idea about this?
For IPv6 single-stack:

ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=servicelb,traefik --disable-agent --disable-network-policy --cluster-cidr fd00:db8:0:0:0:0:1::/112 --service-cidr fd00:db8:0:0:0:0:2::/112 --node-ip 2001:7788:9999::254

For dual-stack:

ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=servicelb,traefik --disable-agent --disable-network-policy --cluster-cidr 10.42.0.0/16,fd00:db8:0:0:0:0:1::/112 --service-cidr 10.43.0.0/16,fd00:db8:0:0:0:0:2::/112 --node-ip 2001:7788:9999::254


brandond commented Jul 7, 2021

Upstream Kubernetes doesn't support single-stack IPv6 yet. The in-cluster apiserver service, for example, only supports IPv4 endpoints. You might be able to get workload services to work with IPv6 only, but as far as I know the cluster as a whole can only be dual-stack at best.


frankui commented Jul 7, 2021

Upstream Kubernetes doesn't support single-stack IPv6 yet. The in-cluster apiserver service, for example, only supports IPv4 endpoints. You might be able to get workload services to work with IPv6 only, but as far as I know the cluster as a whole can only be dual-stack at best.

@brandond Thanks for the comments, Brad. How about dual-stack? I see the log "Jul 06 18:08:09 ipv6invv.local k3s[527571]: F0706 18:08:09.958990 527571 node_ipam_controller.go:110] Controller: Invalid --cluster-cidr, mask size of cluster CIDR must be less than or equal to --node-cidr-mask-size configured for CIDR family" with dual-stack, and then the process crashes. It seems the cluster-cidr "fd00:db8:0:0:0:0:1::/112" I configured may have too large a mask size, and we need to make sure the size is less than or equal to --node-cidr-mask-size, but I can't find a --node-cidr-mask-size setting in k3s at all. Any idea about this?

mystic@ipv6invv:~/k3s_install$ /usr/local/bin/k3s server --help | grep -i ipv6
   --tls-san value                            (listener) Add additional hostnames or IPv4/IPv6 addresses as Subject Alternative Names on the server TLS cert
   --cluster-cidr value                       (networking) IPv4/IPv6 network CIDRs to use for pod IPs (default: 10.42.0.0/16)
   --service-cidr value                       (networking) IPv4/IPv6 network CIDRs to use for service IPs (default: 10.43.0.0/16)
   --node-ip value, -i value                  (agent/networking) IPv4/IPv6 addresses to advertise for node
   --node-external-ip value                   (agent/networking) IPv4/IPv6 external IP addresses to advertise for node
mystic@ipv6invv:~/k3s_install$ /usr/local/bin/k3s server --help | grep -i mask-size
mystic@ipv6invv:~/k3s_install$


frankui commented Jul 7, 2021


After changing the cluster-cidr from fd00:db8:0:0:0:0:1::/112 to fd42::/48 and the service-cidr from fd00:db8:0:0:0:0:2::/112 to fd43::/112, as you used in #3212, the dual-stack crash issue is gone. Interesting. Anyway, let me run more tests with the dual-stack setting. Thanks for the help on this.


brandond commented Jul 7, 2021

fd00:db8:0:0:0:0:1::/112 and fd00:db8:0:0:0:0:2::/112 are both IPv6 addresses, not networks - they have a host bit set. They're also in the same network. --node-cidr-mask-size is a kube-controller-manager arg that doesn't have a top-level K3s flag, you would need to use the arg passthrough - but I would probably not change that, in favor of just using appropriately sized cluster and service cidrs.
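For reference, the passthrough mentioned above would look something like the ExecStart excerpt below. This is a sketch only, not a recommendation: --kube-controller-manager-arg is the k3s mechanism for passing flags to the embedded kube-controller-manager, node-cidr-mask-size-ipv6 is the upstream flag, and the value 120 is purely illustrative.

```
# Sketch only: keeps the original /112 cluster-cidr and raises
# node-cidr-mask-size-ipv6 to 120 so that 112 <= 120 satisfies the controller's
# check. A /120 per-node subnet leaves only 256 pod addresses per node, which
# is why resizing the cluster CIDR is the better fix.
ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=servicelb,traefik --disable-agent --disable-network-policy --cluster-cidr 10.42.0.0/16,fd00:db8:0:0:0:0:1::/112 --service-cidr 10.43.0.0/16,fd00:db8:0:0:0:0:2::/112 --node-ip 2001:7788:9999::254 --kube-controller-manager-arg=node-cidr-mask-size-ipv6=120
```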


frankui commented Jul 8, 2021

fd00:db8:0:0:0:0:1::/112 and fd00:db8:0:0:0:0:2::/112 are both IPv6 addresses, not networks - they have a host bit set. They're also in the same network. --node-cidr-mask-size is a kube-controller-manager arg that doesn't have a top-level K3s flag, you would need to use the arg passthrough - but I would probably not change that, in favor of just using appropriately sized cluster and service cidrs.

Both fd00:db8:0:0:0:0:1::/112 and fd00:db8:0:0:0:0:2::/112 are not IPv6 addresses but IPv6 CIDRs with a mask size of 112. I double-checked the Kubernetes docs, which say the default node-cidr-mask-size for IPv6 is 64 (k8s-dualstack). Since k3s does not expose this setting, we need to set the cluster CIDR mask size to 64 or smaller, not 112. That seems to be the reason. Anyway, it works now. Thanks again for your comments and help on this. @brandond
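The mask-size constraint discussed in this thread can be checked locally before starting k3s. A minimal sketch using Python's ipaddress module, assuming the upstream default node-cidr-mask-size of 64 for the IPv6 family (the helper name cluster_cidr_ok is hypothetical, not part of k3s or Kubernetes):

```python
import ipaddress

# Assumed default node-cidr-mask-size for the IPv6 family in
# kube-controller-manager, per the Kubernetes dual-stack docs.
NODE_CIDR_MASK_SIZE_IPV6 = 64

def cluster_cidr_ok(cidr: str, node_mask: int = NODE_CIDR_MASK_SIZE_IPV6) -> bool:
    """The controller requires the cluster CIDR prefix length to be
    less than or equal to node-cidr-mask-size for its address family."""
    # strict=True (the default) raises ValueError if host bits are set,
    # so this also confirms the string is a network, not an address.
    net = ipaddress.ip_network(cidr)
    return net.prefixlen <= node_mask

print(cluster_cidr_ok("fd00:db8:0:0:0:0:1::/112"))  # False: 112 > 64, controller crashes
print(cluster_cidr_ok("fd42::/48"))                 # True: 48 <= 64, dual-stack starts
```

This reproduces both outcomes seen above: the original /112 cluster CIDR fails the check, while the /48 used in #3212 passes.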
