flannel: set EnableNFTables when kube_proxy_mode use nftables #13291

Open
rtsui-harmonicinc wants to merge 1 commit into kubernetes-sigs:master from rtsui-harmonicinc:master

rtsui-harmonicinc commented Jun 3, 2026

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

What this PR does / why we need it:

Testing an RL10 deployment with flannel exposed an issue.
Because iptables is no longer included in RL10, kube-proxy in iptables mode enters a crash loop and fails to start. However, flannel still uses iptables even when kube-proxy is set to nftables mode.

The iptables-nft shim is not sufficient for flannel's needs.

Ansible Log:

TASK [network_plugin/flannel : Flannel | Wait for flannel subnet.env file presence] ***
fatal: [k8s-node-001]: FAILED! => {"changed": false, "elapsed": 600, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
fatal: [k8s-node-002]: FAILED! => {"changed": false, "elapsed": 600, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
fatal: [k8s-node-003]: FAILED! => {"changed": false, "elapsed": 600, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}

kube-proxy log (when using iptables mode on RL10):

[rocky@k8s-node-001 ~]$ k logs -n kube-system ds/kube-proxy
Found 3 pods, using pod/kube-proxy-2prqb
I0528 23:46:29.343902       1 shared_informer.go:370] "Waiting for caches to sync"
I0528 23:46:29.444858       1 shared_informer.go:377] "Caches are synced"
I0528 23:46:29.444891       1 server.go:218] "Successfully retrieved NodeIPs" NodeIPs=["10.10.1.200"]
I0528 23:46:29.449627       1 conntrack.go:57] "Setting nf_conntrack_max" nfConntrackMax=524288
E0528 23:46:29.449793       1 server.go:255] "Kube-proxy configuration may be incomplete or incorrect" err="nodePortAddresses is unset; NodePort connections will be accepted on all local IPs. Consider using `--nodeport-addresses primary`"
I0528 23:46:29.465596       1 server.go:264] "kube-proxy running in dual-stack mode" primary ipFamily="IPv4"
I0528 23:46:29.465641       1 server_linux.go:136] "Using iptables Proxier"
I0528 23:46:29.469797       1 proxier.go:242] "Setting route_localnet=1 to allow node-ports on localhost; to change this either disable iptables.localhostNodePorts (--iptables-localhost-nodeports) or set nodePortAddresses (--nodeport-addresses) to filter loopback addresses" ipFamily="IPv4"
E0528 23:46:29.472928       1 proxier.go:270] "Failed to create nfacct runner, nfacct based metrics won't be available" err="nfacct sub-system not available" ipFamily="IPv4"
E0528 23:46:29.474694       1 proxier.go:270] "Failed to create nfacct runner, nfacct based metrics won't be available" err="nfacct sub-system not available" ipFamily="IPv6"
I0528 23:46:29.474748       1 server.go:529] "Version info" version="v1.35.5"
I0528 23:46:29.474761       1 server.go:531] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
E0528 23:46:29.476462       1 metrics.go:379] "failed to initialize nfacct client" err="nfacct sub-system not available"
E0528 23:46:29.478048       1 metrics.go:379] "failed to initialize nfacct client" err="nfacct sub-system not available"
I0528 23:46:29.478827       1 config.go:106] "Starting endpoint slice config controller"
I0528 23:46:29.478847       1 shared_informer.go:349] "Waiting for caches to sync" controller="endpoint slice config"
I0528 23:46:29.478852       1 config.go:200] "Starting service config controller"
I0528 23:46:29.478866       1 shared_informer.go:349] "Waiting for caches to sync" controller="service config"
I0528 23:46:29.478870       1 config.go:403] "Starting serviceCIDR config controller"
I0528 23:46:29.478879       1 config.go:309] "Starting node config controller"
I0528 23:46:29.478881       1 shared_informer.go:349] "Waiting for caches to sync" controller="serviceCIDR config"
I0528 23:46:29.478887       1 shared_informer.go:349] "Waiting for caches to sync" controller="node config"
I0528 23:46:29.578972       1 shared_informer.go:356] "Caches are synced" controller="serviceCIDR config"
I0528 23:46:29.578991       1 shared_informer.go:356] "Caches are synced" controller="node config"
I0528 23:46:29.579006       1 shared_informer.go:356] "Caches are synced" controller="service config"
I0528 23:46:29.579057       1 shared_informer.go:356] "Caches are synced" controller="endpoint slice config"
E0528 23:46:29.662791       1 proxier.go:807] "Failed to ensure chain jumps" err=<
        error appending rule: exit status 4: Warning: Extension conntrack revision 0 not supported, missing kernel module?
        Warning: Extension comment revision 0 not supported, missing kernel module?
        iptables v1.8.9 (nf_tables):  RULE_INSERT failed (No such file or directory): rule in chain INPUT
 > ipFamily="IPv4" table="filter" srcChain="INPUT" dstChain="KUBE-EXTERNAL-SERVICES"
I0528 23:46:29.662830       1 proxier.go:770] "Sync failed" ipFamily="IPv4" retryingTime="30s"
E0528 23:46:29.670487       1 proxier.go:807] "Failed to ensure chain jumps" err=<
        error checking rule: exit status 2: Warning: Extension conntrack is not supported, missing kernel module?
        ip6tables v1.8.9 (nf_tables): Couldn't load match `conntrack':No such file or directory

        Try `ip6tables -h' or 'ip6tables --help' for more information.
 > ipFamily="IPv6" table="filter" srcChain="INPUT" dstChain="KUBE-EXTERNAL-SERVICES"
I0528 23:46:29.670511       1 proxier.go:770] "Sync failed" ipFamily="IPv6" retryingTime="30s"

Logs (when using nftables mode but EnableNFTables is not specified):

# kube-proxy (successfully running)
$ k -n kube-system logs ds/kube-proxy

Found 3 pods, using pod/kube-proxy-nv9jj
I0529 00:59:34.498180       1 shared_informer.go:370] "Waiting for caches to sync"
I0529 00:59:34.599309       1 shared_informer.go:377] "Caches are synced"
I0529 00:59:34.599361       1 server.go:218] "Successfully retrieved NodeIPs" NodeIPs=["10.10.1.243"]
I0529 00:59:34.604568       1 conntrack.go:57] "Setting nf_conntrack_max" nfConntrackMax=524288
I0529 00:59:34.604859       1 server.go:264] "kube-proxy running in dual-stack mode" primary ipFamily="IPv4"
I0529 00:59:34.604887       1 server_linux.go:253] "Using nftables Proxier"
I0529 00:59:34.649984       1 server.go:529] "Version info" version="v1.35.5"
I0529 00:59:34.650027       1 server.go:531] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0529 00:59:34.650923       1 config.go:106] "Starting endpoint slice config controller"
I0529 00:59:34.650937       1 shared_informer.go:349] "Waiting for caches to sync" controller="endpoint slice config"
I0529 00:59:34.650944       1 config.go:200] "Starting service config controller"
I0529 00:59:34.650956       1 shared_informer.go:349] "Waiting for caches to sync" controller="service config"
I0529 00:59:34.650969       1 config.go:309] "Starting node config controller"
I0529 00:59:34.650983       1 shared_informer.go:349] "Waiting for caches to sync" controller="node config"
I0529 00:59:34.650974       1 config.go:403] "Starting serviceCIDR config controller"
I0529 00:59:34.651034       1 shared_informer.go:349] "Waiting for caches to sync" controller="serviceCIDR config"
I0529 00:59:34.751398       1 shared_informer.go:356] "Caches are synced" controller="serviceCIDR config"
I0529 00:59:34.751438       1 shared_informer.go:356] "Caches are synced" controller="service config"
I0529 00:59:34.751460       1 shared_informer.go:356] "Caches are synced" controller="node config"
I0529 00:59:34.751474       1 shared_informer.go:356] "Caches are synced" controller="endpoint slice config"

# flannel (failing)
$ k -n kube-system logs ds/kube-flannel

Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
I0529 01:05:20.199384       1 main.go:226] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ipMasqRandomFullyDisable:false ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true blackholeRoute:false netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0529 01:05:20.199466       1 client_config.go:667] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0529 01:05:20.261635       1 kube.go:537] Starting kube subnet manager
I0529 01:05:20.261669       1 kube.go:139] Waiting 10m0s for node controller to sync
I0529 01:05:21.261819       1 kube.go:163] Node controller sync successful
I0529 01:05:21.261843       1 main.go:252] Created subnet manager: Kubernetes Subnet Manager - k8s-node-003
I0529 01:05:21.261850       1 main.go:255] Installing signal handlers
I0529 01:05:21.262040       1 main.go:534] Found network config - Backend type: vxlan
E0529 01:05:21.262140       1 main.go:289] Failed to check br_netfilter: stat /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory
[rocky@k8s-node-001 ~]$

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Flannel sets EnableNFTables when `kube_proxy_mode` is nftables

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 3, 2026
@k8s-ci-robot k8s-ci-robot requested review from ErikJiang and yankay June 3, 2026 19:15
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rtsui-harmonicinc
Once this PR has been reviewed and has the lgtm label, please assign ant31 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 3, 2026
@k8s-ci-robot
Contributor

Hi @rtsui-harmonicinc. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 3, 2026
@rtsui-harmonicinc rtsui-harmonicinc marked this pull request as ready for review June 3, 2026 19:24
Copilot AI review requested due to automatic review settings June 3, 2026 19:24
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 3, 2026
@k8s-ci-robot k8s-ci-robot requested review from VannTen and guoard June 3, 2026 19:24

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds flannel configuration toggles/comments to align with newer flannel releases (nftables support and an optional cache-readiness env var).

Changes:

  • Add EnableNFTables to flannel net-conf.json, derived from kube_proxy_mode.
  • Add commented documentation for CONT_WHEN_CACHE_NOT_READY (introduced in flannel v0.27.1).
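
As a rough illustration of the first change, the sketch below mimics in plain Python what the templated value is meant to produce: `EnableNFTables` is true only when `kube_proxy_mode` equals `nftables`. The function name `flannel_net_conf`, the example subnet, and the reduced set of keys are assumptions made for illustration; the template in this PR is Jinja2 and contains more fields.

```python
import json


def flannel_net_conf(kube_pods_subnet: str, kube_proxy_mode: str,
                     backend_type: str = "vxlan") -> str:
    """Build a minimal flannel net-conf.json document (illustrative only).

    Hypothetical helper: only a subset of the keys is shown here.
    """
    conf = {
        "Network": kube_pods_subnet,
        # Mirrors the templated boolean: true only for the nftables proxy mode.
        "EnableNFTables": kube_proxy_mode == "nftables",
        "Backend": {"Type": backend_type},
    }
    return json.dumps(conf, indent=2)


if __name__ == "__main__":
    # Example subnet is an assumption, not taken from the PR.
    print(flannel_net_conf("10.233.64.0/18", "nftables"))
```

With any other mode, such as `iptables` or `ipvs`, the same helper emits `"EnableNFTables": false`, so existing deployments would keep their current behaviour under this reading of the change.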

"EnableIPv6": true,
"IPv6Network": "{{ kube_pods_subnet_ipv6 }}",
{% endif %}
"EnableNFTables": {{ (kube_proxy_mode == 'nftables') | bool | to_json }},
Comment on lines +104 to +106
# CONT_WHEN_CACHE_NOT_READY added in flannel v0.27.1
# - name: CONT_WHEN_CACHE_NOT_READY
# value: "false"
@yankay
Member

yankay commented Jun 4, 2026

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 4, 2026
Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.