Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node local traffic is not processed by network policies #66

Merged
merged 2 commits into from
Jul 23, 2024

Conversation

aojea
Copy link
Contributor

@aojea aojea commented Jul 23, 2024

Fixes: #65

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 23, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 23, 2024
@aojea
Copy link
Contributor Author

aojea commented Jul 23, 2024

/assign @danwinship
/cc @uablrek @vaskozl

Dan please check this is aligned with admin network policies also

@k8s-ci-robot
Copy link
Contributor

@aojea: GitHub didn't allow me to request PR reviews from the following users: vaskozl.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/assign @danwinship
/cc @uablrek @vaskozl

Dan please check this is aligned with admin network policies also

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 23, 2024
tx.Add(&knftables.Rule{
Chain: chainName,
Rule: knftables.Concat(
"fib", "saddr", "type", "local", "accept"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit: you don't need to use knftables.Concat here; you can just write "fib saddr type local accept")

But that's not right; with some network plugins, all local pod IPs would be considered "type local" (because the IPs are visible on interfaces in the host netns).

I think the "standard" approach is to allow only from the primary and secondary node IP (learned either via the downward API, or by watching the Node object). Though actually, I don't think that's CNI-agnostic; in some cases, node-to-pod traffic will use the IP of the bridge that the pods are on (eg docker0) rather than the official node IP.

The super-clever option would be to figure out kubelet's cgroup ID and write a meta cgroupv2 ... rule. Although I think some people expect the non-clever "allow from the node's IP" version anyway.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was wrong with the iif lo approach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have contributors willing to accept challenges , looking at @uablrek now :)

I like the idea of restricting the loophole to the kubelet, now that you explain it the "type local" sounds easy to cheat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was wrong with the iif lo approach?

the trace does not show the iif metadata on the postrouting hook, so I think that to make that work we should add a new rule in INPUT to mark the packets and then in POSTROUTING ignore those packets

cgroups

cat /proc/66685/cgroup
0::/kubelet.slice/kubelet.service

I don't know how stable or configurable is this option, @BenTheElder do you know?

There is also a skuid and skgid match
https://wiki.nftables.org/wiki-nftables/index.php/Matching_packet_metainformation

cat /proc/66685/status
Name:   kubelet
Umask:  0022
State:  S (sleeping)
Tgid:   66685
Ngid:   0
Pid:    66685
PPid:   1
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0

Maybe accepting everything that comes as root from the host is the best approach, kubelet hast to run as root and if you are root does not matter network policies, you can do whatever you want and access the pods directly on the namespaces

        chain trace_chain2 {
                type filter hook postrouting priority filter - 1; policy accept;
                meta skuid 0 counter packets 194 bytes 20003

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should go back to "type local". It's probably not "more unsafe" than anything existing, and to assume user 0 for all traffic I think will cause problems. People try very hard to not enforce the root user it seems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, maybe not "type local", but I think it's unsafe to assume that kubelet runs as root in all clusters, and will so forever

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then again, I have no better idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubelet has to do root operations creating cgroups per example, the probes need to open sockets ... it can be rootless https://kubernetes.io/docs/tasks/administer-cluster/kubelet-in-userns/ but in a rootless environment this project will not work too 😄 , the network there is 🕺

The network namespace of the Node components has to have a non-loopback interface, which can be for example configured with slirp4netns, VPNKit, or lxc-user-nic(1).

The network namespaces of the Pods can be configured with regular CNI plugins. For multi-node networking, Flannel (VXLAN, 8472/UDP) is known to work.

Ports such as the kubelet port (10250/TCP) and NodePort service ports have to be exposed from the Node network namespace to the host with an external port forwarder, such as RootlessKit, slirp4netns, or socat(1).

@BenTheElder how safe do you think it is assume that kubelet runs as root?

@danwinship
Copy link

Dan please check this is aligned with admin network policies also

ANP v1alpha1 does not provide any way to block node-to-pod traffic, so this doesn't affect ANP at all

@@ -168,8 +168,6 @@ run_tests() {
fi

# ginkgo regexes
SKIP="${SKIP:-"Feature|Federation|PerformanceDNS|DualStack|Disruptive|Serial|KubeProxy|GCE|Netpol|NetworkPolicy|256.search.list.characters|LoadBalancer.Service.without.NodePort|type.and.ports.of.a.TCP.service|loadbalancer.source.ranges"}"
FOCUS="${FOCUS:-"\\[sig-network\\]"}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

@aojea aojea Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have now LABEL_FILTER that is the new cool kid on the block kubernetes/test-infra#33075 ... you can even read the filter and understand it 😄

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 23, 2024
@uablrek
Copy link
Contributor

uablrek commented Jul 23, 2024

Applied this PR and tested:

Versions:
  16:03:02 Server Version: v1.30.3
  16:03:02 CNI-plugin; flannel:v0.24.2
  16:03:02 Proxy-mode: "nftables"
  16:03:02 crio version 1.29.2
  16:03:02 BaseFamily=IPv6
E2e:
export SKIP=NOTHING
export FOCUS=Netpol
Ran 50 of 6652 Specs in 145.227 seconds
SUCCESS! -- 50 Passed | 0 Failed | 0 Pending | 6602 Skipped

And my PODs with TCP probes stays alive.

Kubelet has to run as root, blocking the traffic directed to Pods
by network policies can impact kubelet probes per example.

Since root user can do anything on the node, network policies are
not a security boundary for it, so don't apply them to the traffic
generated by the root user.
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 23, 2024
@aojea
Copy link
Contributor Author

aojea commented Jul 23, 2024

pushed new version using the uid of root to discriminate the local packets

@uablrek
Copy link
Contributor

uablrek commented Jul 23, 2024

Tested the new version. Same result as #66 (comment) 👍

@uablrek
Copy link
Contributor

uablrek commented Jul 23, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 23, 2024
@k8s-ci-robot k8s-ci-robot merged commit 5d52862 into kubernetes-sigs:main Jul 23, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TCP liveness/readiness probes are blocked by network policies
4 participants