Use netlink interface directly (distroless) #136

Merged
merged 1 commit into from
Jan 10, 2025

Conversation

aojea
Contributor

@aojea aojea commented Jan 7, 2025

This removes the dependency on the nft libraries, reducing image size and exposure to vulnerabilities and user-space changes. The netlink API seems to be more stable (https://docs.kernel.org/networking/netlink_spec/nftables.html), and I will add more tests to ensure the user-space representation does not drift (google/nftables#292).

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 7, 2025
@aojea aojea marked this pull request as draft January 7, 2025 15:25
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 7, 2025
@danwinship

This removes the dependency on the nft libraries reducing size and exposure to vulnerabilities and user space changes

I don't think there are any arguments for porting kube-network-policies to use netlink that wouldn't also apply to kube-proxy, so if you are thinking it makes sense to switch kube-network-policies over then we should document why and then move kube-proxy over too.

@aojea
Contributor Author

aojea commented Jan 7, 2025

I don't think there are any arguments for porting kube-network-policies to use netlink that wouldn't also apply to kube-proxy, so if you are thinking it makes sense to switch kube-network-policies over then we should document why and then move kube-proxy over too.

I've been playing a lot with the netlink nftables interfaces this Christmas and the API seems more solid, but the ecosystem is very small and very low level, and you often have to go directly to the source code and to the wire representation mdlayher/netlink#219 ...

kube-network-policies does not require much work: just 1 set and ~4 rules per IP family, and then adding and removing elements from the sets. It is unlikely this project will need to add any new rules in the future.

I think this is just what we need to answer the question of whether it is worth it; that is why we put these projects in kubernetes-sigs, to explore and provide feedback ... if it does not work we can just revert.

@danwinship

the API seems more solid

FWIW I've been thinking about the possibility of a v2 json API for nft to try to make it better for programmatic use...

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 10, 2025
@aojea aojea marked this pull request as ready for review January 10, 2025 20:56
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 10, 2025
kube-network-policies only needs a specific set of nftables rules
to be present to filter the undesired traffic and enqueue the traffic
subject to inspection to userspace.

There is an optimization that uses sets so that traffic not subject to
policies is not diverted and avoids the userspace penalty, but besides
that there are no plans to require more interaction with netfilter.

Using the netfilter interface with nftables is complex and very low
level, but since the netfilter interaction should not change much in the
foreseeable future, removing the dependency on the userspace tools
brings a big advantage in terms of image size (72MB vs 92MB as of
today) and in the maintenance of the image, since we only need to
maintain the golang binary.
@aojea
Contributor Author

aojea commented Jan 10, 2025

FWIW I've been thinking about the possibility of a v2 json API for nft to try to make it better for programmatic use...

nftables' virtual machine (VM) is incredibly flexible, allowing for a nice number of packet processing operations. I love that I do not need to understand the skbuff struct to do "things" with the packets. However, this flexibility comes with a ton of complexity, and I can't say how to model it better:

  • the netlink interface provides low-level access to nftables, but the raw bytecode it exposes is difficult to work with directly, and the mix of high-level commands (like masquerade and queue) and low-level operations (like cmp and bitwise) sometimes does not work when combined in certain ways. The set functionality in nftables reminds me of our Service API: powerful and nice, but with a lot of options that make it challenging to manage at the netlink level.
  • nft, on the other hand, gives you a set of Lego blocks, but sometimes you are missing some of the pieces, or it is really hard to reason about how to fit them together, or they just don't fit at all until a new version comes out :/
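
As a rough illustration of the low-level side mentioned in the first bullet, here is a toy register machine in the spirit of the nftables VM: a rule is a list of expressions that load packet bytes into a register, transform them, and compare them, and evaluation stops at the first expression that does not match. This is a simplified model with hypothetical names (`expr`, `payload`, `bitwise`, `cmpEq`, `matches`), not the kernel's implementation or the google/nftables API.

```go
package main

import "fmt"

// expr is one step of a rule; returning false stops evaluation (no match).
type expr func(regs map[int][]byte, pkt []byte) bool

// payload loads n bytes at offset off of the packet into register dreg.
func payload(dreg, off, n int) expr {
	return func(regs map[int][]byte, pkt []byte) bool {
		if off+n > len(pkt) {
			return false
		}
		regs[dreg] = append([]byte(nil), pkt[off:off+n]...)
		return true
	}
}

// bitwise stores (sreg & mask) ^ xor into dreg.
func bitwise(sreg, dreg int, mask, xor []byte) expr {
	return func(regs map[int][]byte, _ []byte) bool {
		src := regs[sreg]
		if len(src) != len(mask) || len(src) != len(xor) {
			return false
		}
		out := make([]byte, len(src))
		for i := range src {
			out[i] = (src[i] & mask[i]) ^ xor[i]
		}
		regs[dreg] = out
		return true
	}
}

// cmpEq lets evaluation continue only if register sreg equals data.
func cmpEq(sreg int, data []byte) expr {
	return func(regs map[int][]byte, _ []byte) bool {
		return string(regs[sreg]) == string(data)
	}
}

// matches runs every expression in order against an IPv4 header.
func matches(rule []expr, pkt []byte) bool {
	regs := map[int][]byte{}
	for _, e := range rule {
		if !e(regs, pkt) {
			return false
		}
	}
	return true
}

func main() {
	// Roughly "ip saddr 10.0.0.0/24": the IPv4 source address is at offset 12.
	rule := []expr{
		payload(1, 12, 4),
		bitwise(1, 1, []byte{255, 255, 255, 0}, []byte{0, 0, 0, 0}),
		cmpEq(1, []byte{10, 0, 0, 0}),
	}
	pkt := make([]byte, 20)
	copy(pkt[12:], []byte{10, 0, 0, 42})
	fmt.Println(matches(rule, pkt))
}
```

Even in this toy form, a one-line nft match expands into three expressions with explicit offsets, lengths and registers, which hints at why composing them by hand over netlink is error-prone.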

I'm looking forward to your new API. IMHO, for a new nftables API the real challenge is how to expose those powerful network operations in a programmer-friendly way. JSON is just the surface layer; the real issue to me is about the abstractions and programmability.

@aojea aojea changed the title Use netlink interface directly Use netlink interface directly (distroless) Jan 10, 2025
@aojea aojea merged commit 808dc01 into kubernetes-sigs:main Jan 10, 2025
12 of 13 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

3 participants