Skip to content

Use libpcap-filter for packet filtering #198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 30, 2023

Conversation

jschwinger233
Copy link
Member

@jschwinger233 jschwinger233 commented Jun 22, 2023

This is a POC PR to demonstrate how to introduce libpcap for pwru, will clean this shit up after some detail discussions.

  1. Statically link for libpcap. It's feasible, as long as we compile the libpcap.a with some specific compilation flags, which can be integrated into release flow. This PR doesn't implement this for now, but we won't be bothered by the linking trouble.
  2. Generating filter ebpf instructions. 1) libpcap API compiles pcap-filter expression into cbpf; 2) cloudflare/cbpfc compiles cbpf into ebpf; 3) we, pwru, translate direct skb fields accessing into bpf_probe_read_kernel. The last step is very subtle and this PR still leaves some issue to make it work perfectly, see the TODO@gray in the PR.
  3. Inject generated ebpf instructions into existing instructions. To achieve this task, I add a bpf_printk in the bpf c code, which is exactly the position I will inject the filter instructions. Adding bpf_printk is good because: 1) it tells us the location of skb->data (L3 header pointer) and skb->data_end (...), which is necessary for cloudflare/cbpfc to do the conversion; 2) it leaves R1, R2, R3, R4 available for cloudflare/cbpfc conversion; 3) it's a clear signal to let us find the injection point: we just need to find the instruction calling bpf_printk; Still there are error-prone parts: we have to adjust offsets for some jump instructions.
  4. For the legacy filter flags such as --filter-dst-ip, we can make it part of the pcap-filter expression. For example, if user run pwru with --filter-dst-ip 1.2.3.4 --pcap-filter 'tcp and dst port 80', we can compose a new pcap filter expression: tcp and dst port 80 and dest host 1.2.3.4. So all the existing filtering code in bpf c becomes obsolete. However, this PR hasn't done this so far, but the point is we won't break the CLI interface.

Introducing tailcall or similar mechanism to decouple the generated instructions from the existing part is valuable, but I decided to implement this POC first to see if there is some hidden obstacle.

Current implementation is buggy and dirty, full of debugging logs, but I can improve it later if the ideas look good to @brb
I know my explanation is still vague, feel free to ask anything.

This PR allows pwru to filter traffic using pcap-filter(7) expression. For example, we can run pwru --filter-pcap 'ip6[40+13] & 0x3f = 0x2' to capture Syn TCP on IPv6.

The main ideas to achieve the goal are:

  1. Link libpcap, use its API to compile pcap-filter expression to cbpf
  2. Use cloudflare/cbpfc to convert cbpf to ebpf
  3. Adjust generated ebpf instructions to pass the verifier
  4. Inject the instructions to a specific position

Although the steps sound superficial and simple, the implementation details require meticulousness. See the comments in the libpcap package.

The original CLI flags won't be broken, as we converted them into part of pcap-filter expression.

Fixes: #85
Fixes: #142

Signed-off-by: Zhichuan Liang gray.liang@isovalent.com

@brb
Copy link
Member

brb commented Jun 23, 2023

Very nice!

Answers to the points above:

  1. Yep, we can embed libpcap.a at a later stage.
  2. My main worry about relying on cloudflare/cbpfc is that it introduces a runtime dependency on Clang (or did I completely misread their README?)
  3. SGTM.
  4. Yep, I'd bother about the conversion for now.
    5 (the tail call). Yep, we can do it later. Either a tail call or freplace could do the job.

@brb
Copy link
Member

brb commented Jun 23, 2023

My main worry about relying on cloudflare/cbpfc is that it introduces a runtime dependency on Clang (or did I completely misread their README?)

@jibi instructed me that it's only for C compilation. So please ignore my concern.

@brb brb mentioned this pull request Jun 23, 2023
@jschwinger233
Copy link
Member Author

Functionality should be fine now, next I can clean the code up and make it a readable PR.

@brb
Copy link
Member

brb commented Jun 26, 2023

Functionality should be fine now, next I can clean the code up and make it a readable PR.

👍 Next release is going to be crazy good.

@jschwinger233 jschwinger233 force-pushed the gray/libpcap-filter branch 8 times, most recently from 08a3b6b to e33d3f5 Compare June 28, 2023 10:42
@jschwinger233
Copy link
Member Author

jschwinger233 commented Jun 28, 2023

@brb ready for review but don't merge until I fix the release flow😂and docs

@jschwinger233 jschwinger233 changed the title [POC] Use libpcap-filter for packet filtering Use libpcap-filter for packet filtering Jun 28, 2023
@@ -1,5 +1,5 @@
GO := go
GO_BUILD = CGO_ENABLED=0 $(GO) build
GO_BUILD = CGO_ENABLED=1 $(GO) build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add -ldflags "-linkmode 'external' -extldflags '-static'" to get a statically linked exec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in the libpcap/compile.go, so go build without ldflags can generate statically linked elf:

/*
#cgo LDFLAGS: -L/usr/local/lib -lpcap -static
#include <stdlib.h>
#include <pcap.h>
*/
import "C"

Or it's better if we move the flags to go build?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that -static in #cgo forces building a statically linked go exec. It's fine to keep it here.

@@ -24,6 +24,10 @@ jobs:
go mod vendor
go mod verify
test -z "$(git status --porcelain)" || (echo "please run 'go mod tidy && go mod vendor', and submit your changes"; exit 1)
sudo apt-get install -y curl unzip gcc flex bison make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add this step to Makefile under libpcap.a target?

@jschwinger233 jschwinger233 force-pushed the gray/libpcap-filter branch 3 times, most recently from c5cac72 to 7383ac9 Compare June 30, 2023 04:46
@jschwinger233 jschwinger233 marked this pull request as ready for review June 30, 2023 04:55
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Also, thanks a lot for code comments which made the changes easier to understand.

Some minor nits.

type pcapBpfProgram C.struct_bpf_program

const (
MaxBpfInstructions = 4096
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this 1M for kernels we support?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, but this is for compiling pcap-filter to cbpf, which is concise and I consider 4096 is sufficient so that we don't have to waste unnecessary memory.

For instance, we can implement equivalent filter to pwru --filter-src-ip --filter-dst-ip --filter-src-port --filter-dst-port --filter proto --filter-tcp-flag syn using only 19 pieces of cbpf instructions:

$ tcpdump -d 'src host 1.1.1.1 and dst host 2.2.2.2 and src port 23 and dst port 23 and tcp and tcp[tcpfl
ags] & tcp-syn != 0'
Warning: assuming Ethernet
(000) ldh      [12]
(001) jeq      #0x800           jt 2	jf 19
(002) ld       [26]
(003) jeq      #0x1010101       jt 4	jf 19
(004) ld       [30]
(005) jeq      #0x2020202       jt 6	jf 19
(006) ldb      [23]
(007) jeq      #0x84            jt 19	jf 8
(008) jeq      #0x6             jt 9	jf 19
(009) ldh      [20]
(010) jset     #0x1fff          jt 19	jf 11
(011) ldxb     4*([14]&0xf)
(012) ldh      [x + 14]
(013) jeq      #0x17            jt 14	jf 19
(014) ldh      [x + 16]
(015) jeq      #0x17            jt 16	jf 19
(016) ldb      [x + 27]
(017) jset     #0x2             jt 18	jf 19
(018) ret      #262144
(019) ret      #0


// Append instructions to implement "exit immediately if not matched"
insts = append(insts,
asm.Mov.Imm(asm.R0, 0).WithSymbol("result"), // r0 = TC_ACT_OK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/TC_ACT_OK/0/


injectIdx := 0
for idx, inst := range program.Instructions {
// In the kprobe_pwru.c, we deliberately put a bpf_printk call to mark the injection position, see the comments over there.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GL to folks who will use bpf_printk for debugging :-) Maybe let's add a warning as a code comment in the beginning of kprobe_pwru.c?

4. Inject the instructions
*/
func InjectFilter(program *ebpf.ProgramSpec, filterExpr string) (err error) {
if filterExpr == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If empty filter, then we will end up calling the bpf_printk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will revise, thanks for pointing out 👍


// If there already is a reference and corresponding
// symbol, we don't have to create new symbol, just set -1
// to the offset so that cilium/ebpf loader can adjust it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL about this loader's feature 🤩

@@ -34,6 +34,7 @@ type Flags struct {
FilterDstPort uint16
FilterPort uint16
FilterTrackSkb bool
FilterPcap string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to drop all previous filters, and allow users to specify the filter pcap in the same way as with tcpdump. For example, pwru --output-tuple 'host 1.1.1.1 and port 80'. Let's do it as a follow-up.

@brb brb mentioned this pull request Jun 30, 2023
We'll inject ebpf instructions compiled from pcap-filter expression and
leave filter_l3_and_l4 obsolete.

bpf_printk and `return data < data_end` are placeholders for further
instruction injection, please find more details in the comments.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
The new libpcap pkg mainly provides two functions:

1. Compile pcap-filter expression to ebpf instructions:
 a. Libpcap compiles filter expression to cbpf
 b. Cloudflare/cbpfc converts cbpf to ebpf
 c. We adjust the generated ebpf to pass verifier

2. Inject filter ebpf instructions:
 a. Find the injection position and needed registers
 b. Make preparation: adjust jump offsets
 c. Inject filter ebpf

Find more details in the code comments.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
For example, users can run commends like:

pwru --filter-pcap 'icmp and host 172.17.0.1'
pwru --filter-pcap 'tcp and host 172.17.0.1 and dst port 80'
pwru --filter-pcap 'tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420'
pwru --filter-pcap 'tcp[tcpflags] = (tcp-syn) and host 172.17.0.1 and dst port 2376'

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
As we no long filter skb according to the bpf configuration, this commit
delete all the `Filter*` member in struct config.

In userspace, the legacy `--filter-*` flags will all be translated into
pcap filter expression. For example, `--filter-dst-ip 172.17.0.1
--filter-port 80` will be translated to an expression `dst host
172.17.0.1 and port 80`.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
1. Turn on ENABLE_CGO because of libpcap dependency
2. Add `libpcap.a` target in Makefile
3. Switch from alpine image to common image, bacause libpcap can't be
   compiled with musl
4. Use apt instead of apk to install toolchains
4. Add a job to compile libpcap.a in the github action for test

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IPv6 L4 Protocol support --filter-ip and --filter-port
2 participants