An XDP-based IP forwarding system with per-CPU direct-mapped caching, designed to evaluate cached FIB lookup scaling properties.
```
XDP Program (per-packet)
┌────────────────────────────────────────────────────────────────┐
│ 1. Parse IPv4 dst_ip                                           │
│ 2. Lookup per-CPU direct-mapped cache (/32) ──hit──► forward   │
│        │ miss (slot empty or different dst_ip)                 │
│        ▼                                                       │
│ 3. Lookup global LPM trie (prefixes)                           │
│        │                                                       │
│        ▼                                                       │
│ 4. Cache result (overwrite slot), rewrite MACs, XDP_REDIRECT   │
└────────────────────────────────────────────────────────────────┘
```
```
make deps      # Install Go dependencies + bpf2go
make generate  # Generate eBPF Go bindings
make build     # Build bin/fibctl
```

Requirements: Go 1.21+, clang, Linux 5.x+ with BPF support.
```
# 1. Prepare routes (rewrite next-hops for hairpin testing)
fibctl fib-rewrite data/full_fib.txt routes.txt --next-hop 192.168.1.1

# 2. Load XDP program
sudo fibctl load -i eth0

# 3. Import routes
sudo fibctl import routes.txt

# 4. Generate test traffic
fibctl pcap-gen routes.txt traffic.pcap --flow-size 10000 --packets 10000000

# 5. Run measurement (moongen, tcpreplay, etc.)
# ...

# 6. Watch stats
sudo fibctl stats -w

# 7. Cleanup
sudo fibctl unload
```

| Command | Description |
|---|---|
| `fibctl load -i <iface> [-m native\|generic]` | Load XDP and attach to interface |
| `fibctl unload` | Detach and remove all state |
| `fibctl info` | Show FIB info (routes, cache status, hit rate) |
| Command | Description |
|---|---|
| `fibctl import <file>` | Bulk import routes from file |
| `fibctl add <prefix> <next-hop>` | Add single route |
| `fibctl remove <prefix>` | Remove route |
| `fibctl reset` | Clear all routes, cache, and stats |
Route file format:

```
# Comments start with #
10.0.0.0/8 192.168.1.1
172.16.0.0/12 192.168.1.2
```
| Command | Description |
|---|---|
| `fibctl cache enable` | Enable per-CPU caching |
| `fibctl cache disable` | Disable caching (baseline measurement) |
| `fibctl cache invalidate` | Clear all cache entries |
| `fibctl cache status` | Show cache status |
| Command | Description |
|---|---|
| `fibctl stats` | Show current stats |
| `fibctl stats -w` | Watch continuously (prints every 100k pkts) |
| `fibctl stats --per-cpu` | Show per-CPU breakdown |
| `fibctl reset-stats` | Zero all counters |
| Command | Description |
|---|---|
| `fibctl fib-rewrite <in> <out> --next-hop <ip>` | Rewrite all next-hops (for hairpin) |
| `fibctl pcap-gen <fib> <out.pcap> [options]` | Generate test PCAP |
PCAP generation options:

```
--flow-size N   # Number of unique destination IPs (default: 1000)
--packets N     # Total packets to generate (default: 1000000)
--dist uniform  # Distribution: uniform or zipf
--zipf-s 1.5    # Zipf skew parameter (higher = more skewed)
--src-ip IP     # Source IP (default: 10.0.0.1)
--src-mac MAC   # Source MAC (default: 00:00:00:00:00:01)
--dst-mac MAC   # Destination MAC (default: 00:00:00:00:00:02)
```
```
# Configure NIC queues = number of CPUs
ethtool -L eth0 combined $NUM_CPUS

# Configure RSS to hash on dst IP only
ethtool -N eth0 rx-flow-hash udp4 d
```

Baseline (cache disabled):

```
sudo fibctl load -i eth0
sudo fibctl import routes.txt
sudo fibctl cache disable
sudo fibctl reset-stats
# Run traffic, record throughput
sudo fibctl stats
```

Then with the cache enabled:

```
sudo fibctl cache enable
sudo fibctl reset-stats
# Run traffic, record throughput
sudo fibctl stats
```

Repeat with increasing CPU counts:

```
for cpus in 1 2 4 8 16; do
    # Adjust NIC queues
    ethtool -L eth0 combined $cpus
    # Pin traffic generator to use $cpus cores
    # Run test, collect stats
done
```

Default values (compile-time, in `bpf/fib.c`):
| Parameter | Default | Description |
|---|---|---|
| `fib_trie.max_entries` | 10M | Max routes in LPM trie |
| `CACHE_SIZE` | 64K | Direct-mapped cache slots per CPU |
| `tx_ports.max_entries` | 256 | Max redirect interfaces |
To change `CACHE_SIZE`, edit the `-DCACHE_SIZE=65536` value in `internal/fib/types.go`
and update the `cacheSize` constant in `internal/fib/cache.go` to match.

BPF pin path: `/sys/fs/bpf/fibctl` (override with `-p`)
| Map | Type | Purpose |
|---|---|---|
| `fib_trie` | LPM_TRIE | Global FIB (longest prefix match) |
| `fib_cache` | PERCPU_HASH | Per-CPU direct-mapped /32 cache |
| `stats_map` | PERCPU_ARRAY | Per-CPU counters |
| `config_map` | ARRAY | Runtime config (cache enable/disable) |
| `tx_ports` | DEVMAP | Redirect target interfaces |
The cache uses a direct-mapped design with PERCPU_HASH:

- Key: slot index computed as `dst_ip % CACHE_SIZE`
- Value: `cache_entry` containing `dst_ip` (for collision detection) plus forwarding info
- Hit: the slot exists AND the stored `dst_ip` matches the packet's `dst_ip`
- Miss: empty slot OR `dst_ip` mismatch → LPM trie lookup → overwrite the slot

This is simpler than LRU (no eviction-tracking overhead) but susceptible to cache pollution from IP collisions. It works well when traffic has good locality.
When adding routes, fibctl automatically resolves:

- Output interface: via `netlink.RouteGet(next_hop)`
- Source MAC: from the output interface
- Destination MAC: via ARP cache lookup (triggers resolution if needed)
Any trie modification (add/remove/import) invalidates the entire cache to ensure consistency:
an LPM change can affect the correct answer for any cached /32. Invalidation iterates over all
`CACHE_SIZE` slots and deletes them.
GPL-2.0 OR BSD-3-Clause (eBPF program), MIT (Go code)