Rationale
The current codebase delivers outbound SLIRP networking, including sendmsg routing and shell-level TCP integration tests (wget under --net). Three gaps remain:
(a) Server sockets: bind() is implemented, but listen() and accept()/accept4() are not. The AF_UNIX socketpair bridge architecture does not support inbound connections -- each accepted connection would need a new socketpair + ADDFD injection into the tracee.
(b) sendmsg allow-list removal: sendmsg is BPF allow-listed because the supervisor uses SCM_RIGHTS for pre-exec FD passing. Guest sendmsg with msg_name (unconnected UDP destination) bypasses the supervisor entirely and the destination address is lost on the AF_UNIX socketpair. Control-message semantics are also affected.
(c) Dedicated TCP regression test: integration tests exercise TCP via shell-level wget, but no dedicated guest test binary validates the connect/send/recv path directly in CI.
Proposed Changes
Phase 1 -- TCP regression test (low risk, immediate CI value):
- Add a net-tcp-test.c guest binary: connect to a known TCP endpoint (e.g., the SLIRP gateway), send a request, verify the response.
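A minimal sketch of the core of such a test, assuming an HTTP-style endpoint. The function names `tcp_roundtrip()` and `response_looks_ok()` are hypothetical, and the success criterion (a 2xx or 3xx status line) is an assumption rather than something this issue specifies.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if buf starts with an HTTP/1.x status line with a 2xx or 3xx code.
 * Shortest accepted prefix is "HTTP/1.x NNN" (12 bytes). */
int response_looks_ok(const char *buf, size_t len)
{
    if (len < 12 || memcmp(buf, "HTTP/1.", 7) != 0 || buf[8] != ' ')
        return 0;
    return buf[9] == '2' || buf[9] == '3';
}

/* Connect to ip:port, send req, read the first chunk of the response.
 * Returns the number of bytes read, or -1 on any failure. */
ssize_t tcp_roundtrip(const char *ip, unsigned short port,
                      const char *req, char *resp, size_t resp_len)
{
    assert(ip && req && resp);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1; /* not a dotted-quad IPv4 address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    ssize_t n = -1;
    size_t req_len = strlen(req);
    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
        send(fd, req, req_len, 0) == (ssize_t) req_len)
        n = recv(fd, resp, resp_len, 0);

    close(fd);
    return n;
}
```

A guest `main()` could call `tcp_roundtrip("10.0.2.2", 80, "GET / HTTP/1.0\r\n\r\n", buf, sizeof(buf))` and exit non-zero unless `response_looks_ok()` accepts the result. The address 10.0.2.2 is the conventional SLIRP gateway default, so whether kbox uses it, and whether anything listens on port 80 in CI, are assumptions to check.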
Phase 2 -- pidfd_getfd migration:
- Replace SCM_RIGHTS with pidfd_getfd() (Linux 5.6+) or /proc/<pid>/fd/ for pre-exec FD passing, removing the need to BPF allow-list sendmsg.
- Intercept sendmsg in the dispatcher, extract msg_name and control messages, forward to LKL for unconnected UDP and ancillary data.
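The msg_name extraction step could look roughly like the sketch below. It assumes the dispatcher has already copied the guest's `struct msghdr` and the buffers it points to into supervisor memory (reading tracee memory is out of scope here). `struct sendmsg_target` and `extract_sendmsg_target()` are hypothetical names, not existing kbox code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical summary of what the dispatcher would forward to LKL. */
struct sendmsg_target {
    int has_dest;                 /* 1 if msg_name carried a destination */
    struct sockaddr_storage dest; /* copy of the destination address */
    socklen_t dest_len;
    int has_control;              /* 1 if at least one cmsg header fits */
};

/* Validate and copy the destination address out of a local msghdr copy.
 * Returns 0 on success, -1 if msg_namelen is implausible. */
int extract_sendmsg_target(const struct msghdr *msg, struct sendmsg_target *out)
{
    assert(msg && out);
    memset(out, 0, sizeof(*out));

    if (msg->msg_name != NULL && msg->msg_namelen > 0) {
        /* Must at least hold sa_family and must fit in sockaddr_storage. */
        if (msg->msg_namelen < sizeof(sa_family_t) ||
            msg->msg_namelen > sizeof(out->dest))
            return -1;
        memcpy(&out->dest, msg->msg_name, msg->msg_namelen);
        out->dest_len = msg->msg_namelen;
        out->has_dest = 1;
    }

    /* CMSG_FIRSTHDR yields NULL when msg_controllen is too small for a header. */
    out->has_control = CMSG_FIRSTHDR(msg) != NULL;
    return 0;
}
```

With `has_dest` set, the dispatcher would issue a sendto-style call into LKL with `dest`; without it, the existing connected-socket path applies. How individual control messages are translated is left open in this sketch.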
Phase 3 -- Server sockets:
- Implement forward_listen(): forward to LKL, register listening socket with SLIRP event loop.
- Implement forward_accept()/forward_accept4(): on LKL accept, create new socketpair, register with event loop, inject into tracee via ADDFD.
- Add SLIRP port forwarding configuration (host:port -> guest:port mapping).
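The per-connection bridge in the accept path can be sketched as follows, assuming the seccomp user-notification interface (`SECCOMP_IOCTL_NOTIF_ADDFD`, Linux 5.9+) and kernel headers recent enough to define it. `make_bridge()` and `inject_accepted_fd()` are hypothetical helper names, and the choice of a SOCK_STREAM socketpair is an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/seccomp.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create one bridge per accepted connection:
 * sv[0] stays with the supervisor/event loop, sv[1] goes to the tracee. */
int make_bridge(int sv[2])
{
    assert(sv);
    return socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, sv);
}

/* Install guest_end into the tracee that is blocked in accept()/accept4().
 * Returns the fd number allocated in the tracee, or -1 with errno set. */
int inject_accepted_fd(int notify_fd, __u64 notif_id, int guest_end, int cloexec)
{
    struct seccomp_notif_addfd addfd = {
        .id = notif_id,               /* id of the pending notification */
        .flags = 0,                   /* kernel picks the lowest free fd */
        .srcfd = (__u32) guest_end,
        .newfd = 0,                   /* ignored without SECCOMP_ADDFD_FLAG_SETFD */
        .newfd_flags = cloexec ? O_CLOEXEC : 0, /* mirrors accept4(SOCK_CLOEXEC) */
    };
    return ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
}
```

The value returned by the ioctl is what the guest should see as the result of accept(), so the supervisor would still send it back in the notification response (or use `SECCOMP_ADDFD_FLAG_SEND` on 5.14+ to do both atomically). After injection the supervisor can close its copy of `sv[1]` and keep `sv[0]` registered with the event loop.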
Considerations
- Phase 1 is self-contained and can land independently
- Phase 2 requires pidfd_getfd (Linux 5.6+); verify this aligns with kbox's minimum supported kernel version
- Phase 3 is architecturally complex: N concurrent bridges per listening socket, accept queue backpressure, event loop scaling
- MAX_SHADOW_SOCKETS=64 in net-slirp.c is a scaling constraint for server workloads with many concurrent connections; Phase 3 may need to raise or dynamically grow this
- ICMP: SLIRP can synthesize localhost replies without special capabilities; external ping requires CAP_NET_RAW
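For the Phase 2 kernel-version question, one option is a runtime probe rather than a hard version requirement, so the supervisor can fall back to /proc/<pid>/fd/ when the syscall is missing. The sketch below is hypothetical (`probe_pidfd_getfd()` is not existing kbox code). It assumes syscall number 438 where older headers lack `SYS_pidfd_getfd`, which holds for x86_64 and arm64 but should be checked for other targets.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438 /* assumption: number on x86_64/arm64 */
#endif

/* Probe for pidfd_getfd() by calling it with an invalid pidfd.
 * Returns 1 if the kernel implements it (call fails with EBADF),
 *         0 if it is absent (ENOSYS),
 *        -1 if undetermined (e.g., blocked by an outer seccomp filter). */
int probe_pidfd_getfd(void)
{
    errno = 0;
    long r = syscall(SYS_pidfd_getfd, -1, 0, 0U);
    assert(r == -1); /* a pidfd of -1 can never succeed */
    if (errno == EBADF)
        return 1;
    if (errno == ENOSYS)
        return 0;
    return -1;
}
```

A result of -1 is best treated like 0: take the fallback path instead of failing startup.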
References
- src/seccomp-dispatch.c: shadow socket bridge architecture comment; forward_bind() (already implemented); socket syscall dispatch table (no listen/accept entries)
- src/seccomp-bpf.c: sendmsg allow-list with SCM_RIGHTS justification
- src/net-slirp.c: MAX_SHADOW_SOCKETS=64
- tests/guest/net-dns-test.c: existing UDP DNS test
- scripts/run-tests.sh: shell-level TCP coverage via wget