Problem

All network I/O goes through the SLIRP/passt userspace stack. For bulk transfers, data-path overhead dominates.

Proposed Changes

Add an opt-in `--net=host-bypass` fast path using `SECCOMP_IOCTL_NOTIF_ADDFD` (the same mechanism already used for pipe injection in `forward_pipe()`):

- Intercept `socket()` + `connect()` via seccomp for eligible sockets.
- `read`/`write`/`send`/`recv` go directly to the host kernel without supervisor involvement on the data path.

Scope of first version:

- `AF_INET`/`AF_INET6`, `SOCK_STREAM`, outbound `connect()` only.
- Explicitly exclude `AF_UNIX`, `AF_PACKET`, raw sockets.
- Policy controls to restrict which destination addresses/ports are eligible for bypass.

Additional syscall interception beyond `socket()` is required: `connect`, `bind`, `listen`, `accept`, `setsockopt`, `getsockopt`, and `fcntl` on bypassed FDs all need consideration in the dispatch layer (`seccomp-dispatch.c`).

Considerations

- epoll/poll multiplexing: LKL epoll cannot monitor host socket FDs and vice versa. Bridging both worlds may require intercepting `epoll_ctl` and `epoll_wait` to implement a unified event loop. This is the hardest architectural challenge and may limit applicability to simple connect-send-recv workloads initially.
- Network isolation tradeoff: the guest operates directly in the host network namespace for bypassed sockets. `bind()` binds to host interfaces, `getsockname()` returns host IPs. This must be explicitly opt-in with clear documentation of the security implications.
- Relationship to passt backend and SLIRP Phase 2 (SLIRP: server sockets and pidfd_getfd #12): this is a complementary approach. passt improves the stack-mediated path; socket switching bypasses the stack entirely for eligible connections.
- Address-family scope: start narrow (`AF_INET`/`AF_INET6`, `SOCK_STREAM`) and expand based on demand.
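
The first-version scope (AF_INET/AF_INET6 with SOCK_STREAM only; AF_UNIX, AF_PACKET, and raw sockets excluded) could be expressed as a small predicate that the dispatch layer consults when it sees a `socket()` call. The following is only a sketch: `bypass_eligible` is a hypothetical name, not an existing function in the codebase, and the destination address/port policy check is deliberately left out.

```c
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of the existing code): returns true if a
 * socket(domain, type, ...) call falls inside the proposed first-version
 * bypass scope.
 */
static bool bypass_eligible(int domain, int type)
{
    /* Callers may OR SOCK_NONBLOCK / SOCK_CLOEXEC into the type argument,
     * so strip those flags before comparing. */
    int base_type = type & ~(SOCK_NONBLOCK | SOCK_CLOEXEC);

    /* Only IPv4 and IPv6 qualify; this rejects AF_UNIX, AF_PACKET, etc. */
    if (domain != AF_INET && domain != AF_INET6)
        return false;

    /* Only stream sockets qualify; this rejects SOCK_RAW and SOCK_DGRAM. */
    return base_type == SOCK_STREAM;
}
```

Sockets that fail this check would stay on the existing stack-mediated path, which keeps the bypass strictly additive and makes later scope expansion a one-function change.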