
Commit 5b2c554

jrfastab authored and borkmann committed
bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser()
Applications can be slightly confused because we do not always return the error code they expect, i.e. the one the TCP stack normally returns. For example, when a socket error is pending in sk->sk_err, we return EAGAIN instead of the sock_error. This usually means the application will 'try again' instead of aborting immediately. Another example: when a shutdown event is received, we should abort immediately instead of waiting for data when the user provides a timeout.

These cases tend not to be fatal and applications usually recover, but they introduce bogus errors to the user or add unexpected latency. Before 'c5d2177a72a16' we fell back to the TCP stack when no data was available, so we managed to catch many of the cases here, although with the extra latency cost of calling tcp_msg_wait_data() first.

To fix this, let's duplicate the error handling in the TCP stack into tcp_bpf so that we get the same error codes.

These were found in our CI tests that run applications against sockmap and do longer-lived testing, at least compared to test_sockmap, which does short-lived ping/pong tests, and in some of the test clusters we deploy.

It's non-trivial to do this in the shorter-form CI tests that would be appropriate for BPF selftests, but we are looking into it so we can ensure this keeps working going forward. As a preview, one idea is to pull in the packetdrill testing, which catches some of this.

Fixes: c5d2177 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220104205918.286416-1-john.fastabend@gmail.com
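To illustrate why the exact return code matters to user space, here is a minimal sketch of how a typical caller might classify the result of recv(). It is not part of this commit; the enum and the function name classify_recv are hypothetical. A caller written this way keeps retrying on EAGAIN, which is why returning EAGAIN for a pending socket error delays the abort.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical classification of a recv() result; names are illustrative. */
enum recv_action {
	RECV_ACTION_DATA,	/* n > 0: payload was copied */
	RECV_ACTION_EOF,	/* n == 0: orderly shutdown, stop reading */
	RECV_ACTION_RETRY,	/* transient: EAGAIN, EWOULDBLOCK or EINTR */
	RECV_ACTION_ABORT,	/* anything else, e.g. ECONNRESET or ENOTCONN */
};

/* n is the return value of recv(); err is errno sampled right after it. */
enum recv_action classify_recv(ssize_t n, int err)
{
	if (n > 0)
		return RECV_ACTION_DATA;
	if (n == 0)
		return RECV_ACTION_EOF;
	if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
		return RECV_ACTION_RETRY;
	return RECV_ACTION_ABORT;
}
```

With this kind of caller, a reset connection reported as ECONNRESET ends the loop at once, while the same condition reported as EAGAIN is retried.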
1 parent e4a41c2 commit 5b2c554


1 file changed

+27
-0
lines changed


net/ipv4/tcp_bpf.c

Lines changed: 27 additions & 0 deletions
@@ -196,12 +196,39 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
 		long timeo;
 		int data;
 
+		if (sock_flag(sk, SOCK_DONE))
+			goto out;
+
+		if (sk->sk_err) {
+			copied = sock_error(sk);
+			goto out;
+		}
+
+		if (sk->sk_shutdown & RCV_SHUTDOWN)
+			goto out;
+
+		if (sk->sk_state == TCP_CLOSE) {
+			copied = -ENOTCONN;
+			goto out;
+		}
+
 		timeo = sock_rcvtimeo(sk, nonblock);
+		if (!timeo) {
+			copied = -EAGAIN;
+			goto out;
+		}
+
+		if (signal_pending(current)) {
+			copied = sock_intr_errno(timeo);
+			goto out;
+		}
+
 		data = tcp_msg_wait_data(sk, psock, timeo);
 		if (data && !sk_psock_queue_empty(psock))
 			goto msg_bytes_ready;
 		copied = -EAGAIN;
 	}
+out:
 	release_sock(sk);
 	sk_psock_put(sk, psock);
 	return copied;
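
The order of the new checks can be modelled in user space. The following is only a sketch under stated assumptions, not kernel code: struct sock_model, early_exit and MODEL_ERESTARTSYS are hypothetical names, the socket state is reduced to plain fields, and the value returned by the paths that jump to out without assigning copied is assumed to be 0 (nothing was copied when this branch is reached). The signal case assumes sock_intr_errno() yields -ERESTARTSYS for an infinite timeout and -EINTR otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: kernel-internal ERESTARTSYS value, not exported to user space. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical stand-in for the socket state the patch inspects. */
struct sock_model {
	bool done;		/* models sock_flag(sk, SOCK_DONE) */
	int pending_err;	/* models sk->sk_err (positive errno, 0 if none) */
	bool rcv_shutdown;	/* models sk->sk_shutdown & RCV_SHUTDOWN */
	bool closed;		/* models sk->sk_state == TCP_CLOSE */
	long timeo;		/* models sock_rcvtimeo(); 0 means non-blocking */
	bool signal_pending;	/* models signal_pending(current) */
};

/*
 * Returns true and stores the early-exit value in *ret when one of the
 * checks fires; returns false when the caller would go on to wait for data.
 */
bool early_exit(const struct sock_model *s, int *ret)
{
	assert(s != NULL && ret != NULL);
	*ret = 0;	/* assumed value of 'copied' on entry to this branch */

	if (s->done)
		return true;
	if (s->pending_err) {
		*ret = -s->pending_err;	/* sock_error() reports -sk_err */
		return true;
	}
	if (s->rcv_shutdown)
		return true;
	if (s->closed) {
		*ret = -ENOTCONN;
		return true;
	}
	if (s->timeo == 0) {
		*ret = -EAGAIN;	/* non-blocking and nothing queued */
		return true;
	}
	if (s->signal_pending) {
		/* LONG_MAX stands in for an infinite timeout here. */
		*ret = (s->timeo == LONG_MAX) ? -MODEL_ERESTARTSYS : -EINTR;
		return true;
	}
	return false;
}
```

In this model a pending error wins over a receive shutdown, and both are reported before the timeout is even consulted, which mirrors the ordering in the diff above.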
