Commit 55f0bfc
af_packet: fix soft lockup issue caused by tpacket_snd()
When MSG_DONTWAIT is not set, the tpacket_snd operation will wait for
pending_refcnt to decrement to zero before returning. The pending_refcnt
is decremented by 1 when the skb->destructor function is called,
indicating that the skb has been successfully sent and needs to be
destroyed.
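The accounting described above can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel code: `struct ring`, `struct packet`, `queue_packet()` and `free_packet()` are hypothetical names, and a single plain counter stands in for pending_refcnt. Only the idea of a counter that is decremented from a destructor callback is taken from the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for the TX ring; `pending` models pending_refcnt. */
struct ring {
    unsigned int pending;
};

/* Hypothetical stand-in for an skb with a destructor callback. */
struct packet {
    struct ring *owner;
    void (*destructor)(struct packet *);
};

/* Runs when the packet is released after transmission: drop one reference. */
static void tx_destructor(struct packet *p)
{
    assert(p->owner->pending > 0);
    p->owner->pending--;
}

/* Hand a packet to the (modelled) device: one more completion outstanding. */
static void queue_packet(struct ring *r, struct packet *p)
{
    p->owner = r;
    p->destructor = tx_destructor;
    r->pending++;
}

/* Release a packet; the destructor is the only place `pending` goes down. */
static void free_packet(struct packet *p)
{
    if (p->destructor)
        p->destructor(p);
}
```

In this model a blocking sender may only return once `pending` is back at zero, so it depends entirely on `free_packet()` being able to run somewhere.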
If an error occurs during this process, tpacket_snd() will exit and return
an error, but pending_refcnt may not yet have decremented to zero. Suppose
the next send operation is executed immediately, there are no frames
available to send in tx_ring (i.e., packet_current_frame() returns NULL),
and skb is also NULL. The function will then not execute
wait_for_completion_interruptible_timeout() to yield the CPU. Instead, it
will enter a do-while loop, waiting for pending_refcnt to be zero. Even
if the previous skb has completed transmission, the skb->destructor
function can only be invoked in the ksoftirqd thread (assuming NAPI
threading is enabled). When the ksoftirqd thread and the tpacket_snd
operation happen to run on the same CPU, and that CPU is trapped in the
do-while loop without yielding, the ksoftirqd thread will not get
scheduled to run. As a result, pending_refcnt will never be reduced to
zero, and the do-while loop cannot exit, eventually leading to a CPU soft
lockup issue.
In fact, skb is non-NULL for all but the first iteration of that loop, and
as long as pending_refcnt is not zero, even if it was incremented by a
previous call, wait_for_completion_interruptible_timeout() should be
executed to yield the CPU, allowing the ksoftirqd thread to be scheduled.
Therefore, the execution condition of this function should be modified to
check whether pending_refcnt is not zero, instead of checking skb.
- if (need_wait && skb) {
+ if (need_wait && packet_read_pending(&po->tx_ring)) {
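The effect of the one-line change above can be illustrated with a user-space model. It is a sketch under assumptions, not kernel code: `struct tx_model`, `should_wait_old()`, `should_wait_new()`, `drain()` and `SPIN_LIMIT` are hypothetical names, waiting is modelled as "one destructor gets to run", and hitting the iteration cap stands in for the soft lockup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state tpacket_snd() sees when no frame is ready. */
struct tx_model {
    unsigned int pending;  /* stands in for the summed pending_refcnt */
    bool need_wait;        /* true when MSG_DONTWAIT is not set */
    bool have_skb;         /* true once this call has built an skb */
};

/* Old condition: only yields if this call has already built an skb. */
static bool should_wait_old(const struct tx_model *m)
{
    return m->need_wait && m->have_skb;
}

/* New condition: yields whenever completions are still outstanding. */
static bool should_wait_new(const struct tx_model *m)
{
    return m->need_wait && m->pending != 0;
}

#define SPIN_LIMIT 1000  /* arbitrary cap used to detect a busy loop */

/*
 * Model the "no frame available" path. Waiting is the only point at which
 * the completion side can run and lower `pending`. Returns the number of
 * loop iterations used, or -1 if SPIN_LIMIT was exceeded without draining.
 */
static int drain(struct tx_model m, bool use_new_condition)
{
    int iterations = 0;

    /* The cap is only meaningful if it exceeds the outstanding work. */
    assert(m.pending <= SPIN_LIMIT);

    while (m.need_wait && m.pending != 0) {
        if (++iterations > SPIN_LIMIT)
            return -1;  /* spun without ever yielding */

        bool wait = use_new_condition ? should_wait_new(&m)
                                      : should_wait_old(&m);
        if (wait)
            m.pending--;  /* yielding lets one destructor run */
        /* otherwise the loop spins and `pending` never changes */
    }
    return iterations;
}
```

With `pending = 2`, `need_wait = true` and `have_skb = false`, which is the state left behind by a previous failed call, `drain()` returns -1 under the old condition and 2 under the new one.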
With this change, the wait condition duplicates the condition at the end of
the do-while loop, and packet_read_pending() is a very expensive function.
In fact, the loop can only exit when ph is NULL, so the loop condition can
be changed to while (1), and in the "ph == NULL" branch the loop can break
directly when the wait condition is not met. The loop logic stays the same
as the original but is clearer and more obvious.
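The restructured loop can be sketched in user space as follows. This is a model under assumptions rather than the patched function: `struct ring_model`, `read_pending()`, `complete_one()`, `send_loop()` and `NCPUS` are hypothetical, and the per-CPU array reflects my understanding that packet_read_pending() sums a counter across CPUs, which would explain why the commit message calls it expensive.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4  /* assumed CPU count for the model */

struct ring_model {
    unsigned int per_cpu_pending[NCPUS];  /* models per-CPU pending counts */
    unsigned int frames_ready;            /* frames waiting to be sent */
    unsigned int read_pending_calls;      /* how often the slow sum ran */
};

/* Slow path: walk every CPU slot to get the total outstanding count. */
static unsigned int read_pending(struct ring_model *r)
{
    unsigned int total = 0;

    r->read_pending_calls++;
    for (int cpu = 0; cpu < NCPUS; cpu++)
        total += r->per_cpu_pending[cpu];
    return total;
}

/* Models a destructor running while the sender waits. */
static void complete_one(struct ring_model *r)
{
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (r->per_cpu_pending[cpu] > 0) {
            r->per_cpu_pending[cpu]--;
            return;
        }
    }
    assert(!"complete_one() called with nothing pending");
}

/* Loop shape after the change: while (1) with a single exit point. */
static unsigned int send_loop(struct ring_model *r, bool need_wait)
{
    unsigned int sent = 0;

    while (1) {
        if (r->frames_ready == 0) {          /* models ph == NULL */
            if (need_wait && read_pending(r)) {
                complete_one(r);             /* models the wait */
                continue;                    /* check for more frames */
            }
            break;                           /* the only way out */
        }
        /* Fast path: send a frame without touching read_pending(). */
        r->frames_ready--;
        r->per_cpu_pending[sent % NCPUS]++;
        sent++;
    }
    return sent;
}
```

In this model the slow sum is evaluated only when no frame is available, once per such iteration, instead of appearing both inside the branch and in the loop condition.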
Fixes: 89ed5b5 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
Cc: stable@kernel.org
Suggested-by: LongJun Tang <tanglongjun@kylinos.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent c1ba3c0 commit 55f0bfc
1 file changed, +11 -12 lines changed
Hunks: original lines 2846-2860 (new 2846-2866) and original lines 2943-2956 (new 2949-2955)