Commit 83e1226
nvme-tcp: fix possible circular locking when deleting a controller under memory pressure
When destroying a queue, when calling sock_release, the network stack
might need to allocate an skb to send a FIN/RST. When that happens
during memory pressure, there is a need to reclaim memory, which
in turn may ask the nvme-tcp device to write out dirty pages, however
this is not possible due to a ctrl teardown that is going on.
Set PF_MEMALLOC to the task that releases the socket to grant access
to PF_MEMALLOC reserves. In addition, do the same for the nvme-tcp
thread as this may also originate from the swap itself and should
be more resilient to memory pressure situations.
This fixes the following lockdep complaint:
--
======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc2+ #25 Tainted: G W
------------------------------------------------------
kswapd0/92 is trying to acquire lock:
ffff888114003240 (sk_lock-AF_INET-NVME){+.+.}-{0:0}, at: tcp_sendpage+0x23/0xa0
but task is already holding lock:
ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x11e/0x160
kmem_cache_alloc_node+0x44/0x530
__alloc_skb+0x158/0x230
tcp_send_active_reset+0x7e/0x730
tcp_disconnect+0x1272/0x1ae0
__tcp_close+0x707/0xd90
tcp_close+0x26/0x80
inet_release+0xfa/0x220
sock_release+0x85/0x1a0
nvme_tcp_free_queue+0x1fd/0x470 [nvme_tcp]
nvme_do_delete_ctrl+0x130/0x13d [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write_iter+0x356/0x530
vfs_write+0x4e8/0xce0
ksys_write+0xfd/0x1d0
do_syscall_64+0x58/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #0 (sk_lock-AF_INET-NVME){+.+.}-{0:0}:
__lock_acquire+0x2a0c/0x5690
lock_acquire+0x18e/0x4f0
lock_sock_nested+0x37/0xc0
tcp_sendpage+0x23/0xa0
inet_sendpage+0xad/0x120
kernel_sendpage+0x156/0x440
nvme_tcp_try_send+0x48a/0x2630 [nvme_tcp]
nvme_tcp_queue_rq+0xefb/0x17e0 [nvme_tcp]
__blk_mq_try_issue_directly+0x452/0x660
blk_mq_plug_issue_direct.constprop.0+0x207/0x700
blk_mq_flush_plug_list+0x6f5/0xc70
__blk_flush_plug+0x264/0x410
blk_finish_plug+0x4b/0xa0
shrink_lruvec+0x1263/0x1ea0
shrink_node+0x736/0x1a80
balance_pgdat+0x740/0x10d0
kswapd+0x5f2/0xaf0
kthread+0x256/0x2f0
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(sk_lock-AF_INET-NVME);
lock(fs_reclaim);
lock(sk_lock-AF_INET-NVME);
*** DEADLOCK ***
3 locks held by kswapd0/92:
#0: ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
#1: ffff88811f21b0b0 (q->srcu){....}-{0:0}, at: blk_mq_flush_plug_list+0x6b3/0xc70
#2: ffff888170b11470 (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0xeb9/0x17e0 [nvme_tcp]
Fixes: 3f2304f ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>1 parent 5fa9add commit 83e1226
1 file changed
+10
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1141 | 1141 | | |
1142 | 1142 | | |
1143 | 1143 | | |
| 1144 | + | |
1144 | 1145 | | |
1145 | 1146 | | |
1146 | 1147 | | |
| |||
1150 | 1151 | | |
1151 | 1152 | | |
1152 | 1153 | | |
| 1154 | + | |
1153 | 1155 | | |
1154 | 1156 | | |
1155 | 1157 | | |
1156 | 1158 | | |
1157 | 1159 | | |
1158 | | - | |
| 1160 | + | |
1159 | 1161 | | |
1160 | 1162 | | |
1161 | 1163 | | |
| |||
1181 | 1183 | | |
1182 | 1184 | | |
1183 | 1185 | | |
| 1186 | + | |
| 1187 | + | |
1184 | 1188 | | |
1185 | 1189 | | |
1186 | 1190 | | |
| |||
1296 | 1300 | | |
1297 | 1301 | | |
1298 | 1302 | | |
| 1303 | + | |
1299 | 1304 | | |
1300 | 1305 | | |
1301 | 1306 | | |
| |||
1308 | 1313 | | |
1309 | 1314 | | |
1310 | 1315 | | |
| 1316 | + | |
| 1317 | + | |
1311 | 1318 | | |
| 1319 | + | |
| 1320 | + | |
1312 | 1321 | | |
1313 | 1322 | | |
1314 | 1323 | | |
| |||
0 commit comments