Commit d4b450c
Btrfs: fix race between transaction commit and empty block group removal
Committing a transaction can race with automatic removal of empty block
groups (cleaner kthread), leading to a BUG_ON() in the transaction
commit code while running btrfs_finish_extent_commit(). The following
sequence diagram shows how it can happen:
CPU 1 CPU 2
btrfs_commit_transaction()
fs_info->running_transaction = NULL
btrfs_finish_extent_commit()
find_first_extent_bit()
-> found range for block group X
in fs_info->freed_extents[]
btrfs_delete_unused_bgs()
-> found block group X
Removed block group X's range
from fs_info->freed_extents[]
btrfs_remove_chunk()
btrfs_remove_block_group(bg X)
unpin_extent_range(bg X range)
btrfs_lookup_block_group(bg X)
-> returns NULL
-> BUG_ON()
The trace that results from the BUG_ON() is:
[48665.187808] ------------[ cut here ]------------
[48665.188032] kernel BUG at fs/btrfs/extent-tree.c:5675!
[48665.188032] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[48665.188032] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc evdev microcode
[48665.197388] CPU: 2 PID: 31211 Comm: kworker/u32:16 Tainted: G W 3.19.0-rc5-btrfs-next-4+ #1
[48665.197388] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[48665.197388] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[48665.197388] task: ffff880222011810 ti: ffff8801b56a4000 task.ti: ffff8801b56a4000
[48665.197388] RIP: 0010:[<ffffffffa0350d05>] [<ffffffffa0350d05>] unpin_extent_range+0x6a/0x1ba [btrfs]
[48665.197388] RSP: 0018:ffff8801b56a7b88 EFLAGS: 00010246
[48665.197388] RAX: 0000000000000000 RBX: ffff8802143a6000 RCX: ffff8802220120c8
[48665.197388] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8800a3c140b0
[48665.197388] RBP: ffff8801b56a7bd8 R08: 0000000000000003 R09: 0000000000000000
[48665.197388] R10: 0000000000000000 R11: 000000000000bbac R12: 0000000012e8e000
[48665.197388] R13: ffff8800a3c14000 R14: 0000000000000000 R15: 0000000000000000
[48665.197388] FS: 0000000000000000(0000) GS:ffff88023ec40000(0000) knlGS:0000000000000000
[48665.197388] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[48665.197388] CR2: 00007f065e42f270 CR3: 0000000206f70000 CR4: 00000000000006e0
[48665.197388] Stack:
[48665.197388] ffff8801b56a7bd8 0000000012ea0000 01ff8800a3c14138 0000000012e9ffff
[48665.197388] ffff880141df3dd8 ffff8802143a6000 ffff8800a3c14138 ffff880141df3df0
[48665.197388] ffff880141df3dd8 0000000000000000 ffff8801b56a7c08 ffffffffa0354227
[48665.197388] Call Trace:
[48665.197388] [<ffffffffa0354227>] btrfs_finish_extent_commit+0xb0/0xd9 [btrfs]
[48665.197388] [<ffffffffa0366b4b>] btrfs_commit_transaction+0x791/0x92c [btrfs]
[48665.197388] [<ffffffffa0352432>] flush_space+0x43d/0x452 [btrfs]
[48665.197388] [<ffffffff814295c3>] ? _raw_spin_unlock+0x28/0x33
[48665.197388] [<ffffffffa035255f>] btrfs_async_reclaim_metadata_space+0x118/0x164 [btrfs]
[48665.197388] [<ffffffff81059917>] ? process_one_work+0x14b/0x3ab
[48665.197388] [<ffffffff810599ac>] process_one_work+0x1e0/0x3ab
[48665.197388] [<ffffffff81079fa9>] ? trace_hardirqs_off+0xd/0xf
[48665.197388] [<ffffffff8105a55b>] worker_thread+0x210/0x2d0
[48665.197388] [<ffffffff8105a34b>] ? rescuer_thread+0x2c3/0x2c3
[48665.197388] [<ffffffff8105e5c0>] kthread+0xef/0xf7
[48665.197388] [<ffffffff81429682>] ? _raw_spin_unlock_irq+0x2d/0x39
[48665.197388] [<ffffffff8105e4d1>] ? __kthread_parkme+0xad/0xad
[48665.197388] [<ffffffff81429dec>] ret_from_fork+0x7c/0xb0
[48665.197388] [<ffffffff8105e4d1>] ? __kthread_parkme+0xad/0xad
[48665.197388] Code: 85 f6 74 14 49 8b 06 49 03 46 09 49 39 c4 72 1d 4c 89 f7 e8 83 ec ff ff 4c 89 e6 4c 89 ef e8 1e f1 ff ff 48 85 c0 49 89 c6 75 02 <0f> 0b 49 8b 1e 49 03 5e 09 48 8b
[48665.197388] RIP [<ffffffffa0350d05>] unpin_extent_range+0x6a/0x1ba [btrfs]
[48665.197388] RSP <ffff8801b56a7b88>
[48665.272246] ---[ end trace b9c6ab9957521376 ]---
Fix this by ensuring that unpining the block group's range in
btrfs_finish_extent_commit() is done in a synchronized fashion
with removing the block group's range from freed_extents[]
in btrfs_delete_unused_bgs()
This race got introduced with the change:
Btrfs: remove empty block groups automatically
commit 47ab2a6
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>1 parent e3540ea commit d4b450c
3 files changed
+22
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1744 | 1744 | | |
1745 | 1745 | | |
1746 | 1746 | | |
| 1747 | + | |
1747 | 1748 | | |
1748 | 1749 | | |
1749 | 1750 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2242 | 2242 | | |
2243 | 2243 | | |
2244 | 2244 | | |
| 2245 | + | |
2245 | 2246 | | |
2246 | 2247 | | |
2247 | 2248 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5735 | 5735 | | |
5736 | 5736 | | |
5737 | 5737 | | |
| 5738 | + | |
5738 | 5739 | | |
5739 | 5740 | | |
5740 | | - | |
| 5741 | + | |
| 5742 | + | |
5741 | 5743 | | |
| 5744 | + | |
5742 | 5745 | | |
5743 | 5746 | | |
5744 | 5747 | | |
5745 | 5748 | | |
5746 | 5749 | | |
5747 | 5750 | | |
5748 | 5751 | | |
| 5752 | + | |
5749 | 5753 | | |
5750 | 5754 | | |
5751 | 5755 | | |
| |||
9561 | 9565 | | |
9562 | 9566 | | |
9563 | 9567 | | |
| 9568 | + | |
| 9569 | + | |
| 9570 | + | |
| 9571 | + | |
| 9572 | + | |
| 9573 | + | |
| 9574 | + | |
| 9575 | + | |
| 9576 | + | |
| 9577 | + | |
| 9578 | + | |
| 9579 | + | |
9564 | 9580 | | |
9565 | 9581 | | |
9566 | 9582 | | |
| 9583 | + | |
9567 | 9584 | | |
9568 | 9585 | | |
9569 | 9586 | | |
9570 | 9587 | | |
9571 | 9588 | | |
9572 | 9589 | | |
| 9590 | + | |
9573 | 9591 | | |
9574 | 9592 | | |
9575 | 9593 | | |
| 9594 | + | |
9576 | 9595 | | |
9577 | 9596 | | |
9578 | 9597 | | |
| |||
0 commit comments