Commit 2bbea6e
cdrom: do not call check_disk_change() inside cdrom_open()
When mounting an ISO filesystem, the system sometimes (very rarely)
hangs because of a race condition between two tasks.
PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount"
#0 [ffff880078447ae0] __schedule at ffffffff8168d605
#1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
#2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
#3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
#4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
#5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
#6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
#7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
#8 [ffff880078447da8] mount_bdev at ffffffff81202570
#9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246
RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010
RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30
RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010
R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040
R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30
ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
This task was trying to mount the cdrom. It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.
PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd"
#0 [ffff880078417898] __schedule at ffffffff8168d605
#1 [ffff880078417900] schedule at ffffffff8168dc59
#2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
#3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
#4 [ffff8800784179d0] down_read at ffffffff8168cde0
#5 [ffff8800784179e8] get_super at ffffffff81201cc7
#6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
#7 [ffff880078417a40] flush_disk at ffffffff8123a94b
#8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
#9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
#20 [ffff880078417f70] sys_open at ffffffff811fde4e
#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246
RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000
RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70
RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020
R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e
R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
This task tried to open the cdrom device. The sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.
The first task acquires the s_umount lock and then the sr_mutex lock;
the second task acquires the sr_mutex lock and then the s_umount lock.
Each task ends up waiting for the lock the other one holds, so neither
can make progress.
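The interleaving described above can be replayed in a small user-space model. This is only a sketch: `toy_lock`, `toy_try_lock` and `abba_interleaving_deadlocks` are hypothetical names invented for illustration, and a plain owner field stands in for both the s_umount rwsem and the sr_mutex mutex. It is not kernel code.

```c
#include <stdbool.h>

/* Hypothetical task identifiers for the two tasks in the backtraces. */
enum { NOBODY = 0, MOUNT_TASK = 1, UDEV_TASK = 2 };

/* Toy lock: just records which task owns it (assumption: no real blocking). */
struct toy_lock {
    int owner;
};

/* Returns true if the lock was free and is now owned by `task`. */
static bool toy_try_lock(struct toy_lock *l, int task)
{
    if (l->owner != NOBODY)
        return false;
    l->owner = task;
    return true;
}

/*
 * Replays the reported interleaving. Returns true when each task is
 * stuck waiting for the lock held by the other one.
 */
bool abba_interleaving_deadlocks(void)
{
    struct toy_lock s_umount = { NOBODY };
    struct toy_lock sr_mutex = { NOBODY };

    /* mount: owns s_umount while setting up the super_block. */
    bool mount_got_s_umount = toy_try_lock(&s_umount, MOUNT_TASK);

    /* systemd-udevd: sr_block_open takes sr_mutex. */
    bool udev_got_sr_mutex = toy_try_lock(&sr_mutex, UDEV_TASK);

    /* mount: sr_block_ioctl now needs sr_mutex, which udev holds. */
    bool mount_got_sr_mutex = toy_try_lock(&sr_mutex, MOUNT_TASK);

    /* systemd-udevd: get_super now needs s_umount, which mount holds. */
    bool udev_got_s_umount = toy_try_lock(&s_umount, UDEV_TASK);

    return mount_got_s_umount && udev_got_sr_mutex &&
           !mount_got_sr_mutex && !udev_got_s_umount;
}
```

In this model, `abba_interleaving_deadlocks()` returns true: both second acquisitions fail, and since neither task releases what it already holds, the real tasks would sleep forever.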
This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and letting the callers take care of it.
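A minimal sketch of why moving the call helps, under one stated assumption: the caller performs the media-change check before it takes sr_mutex (the commit message does not spell out the exact placement). The names `model_lock`, `model_try_lock`, `model_unlock` and `reordered_open_completes` are hypothetical, and the owner field is a stand-in for the real kernel locks.

```c
#include <stdbool.h>

/* Hypothetical task identifiers, for illustration only. */
enum { FREE = 0, TASK_MOUNT = 1, TASK_UDEV = 2 };

/* Toy lock that only records its owner (assumption: no real blocking). */
struct model_lock {
    int owner;
};

static bool model_try_lock(struct model_lock *l, int task)
{
    if (l->owner != FREE)
        return false;
    l->owner = task;
    return true;
}

static void model_unlock(struct model_lock *l)
{
    l->owner = FREE;
}

/*
 * Replays the same race with the media-change check done before
 * sr_mutex is taken (assumed ordering). Returns true when both tasks
 * run to completion.
 */
bool reordered_open_completes(void)
{
    struct model_lock s_umount = { FREE };
    struct model_lock sr_mutex = { FREE };

    /* mount: owns s_umount while setting up the super_block. */
    if (!model_try_lock(&s_umount, TASK_MOUNT))
        return false;

    /* udev: the flush needs s_umount; it waits, but holds no other lock. */
    bool udev_had_to_wait = !model_try_lock(&s_umount, TASK_UDEV);

    /* mount: sr_mutex is free, so the ioctl runs and mount finishes. */
    if (!model_try_lock(&sr_mutex, TASK_MOUNT))
        return false;
    model_unlock(&sr_mutex);
    model_unlock(&s_umount);

    /* udev: retries the flush, releases s_umount, then opens the device. */
    if (!model_try_lock(&s_umount, TASK_UDEV))
        return false;
    model_unlock(&s_umount);
    if (!model_try_lock(&sr_mutex, TASK_UDEV))
        return false;
    model_unlock(&sr_mutex);

    return udev_had_to_wait;
}
```

Here the open path still has to wait for s_umount, but it waits without holding sr_mutex, so the mount task is never blocked and the cycle cannot form.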
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1 parent da5ff37 commit 2bbea6e
5 files changed, 9 insertions(+), 3 deletions(-)