System information
| Type | Version/Name |
|---|---|
| Distribution Name | Red Hat Enterprise Linux release 8.6 (Ootpa) |
| Distribution Version | 8.6 |
| Kernel Version | 4.18.0-372.26.1.el8_6.x86_64 |
| Architecture | x86_64 |
| OpenZFS Version | zfs-2.1.6 |
Describe the problem you're observing
When 2 vdevs fail simultaneously on an empty draid2 zpool with 2 dspares, zpool status frequently (in 2 out of 3 attempts) reports permanent metadata errors and checksum errors across the vdevs.
The issue occurs less frequently as the pool fills up.
When the first drive failure is detected, a rebuild to the first dspare starts.
When the second drive failure is detected, the vdev_rebuild_reset_wanted flag is set. This happens because the existing rebuild thread has already completed (vdev_rebuild_thread is NULL), but vdev_rebuild_complete_sync has not yet cleared vdev_rebuilding. The vdev_rebuild_reset_wanted request is therefore raised but never handled.
As a result, although the 2nd dspare is attached, no rebuild ever runs for the 2nd faulted drive. This is the state shown in the zpool status output.
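The ordering described above can be sketched as a toy model. This is not OpenZFS code: the shell variables only borrow the names of the kernel fields mentioned in this report, and the helper functions (request_rebuild, rebuild_thread_exit, rebuild_complete_sync) are hypothetical stand-ins that assume the behaviour is exactly as described here.

```shell
#!/usr/bin/env bash
# Toy model of the suspected race. Assumption: the rebuild logic behaves as
# described in this report. Nothing here calls into ZFS.

vdev_rebuilding=0            # 1 while a rebuild is still marked active
vdev_rebuild_thread=""       # empty string stands in for NULL
vdev_rebuild_reset_wanted=0  # request for the running thread to restart
rebuilds_started=0           # how many rebuild passes actually began

# Hypothetical stand-in for "a drive failure asks for a rebuild".
request_rebuild() {
    if [ "$vdev_rebuilding" -eq 1 ]; then
        # A rebuild is still marked active, so only a reset is requested.
        vdev_rebuild_reset_wanted=1
        return 0
    fi
    vdev_rebuilding=1
    vdev_rebuild_thread="running"
    rebuilds_started=$((rebuilds_started + 1))
}

# The rebuild thread finishes; vdev_rebuilding is cleared later, in sync context.
rebuild_thread_exit() { vdev_rebuild_thread=""; }

# Completion clears vdev_rebuilding; in this model nothing looks at the reset flag.
rebuild_complete_sync() { vdev_rebuilding=0; }

request_rebuild         # first drive failure: rebuild to the first dspare starts
rebuild_thread_exit     # thread is done, vdev_rebuilding is still set
request_rebuild         # second drive failure arrives inside that window
rebuild_complete_sync   # state is cleared, the reset request is left behind

echo "rebuilds_started=$rebuilds_started reset_wanted=$vdev_rebuild_reset_wanted"
```

In this model only one rebuild pass starts and the reset request stays set with no thread left to act on it, which matches the symptom of the second dspare being attached but never rebuilt.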
Describe how to reproduce the problem
mkdir -p ~/disks && truncate -s 1G ~/disks/d{1..53}
zpool create -f -o cachefile=none -o failmode=panic -O canmount=off tank draid2:11d:53c:2s ~/disks/d{1..53}
zfs create -o mountpoint=/mnt/data tank/ds
zpool offline -f tank ~/disks/d1 & zpool offline -f tank ~/disks/d3
zpool status -v
Include any warning/errors/backtraces from the system logs
[root@localhost zfs]# zpool status -v
pool: tank
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
scan: scrub repaired 0B in 00:00:01 with 21 errors on Mon Oct 17 01:50:13 2022
scan: resilvered (draid2:11d:53c:2s-0) 35K in 00:00:00 with 0 errors on Mon Oct 17 01:50:12 2022
config:
NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
draid2:11d:53c:2s-0 DEGRADED 0 0 0
spare-0 DEGRADED 0 0 28
/root/disks/d1 FAULTED 0 0 0 external device fault
draid2-0-1 ONLINE 0 0 0
/root/disks/d2 ONLINE 0 0 0
spare-2 DEGRADED 0 0 20
/root/disks/d3 FAULTED 0 0 0 external device fault
draid2-0-0 ONLINE 0 0 0
/root/disks/d4 ONLINE 0 0 16
/root/disks/d5 ONLINE 0 0 28
/root/disks/d6 ONLINE 0 0 20
/root/disks/d7 ONLINE 0 0 20
/root/disks/d8 ONLINE 0 0 20
/root/disks/d9 ONLINE 0 0 28
/root/disks/d10 ONLINE 0 0 16
/root/disks/d11 ONLINE 0 0 16
/root/disks/d12 ONLINE 0 0 20
/root/disks/d13 ONLINE 0 0 28
/root/disks/d14 ONLINE 0 0 28
/root/disks/d15 ONLINE 0 0 20
/root/disks/d16 ONLINE 0 0 28
/root/disks/d17 ONLINE 0 0 20
/root/disks/d18 ONLINE 0 0 4
/root/disks/d19 ONLINE 0 0 24
/root/disks/d20 ONLINE 0 0 16
/root/disks/d21 ONLINE 0 0 20
/root/disks/d22 ONLINE 0 0 12
/root/disks/d23 ONLINE 0 0 20
/root/disks/d24 ONLINE 0 0 20
/root/disks/d25 ONLINE 0 0 24
/root/disks/d26 ONLINE 0 0 16
/root/disks/d27 ONLINE 0 0 16
/root/disks/d28 ONLINE 0 0 20
/root/disks/d29 ONLINE 0 0 32
/root/disks/d30 ONLINE 0 0 20
/root/disks/d31 ONLINE 0 0 20
/root/disks/d32 ONLINE 0 0 28
/root/disks/d33 ONLINE 0 0 28
/root/disks/d34 ONLINE 0 0 16
/root/disks/d35 ONLINE 0 0 32
/root/disks/d36 ONLINE 0 0 28
/root/disks/d37 ONLINE 0 0 0
/root/disks/d38 ONLINE 0 0 16
/root/disks/d39 ONLINE 0 0 16
/root/disks/d40 ONLINE 0 0 28
/root/disks/d41 ONLINE 0 0 28
/root/disks/d42 ONLINE 0 0 20
/root/disks/d43 ONLINE 0 0 20
/root/disks/d44 ONLINE 0 0 20
/root/disks/d45 ONLINE 0 0 40
/root/disks/d46 ONLINE 0 0 28
/root/disks/d47 ONLINE 0 0 20
/root/disks/d48 ONLINE 0 0 16
/root/disks/d49 ONLINE 0 0 12
/root/disks/d50 ONLINE 0 0 12
/root/disks/d51 ONLINE 0 0 20
/root/disks/d52 ONLINE 0 0 20
/root/disks/d53 ONLINE 0 0 24
spares
draid2-0-0 INUSE currently in use
draid2-0-1 INUSE currently in use
errors: Permanent errors have been detected in the following files:
<metadata>:<0x0>
<metadata>:<0x3d>
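When repeating the reproduction, it may help to reduce each zpool status listing to a single line so that runs can be compared. The filter below is a rough sketch: cksum_summary is a hypothetical helper name, and it assumes device rows have the NAME STATE READ WRITE CKSUM layout shown above with plain integer counters (abbreviated counters such as 1.2K would be skipped).

```shell
#!/usr/bin/env bash
# Summarise the CKSUM column of `zpool status` text read from stdin.
# Assumption: device rows look like "NAME STATE READ WRITE CKSUM [note...]".
cksum_summary() {
    awk '
        # Match only rows whose second field is a device state and whose
        # three counters are plain integers; header and spare rows are skipped.
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ {
            rows++
            if ($5 > 0) { bad++; total += $5 }
        }
        END {
            printf "rows=%d with_cksum_errors=%d cksum_total=%d\n", rows, bad, total
        }
    '
}
```

A possible use is `zpool status -v tank | cksum_summary` after each attempt. Note that the row count includes the pool and top-level vdev rows, not only leaf devices.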