Commit 9e22846

pcd1193182 (Paul Dagnelie) authored and committed
Only interrupt active disk I/Os in failmode=continue
failmode=continue is in a sorry state. Originally designed to fix a very specific problem, it causes crashes and panics for most people who end up trying to use it. At this point, we should either remove it entirely, or try to make it more usable.

With this patch, I choose the latter. While the feature is fundamentally unpredictable and prone to race conditions, it should be possible to get it to the point where it can at least sometimes be useful for some users.

This patch fixes one of the major issues with failmode=continue: it interrupts even ZIOs that are patiently waiting in line behind stuck IOs.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17372
1 parent 5bbf200 commit 9e22846

1 file changed: +9 -5 lines changed

module/zfs/zio.c

Lines changed: 9 additions & 5 deletions
@@ -2302,12 +2302,12 @@ zio_deadman_impl(zio_t *pio, int ziodepth)
 	zio_t *cio, *cio_next;
 	zio_link_t *zl = NULL;
 	vdev_t *vd = pio->io_vd;
+	uint64_t failmode = spa_get_deadman_failmode(pio->io_spa);

 	if (zio_deadman_log_all || (vd != NULL && vd->vdev_ops->vdev_op_leaf)) {
 		vdev_queue_t *vq = vd ? &vd->vdev_queue : NULL;
 		zbookmark_phys_t *zb = &pio->io_bookmark;
 		uint64_t delta = gethrtime() - pio->io_timestamp;
-		uint64_t failmode = spa_get_deadman_failmode(pio->io_spa);

 		zfs_dbgmsg("slow zio[%d]: zio=%px timestamp=%llu "
 		    "delta=%llu queued=%llu io=%llu "
@@ -2331,11 +2331,15 @@ zio_deadman_impl(zio_t *pio, int ziodepth)
 		    pio->io_error);
 		(void) zfs_ereport_post(FM_EREPORT_ZFS_DEADMAN,
 		    pio->io_spa, vd, zb, pio, 0);
+	}

-		if (failmode == ZIO_FAILURE_MODE_CONTINUE &&
-		    taskq_empty_ent(&pio->io_tqent)) {
-			zio_interrupt(pio);
-		}
+	if (vd != NULL && vd->vdev_ops->vdev_op_leaf &&
+	    list_is_empty(&pio->io_child_list) &&
+	    failmode == ZIO_FAILURE_MODE_CONTINUE &&
+	    taskq_empty_ent(&pio->io_tqent) &&
+	    pio->io_queue_state == ZIO_QS_ACTIVE) {
+		pio->io_error = EINTR;
+		zio_interrupt(pio);
 	}

 	mutex_enter(&pio->io_lock);
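
To make the behavioral change easier to see, here is a minimal userspace sketch of the interrupt decision before and after this patch. It is not OpenZFS code: `struct model_zio`, the `MODEL_*` enums, and the `model_*` functions are hypothetical stand-ins invented for illustration, and each field only mirrors one clause of the conditions in the diff above. The real patch additionally sets `pio->io_error = EINTR` before calling `zio_interrupt()`, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel enums; not the OpenZFS definitions. */
enum model_failmode { MODEL_FAILMODE_WAIT, MODEL_FAILMODE_CONTINUE,
	MODEL_FAILMODE_PANIC };
enum model_queue_state { MODEL_QS_NONE, MODEL_QS_QUEUED, MODEL_QS_ACTIVE };

/* Each field mirrors one clause of the conditions in the diff. */
struct model_zio {
	bool on_leaf_vdev;	/* vd != NULL && vd->vdev_ops->vdev_op_leaf */
	size_t nchildren;	/* 0 models list_is_empty(&pio->io_child_list) */
	bool tqent_empty;	/* taskq_empty_ent(&pio->io_tqent) */
	enum model_queue_state queue_state;	/* pio->io_queue_state */
};

/* Pre-patch: the check sat inside the logging block and ignored queue state. */
bool
model_old_should_interrupt(const struct model_zio *zio,
    enum model_failmode failmode, bool log_all)
{
	return ((log_all || zio->on_leaf_vdev) &&
	    failmode == MODEL_FAILMODE_CONTINUE && zio->tqent_empty);
}

/* Post-patch: only childless leaf-vdev ZIOs that are active on the disk. */
bool
model_new_should_interrupt(const struct model_zio *zio,
    enum model_failmode failmode)
{
	return (zio->on_leaf_vdev && zio->nchildren == 0 &&
	    failmode == MODEL_FAILMODE_CONTINUE && zio->tqent_empty &&
	    zio->queue_state == MODEL_QS_ACTIVE);
}

/* Usage example: a stuck active I/O versus one still waiting in the queue. */
void
model_demo(void)
{
	struct model_zio stuck = { true, 0, true, MODEL_QS_ACTIVE };
	struct model_zio waiting = { true, 0, true, MODEL_QS_QUEUED };

	/* Both versions interrupt the I/O that is actually issued to disk. */
	assert(model_old_should_interrupt(&stuck, MODEL_FAILMODE_CONTINUE, false));
	assert(model_new_should_interrupt(&stuck, MODEL_FAILMODE_CONTINUE));

	/* Only the old version interrupts the I/O waiting behind it. */
	assert(model_old_should_interrupt(&waiting, MODEL_FAILMODE_CONTINUE, false));
	assert(!model_new_should_interrupt(&waiting, MODEL_FAILMODE_CONTINUE));

	/* Neither version interrupts anything outside failmode=continue. */
	assert(!model_new_should_interrupt(&stuck, MODEL_FAILMODE_WAIT));
}
```

In this model the queued ZIO is the case the commit message describes: it was interrupted before the patch simply because it sat on a leaf vdev, and it is left alone afterwards because its queue state is not active.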
