Commit cd6f79d
Dave Chinner authored and Darrick J. Wong committed
xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks

Brian reported a null pointer dereference failure during unmount in
xfs/006. He tracked the problem down to the AIL being torn down
before a log shutdown had completed and removed all the items from
the AIL. The failure occurred in this path while unmount was
proceeding in another task:

 xfs_trans_ail_delete+0x102/0x130 [xfs]
 xfs_buf_item_done+0x22/0x30 [xfs]
 xfs_buf_ioend+0x73/0x4d0 [xfs]
 xfs_trans_committed_bulk+0x17e/0x2f0 [xfs]
 xlog_cil_committed+0x2a9/0x300 [xfs]
 xlog_cil_process_committed+0x69/0x80 [xfs]
 xlog_state_shutdown_callbacks+0xce/0xf0 [xfs]
 xlog_force_shutdown+0xdf/0x150 [xfs]
 xfs_do_force_shutdown+0x5f/0x150 [xfs]
 xlog_ioend_work+0x71/0x80 [xfs]
 process_one_work+0x1c5/0x390
 worker_thread+0x30/0x350
 kthread+0xd7/0x100
 ret_from_fork+0x1f/0x30

This is processing an EIO error to a log write, and it's triggering
a force shutdown. This causes the log to be shut down, and then it
is running attached iclog callbacks from the shutdown context. That
means the fs and log have already been marked as
xfs_is_shutdown/xlog_is_shutdown and so high level code will abort
(e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error
because of shutdown.

The umount would have been blocked waiting for a log force
completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing
for this situation to occur is for xfs_sync_sb() to exit without
waiting for the iclog buffer to be committed to disk. The
above trace is the completion routine for the iclog buffer, and
it is shutting down the filesystem.

xlog_state_shutdown_callbacks() does this:

{
	struct xlog_in_core	*iclog;
	LIST_HEAD(cb_list);

	spin_lock(&log->l_icloglock);
	iclog = log->l_iclog;
	do {
		if (atomic_read(&iclog->ic_refcnt)) {
			/* Reference holder will re-run iclog callbacks. */
			continue;
		}
		list_splice_init(&iclog->ic_callbacks, &cb_list);
>>>>>>		wake_up_all(&iclog->ic_write_wait);
>>>>>>		wake_up_all(&iclog->ic_force_wait);
	} while ((iclog = iclog->ic_next) != log->l_iclog);

	wake_up_all(&log->l_flush_wait);
	spin_unlock(&log->l_icloglock);

>>>>>>	xlog_cil_process_committed(&cb_list);
}

This wakes any thread waiting on IO completion of the iclog (in this
case the umount log force) before shutdown processes all the pending
callbacks. That means the xfs_sync_sb() waiting on a sync
transaction in xfs_log_force() on iclog->ic_force_wait will get
woken before the callbacks attached to that iclog are run. This
results in xfs_sync_sb() returning an error, and so unmount unblocks
and continues to run whilst the log shutdown is still in progress.

Normally this is just fine because the force waiter has nothing to
do with AIL operations. But in the case of this unmount path, the
log force waiter goes on to tear down the AIL because the log is now
shut down and so nothing ever blocks it again from the wait point in
xfs_log_cover().

Hence it's a race to see who gets to the AIL first - the unmount
code or xlog_cil_process_committed() killing the superblock buffer.

To fix this, we just have to change the order of processing in
xlog_state_shutdown_callbacks() to run the callbacks before it wakes
any task waiting on completion of the iclog.

Reported-by: Brian Foster <bfoster@redhat.com>
Fixes: aad7272 ("xfs: separate out log shutdown callback processing")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>

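
The ordering described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: struct model_iclog, struct event_log, model_shutdown_callbacks(), wakes_follow_callbacks() and run_model() are hypothetical names invented here, callbacks are reduced to a counter, and waking waiters is reduced to recording an event. It shows the fixed order, where each unreferenced iclog has its callbacks drained before its waiters are woken, and a checker for that property.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 32

enum ev_type { EV_CALLBACK, EV_WAKE };

struct event {
	enum ev_type type;
	int iclog_id;
};

struct event_log {
	struct event ev[MAX_EVENTS];
	size_t n;
};

/* Hypothetical stand-in for struct xlog_in_core; not the kernel type. */
struct model_iclog {
	int id;
	int refcnt;			/* non-zero: owner is still preparing it */
	int pending_cbs;		/* callbacks still attached */
	struct model_iclog *next;	/* circular ring, like ic_next */
};

static void record(struct event_log *el, enum ev_type type, int id)
{
	if (el->n < MAX_EVENTS) {
		el->ev[el->n].type = type;
		el->ev[el->n].iclog_id = id;
		el->n++;
	}
}

/* Fixed ordering: drain an iclog's callbacks, then wake its waiters. */
static void model_shutdown_callbacks(struct model_iclog *head,
				     struct event_log *el)
{
	struct model_iclog *ic = head;

	do {
		if (ic->refcnt)
			continue;	/* reference holder handles it later */
		while (ic->pending_cbs > 0) {
			ic->pending_cbs--;
			record(el, EV_CALLBACK, ic->id);
		}
		record(el, EV_WAKE, ic->id);
	} while ((ic = ic->next) != head);
}

/* Returns 1 if no iclog has a callback recorded after its wake-up. */
static int wakes_follow_callbacks(const struct event_log *el)
{
	for (size_t i = 0; i < el->n; i++) {
		if (el->ev[i].type != EV_WAKE)
			continue;
		for (size_t j = i + 1; j < el->n; j++) {
			if (el->ev[j].type == EV_CALLBACK &&
			    el->ev[j].iclog_id == el->ev[i].iclog_id)
				return 0;
		}
	}
	return 1;
}

/* Build a two-iclog ring, run the model, and return the event count. */
static size_t run_model(int *ordered)
{
	struct model_iclog a = { .id = 0, .refcnt = 0, .pending_cbs = 2 };
	struct model_iclog b = { .id = 1, .refcnt = 1, .pending_cbs = 1 };
	struct event_log el = { .n = 0 };

	a.next = &b;
	b.next = &a;
	model_shutdown_callbacks(&a, &el);
	*ordered = wakes_follow_callbacks(&el);
	return el.n;
}
```

In this model the referenced iclog is skipped entirely, mirroring the "Reference holder will re-run iclog callbacks" branch, so only the first iclog contributes events. Recording the wake-up before draining the callbacks would make wakes_follow_callbacks() return 0, which corresponds to the window in which unmount could proceed to AIL teardown.
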
1 parent ab9c81e commit cd6f79d

1 file changed, +13 -9 lines changed
fs/xfs/xfs_log.c

Lines changed: 13 additions & 9 deletions
@@ -487,7 +487,10 @@ xfs_log_reserve(
  * Run all the pending iclog callbacks and wake log force waiters and iclog
  * space waiters so they can process the newly set shutdown state. We really
  * don't care what order we process callbacks here because the log is shut down
- * and so state cannot change on disk anymore.
+ * and so state cannot change on disk anymore. However, we cannot wake waiters
+ * until the callbacks have been processed because we may be in unmount and
+ * we must ensure that all AIL operations the callbacks perform have completed
+ * before we tear down the AIL.
  *
  * We avoid processing actively referenced iclogs so that we don't run callbacks
  * while the iclog owner might still be preparing the iclog for IO submssion.
@@ -501,22 +504,23 @@ xlog_state_shutdown_callbacks(
 	struct xlog_in_core	*iclog;
 	LIST_HEAD(cb_list);
 
-	spin_lock(&log->l_icloglock);
 	iclog = log->l_iclog;
 	do {
 		if (atomic_read(&iclog->ic_refcnt)) {
 			/* Reference holder will re-run iclog callbacks. */
 			continue;
 		}
 		list_splice_init(&iclog->ic_callbacks, &cb_list);
+		spin_unlock(&log->l_icloglock);
+
+		xlog_cil_process_committed(&cb_list);
+
+		spin_lock(&log->l_icloglock);
 		wake_up_all(&iclog->ic_write_wait);
 		wake_up_all(&iclog->ic_force_wait);
 	} while ((iclog = iclog->ic_next) != log->l_iclog);
 
 	wake_up_all(&log->l_flush_wait);
-	spin_unlock(&log->l_icloglock);
-
-	xlog_cil_process_committed(&cb_list);
 }
 
 /*
@@ -583,11 +587,8 @@ xlog_state_release_iclog(
 		 * pending iclog callbacks that were waiting on the release of
 		 * this iclog.
 		 */
-		if (last_ref) {
-			spin_unlock(&log->l_icloglock);
+		if (last_ref)
 			xlog_state_shutdown_callbacks(log);
-			spin_lock(&log->l_icloglock);
-		}
 		return -EIO;
 	}
 

@@ -3903,7 +3904,10 @@ xlog_force_shutdown(
 	wake_up_all(&log->l_cilp->xc_start_wait);
 	wake_up_all(&log->l_cilp->xc_commit_wait);
 	spin_unlock(&log->l_cilp->xc_push_lock);
+
+	spin_lock(&log->l_icloglock);
 	xlog_state_shutdown_callbacks(log);
+	spin_unlock(&log->l_icloglock);
 
 	return log_error;
 }
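
The hunks above also change the locking convention: callers now take l_icloglock before calling xlog_state_shutdown_callbacks(), and the function drops the lock only around xlog_cil_process_committed(). The sketch below models that contract in userspace. It rests on assumptions: the lock is a plain flag rather than a spinlock, and struct sketch_lock, struct sketch_log, sketch_process_committed(), sketch_shutdown_callbacks() and sketch_force_shutdown() are hypothetical names, not kernel symbols.

```c
#include <assert.h>

/* Hypothetical single-threaded stand-in for a spinlock. */
struct sketch_lock {
	int held;
};

static void lock_acquire(struct sketch_lock *l)
{
	assert(!l->held);	/* catches double acquisition in the model */
	l->held = 1;
}

static void lock_release(struct sketch_lock *l)
{
	assert(l->held);	/* catches release of an unheld lock */
	l->held = 0;
}

struct sketch_log {
	struct sketch_lock icloglock;
	int cbs_run_unlocked;	/* callback batches run with the lock dropped */
	int cbs_run_locked;	/* callback batches run with the lock held */
};

/* Stand-in for callback processing: records whether the lock was held. */
static void sketch_process_committed(struct sketch_log *log)
{
	if (log->icloglock.held)
		log->cbs_run_locked++;
	else
		log->cbs_run_unlocked++;
}

/* Caller must hold icloglock; it is held again when this returns. */
static void sketch_shutdown_callbacks(struct sketch_log *log)
{
	assert(log->icloglock.held);

	lock_release(&log->icloglock);
	sketch_process_committed(log);
	lock_acquire(&log->icloglock);
	/* waiters would be woken here, with the lock held */
}

/* Mirrors the caller side of the last hunk; returns the final lock state. */
static int sketch_force_shutdown(struct sketch_log *log)
{
	lock_acquire(&log->icloglock);
	sketch_shutdown_callbacks(log);
	lock_release(&log->icloglock);
	return log->icloglock.held;
}
```

With this shape the caller in the xlog_state_release_iclog() hunk, which already holds the lock, no longer needs its own unlock and relock pair around the call.
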
