zpl: handle suspend from two remaining calls to txg_wait_synced() #17413

Merged (2 commits) on Jun 5, 2025

robn (Member) commented on Jun 2, 2025

Motivation and Context

Following #17355, there are two remaining ZPL ops that call txg_wait_synced() directly. This converts them to allow suspend.

(This came out of review on #17398, but is not really related to it.)

Description

Convert the two call sites to txg_wait_synced_flags(), setting TXG_WAIT_SUSPEND when failmode=continue, and handling a suspend (ESHUTDOWN) by returning EIO.

(Aside: this pattern of checking spa_failmode, choosing the right suspend flags, and making the call could probably be a macro, but I couldn't think of a good name for it, and I can live with the duplication for two calls. It's probably fine to wait for now, at least until a similar question in #17398 is resolved and an update to #11082 is posted, as I know @ihoro has some "convert the errors" ideas in there.)
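
The pattern described above (check the pool's failmode, pick the wait flags, make the call, and map a suspend to EIO) can be sketched as a small helper. This is a standalone mock for illustration, not the code in this PR: `mock_pool_t`, `failmode_t`, `wait_flag_t`, `mock_txg_wait_synced_flags()` and `wait_synced_or_eio()` are all hypothetical stand-ins. In the real tree the call is txg_wait_synced_flags() with TXG_WAIT_SUSPEND, and without that flag the call would block on a suspended pool rather than return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real OpenZFS types and flags. */
typedef enum { FAILMODE_WAIT, FAILMODE_CONTINUE, FAILMODE_PANIC } failmode_t;
typedef enum { WAIT_NONE = 0, WAIT_SUSPEND = 1 << 0 } wait_flag_t;

typedef struct {
	failmode_t failmode;	/* the pool's failmode property */
	bool suspended;		/* true while the pool is suspended */
} mock_pool_t;

/*
 * Mock of the flag-taking wait call. If the pool is suspended and the
 * caller asked to be told, return ESHUTDOWN. Otherwise return 0; the
 * real call would block until the pool resumes and the txg syncs, which
 * this mock simply pretends has already happened.
 */
int
mock_txg_wait_synced_flags(mock_pool_t *pool, uint64_t txg, wait_flag_t flags)
{
	(void) txg;
	if (pool->suspended && (flags & WAIT_SUSPEND))
		return (ESHUTDOWN);
	return (0);
}

/*
 * The pattern from the description: only ask for a suspend error when
 * failmode=continue, and report a suspend to the caller as EIO.
 */
int
wait_synced_or_eio(mock_pool_t *pool, uint64_t txg)
{
	wait_flag_t flags = (pool->failmode == FAILMODE_CONTINUE) ?
	    WAIT_SUSPEND : WAIT_NONE;

	int error = mock_txg_wait_synced_flags(pool, txg, flags);
	if (error != 0) {
		/* A suspend is the only failure this sketch expects. */
		assert(error == ESHUTDOWN);
		return (EIO);
	}
	return (0);
}
```

With a suspended pool and failmode=continue, `wait_synced_or_eio()` returns EIO; with any other failmode the flag is not set, so the caller keeps the old blocking behaviour (modelled here as an immediate success).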

How Has This Been Tested?

Compile-checked on Linux and FreeBSD. I will rely on CI to make sure I haven't broken anything, though nothing tests these codepaths directly.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

robn added 2 commits June 4, 2025 20:32
4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_link() can fall back to
txg_wait_synced() if it has to wait for a tempfile to be fully created
before continuing, which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
4653e2f (openzfs#17355) allows dmu_tx_assign() to fail if the pool suspends
when failmode=continue, but zfs_clone_range() can fall back to
txg_wait_synced() if it has to wait for a dirty block to be written out,
which will block if the pool suspends.

Handle this by requesting an error return if the pool suspends when
failmode=continue, and if that happens, return EIO.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
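
To illustrate the clone case in the commit message above, here is a rough standalone sketch of a "wait for the dirty block, then retry" loop that gives up with EIO if the pool suspends. It assumes the caller retries the read after each wait, which the commit message does not spell out. Every name in it (`clone_pool_t`, `mock_read_bps()`, `mock_wait_synced()`, `clone_with_retry()`) is hypothetical, and the txg bookkeeping is heavily simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the pool state the loop cares about. */
typedef struct {
	bool failmode_continue;	/* true when failmode=continue */
	bool suspended;		/* true while the pool is suspended */
	uint64_t synced_txg;	/* last txg fully written out */
	uint64_t dirty_txg;	/* txg in which the source block was dirtied */
} clone_pool_t;

/* Mock read: the block cannot be cloned until its txg has synced. */
int
mock_read_bps(const clone_pool_t *p)
{
	return (p->dirty_txg > p->synced_txg ? EAGAIN : 0);
}

/*
 * Mock wait: report ESHUTDOWN on a suspended pool only if the caller
 * allowed it. Otherwise pretend the pool resumed (the real call would
 * block until then) and advance the synced txg.
 */
int
mock_wait_synced(clone_pool_t *p, uint64_t txg, bool allow_suspend)
{
	if (p->suspended) {
		if (allow_suspend)
			return (ESHUTDOWN);
		p->suspended = false;
	}
	if (p->synced_txg < txg)
		p->synced_txg = txg;
	return (0);
}

/* Retry the read until it succeeds, mapping a suspend to EIO. */
int
clone_with_retry(clone_pool_t *p)
{
	for (;;) {
		int error = mock_read_bps(p);
		if (error != EAGAIN)
			return (error);

		error = mock_wait_synced(p, p->synced_txg + 1,
		    p->failmode_continue);
		if (error != 0) {
			assert(error == ESHUTDOWN);
			return (EIO);
		}
	}
}
```

The point of the sketch is only the control flow: the wait is the one place the loop can stall on a suspended pool, so that is where the suspend has to be turned into an error the caller can see.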
robn force-pushed the zpl-txg-suspend-break branch from da29c78 to a13f25e on June 4, 2025 10:33
amotin added the "Status: Accepted" label (ready to integrate: reviewed, tested) on Jun 4, 2025
amotin merged commit af7d609 into openzfs:master on Jun 5, 2025
23 checks passed