Skip to content

linux/uio: remove "skip" offset for UIO_ITER #17298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 11, 2025

Conversation

robn
Copy link
Member

@robn robn commented May 4, 2025

Motivation and Context

torvalds/linux@f2fed44 changes the Linux loop device to (effectively) pass the iterator received from the upper filesystem down the the lower filesystem, where previously it would walk the iterator, create a new single-item iterator over each element, and call into the VFS layer for service.

This change exposed a bug in UIO_ITER uios, which wrap iov_iter: when we advance the uio, if the offset of the data inside the iov_iter is non-zero (the "skip"), we would try to advance the iterator ourselves. But, this is not our iterator, and we don't have to do that - the iterator internals will sort that out if it needs to.

The end result is that if we get an iterator with a first element with a non-zero offset, every time we advance we would skip that many bytes. This either lands somewhere in the middle of the iterator, getting the wrong data, or runs off the end, usually resulting in an EFAULT somewhere.

In the case of #17277, the iterator passed down from XFS consistently contains segments with non-zero offsets. The additional skipping runs the iterator position off the end of the available data, and results in an EFAULT loop in zpl_write_iter().

Fixes #17277.

Description

The bug proper is fixed by removing the unnecessary advance in zfs_uiomove_iter() and just let the underlying iterator sort it out. (note: the first version of this PR did only this).

As it turns out, we don't even set the skip for UIO_ITER for read, and in the few other places we try to use it (some of the direct IO support code), we don't do anything meaningful with it, so I've removed all traces of skip handling for UIO_ITER.

How Has This Been Tested?

The test case in #17277 is to create a file on a pool, use losetup to create a loop device over it, then format the loop device as XFS and try to mount it. Something like:

zpool create tank ...
truncate -s 512M /tank/img
dev=$(losetup -f /tank/img)
mkfs.xfs $dev
mount $dev /mnt

On kernels with the aforementioned change (eg 6.12.25), without this PR, the mount call hangs. With this patch, it succeeds.

I've done this same test on all release kernels back to 4.19 without the upstream change but with this PR, and that experiment still succeeds.

I do not have a more specific test case.

The direct suite appears to pass correctly, but with some tracing shows that from->iov_offset in zpl_iter_write was always zero, so if it's possible for it to be anything else in the direct IO cases, I have not tested it.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to make sense, but I think it is only a part of the confusion. I see zfs_uio_iov_iter_init() receiving skip of 0 in zpl_iter_read(), but iov_offset in zpl_iter_write(). Same time aside of this one place I don't see uio_skip ever used for UIO_ITER uios. Could we also remove the skip parameter from zfs_uio_iov_iter_init()?

@amotin amotin added the Status: Code Review Needed Ready for review and testing label May 5, 2025
@robn robn force-pushed the iov-iter-no-skip branch from fee71ad to 5a25ca1 Compare May 6, 2025 11:11
@robn robn changed the title zfs_uiomove_iter: ignore the "skip" offset linux/uio: remove "skip" offset for UIO_ITER May 6, 2025
@robn
Copy link
Member Author

robn commented May 6, 2025

It seems to make sense, but I think it is only a part of the confusion. I see zfs_uio_iov_iter_init() receiving skip of 0 in zpl_iter_read(), but iov_offset in zpl_iter_write(). Same time aside of this one place I don't see uio_skip ever used for UIO_ITER uios. Could we also remove the skip parameter from zfs_uio_iov_iter_init()?

@amotin agreed. I was wary, because I could see it being used in some of the direct IO support code and I wasn't sure. I've not studied it closely, but it does seem like the skip is not used in any interesting way there, as the kernel is asked for pages and the whole point of direct IO is that it's all properly aligned anyway.

So, last push removes setting the skip for UIO_ITER (removes the whole parameter entirely). PR title and description updated. Retesting all looks good. Nice cleanup I reckon :)

@tonyhutter
Copy link
Contributor

@robn I wrote a test case based off the reproducer in the PR description. Sure enough, the test case reliably hangs on Fedora 40-42, but passes with this PR.

@tonyhutter tonyhutter added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels May 7, 2025
@amotin amotin added Status: Revision Needed Changes are required for the PR to be accepted and removed Status: Accepted Ready to integrate (reviewed, tested) labels May 7, 2025
@robn
Copy link
Member Author

robn commented May 7, 2025

Sigh, this is why my original patch was focused on a specific case - I was here to fix a bug, not rework the whole thing. I'll have a look at next opportunity, maybe not until tomorrow or the weekend.

@robn robn force-pushed the iov-iter-no-skip branch from 5a25ca1 to a9016ed Compare May 9, 2025 00:16
@github-actions github-actions bot removed the Status: Revision Needed Changes are required for the PR to be accepted label May 9, 2025
@robn
Copy link
Member Author

robn commented May 9, 2025

@amotin @bwatkinson I've (effectively) reverted the change in zfs_uio_pin_user_pages() and zfs_uio_get_dio_pages_iov_iter(). Still not capturing the offset as uio_skip, just looking it up in uio->uio_iter->iov_offset when we need it. Hopefully that's the same.

@robn robn force-pushed the iov-iter-no-skip branch from a9016ed to fc94be1 Compare May 9, 2025 00:58
@robn
Copy link
Member Author

robn commented May 9, 2025

I've (effectively) reverted the change

And the rest of it in the last push, sigh.

@robn robn force-pushed the iov-iter-no-skip branch from fc94be1 to a5b1faa Compare May 9, 2025 01:41
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
@robn robn force-pushed the iov-iter-no-skip branch from a5b1faa to 47798ec Compare May 9, 2025 02:30
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. Thanks.

@robn
Copy link
Member Author

robn commented May 9, 2025

Heh, and to you :)

@amotin amotin added the Status: Accepted Ready to integrate (reviewed, tested) label May 9, 2025
@mcmilk
Copy link
Contributor

mcmilk commented May 10, 2025

Wow, it looked so simple in the beginning... really good work to all 👍

@amotin amotin merged commit 2ee5b51 into openzfs:master May 11, 2025
24 checks passed
robn added a commit to robn/zfs that referenced this pull request May 12, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#17298
(cherry picked from commit 2ee5b51)
robn added a commit to robn/zfs that referenced this pull request May 12, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#17298
(cherry picked from commit 2ee5b51)
@robn robn mentioned this pull request May 12, 2025
13 tasks
robn added a commit to robn/zfs that referenced this pull request May 23, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#17298
(cherry picked from commit 2ee5b51)
@robn robn mentioned this pull request May 23, 2025
14 tasks
robn added a commit to robn/zfs that referenced this pull request May 24, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#17298
(cherry picked from commit 2ee5b51)
tonyhutter pushed a commit that referenced this pull request May 27, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17298
(cherry picked from commit 2ee5b51)
tonyhutter pushed a commit that referenced this pull request May 28, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17298
(cherry picked from commit 2ee5b51)
snajpa pushed a commit to vpsfreecz/zfs that referenced this pull request Jun 5, 2025
For UIO_ITER, we are just wrapping a kernel iterator. It will take care
of its own offsets if necessary. We don't need to do anything, and if we
do try to do anything with it (like advancing the iterator by the skip
in zfs_uio_advance) we're just confusing the kernel iterator, ending up
at the wrong position or worse, off the end of the memory region.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes openzfs#17298
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kernel update breaks loopbacks on ZFS volumes
5 participants