Skip to content
This repository was archived by the owner on Feb 26, 2020. It is now read-only.

Fix race between getf() and areleasef() #492

Closed
wants to merge 1 commit into from
Closed

Conversation

ryao
Copy link
Contributor

@ryao ryao commented Nov 5, 2015

If a vnode is released asynchronously through areleasef(), it is
possible for the user process to reuse the file descriptor before
areleasef is called. When this happens, getf() will return a stale
reference, any operations in the kernel on that file descriptor will
fail (as it is closed) and the operations meant for that fd will
never occur from userspace's perspective.

We correct this by detecting this condition in getf(), doing a putf on the old
file handle, updating the file descriptor and proceeding as if everything was
fine. When the areleasef() is done, it will harmlessly decrement the reference
counter on the Illumos file handle.

Signed-off-by: Richard Yao ryao@gentoo.org

@ryao
Copy link
Contributor Author

ryao commented Nov 5, 2015

There is a trivial merge conflict between this and #493. I separated this from #493 because this fixes an actual bug while #493 is just cleanup.

If a vnode is released asynchronously through areleasef(), it is
possible for the user process to reuse the file descriptor before
areleasef is called. When this happens, getf() will return a stale
reference, any operations in the kernel on that file descriptor will
fail (as it is closed) and the operations meant for that fd will
never occur from userspace's perspective.

We correct this by detecting this condition in getf(), doing a putf on the old
file handle, updating the file descriptor and proceeding as if everything was
fine. When the areleasef() is done, it will harmlessly decrement the reference
counter on the Illumos file handle.

Signed-off-by: Richard Yao <ryao@gentoo.org>
@tuxoko
Copy link
Contributor

tuxoko commented Nov 6, 2015

I don't understand why we even need this. Why don't we just fget(fd) every time instead of maintaining a separate file list.

@ryao
Copy link
Contributor Author

ryao commented Nov 9, 2015

@tuxoko That would introduce another issue in semantics. Specifically, userspace could call dup() on the file descriptor, close the original and put something else there. Calling vn_rdwr() on the vnode_t should use the original pipe while your suggestion would cause the pipe to change.

I do not think it is possible to close the underlying race completely due to a difference in close() behavior between Linux and Illumos, but a check for the condition in vn_getf() would prevent it from affecting the proposed lzc_core extensions by reducing the opportunity for us to trigger it while preserving the Illumos API.

@tuxoko
Copy link
Contributor

tuxoko commented Nov 9, 2015

@ryao
But how is my proposed different from your patch in terms of semantics?
You also call fget every time and swap out the old file struct in the vnode. I don't see how the list would change anything.

Edit: Not to mention that this file list only grows and will not shrink except rmmod. So if you run the system long enough. It will hold a lot of useless file references.

@ryao
Copy link
Contributor Author

ryao commented Nov 9, 2015

@tuxoko This patch ensures that you will get the correct file handle in the case that dup()/close() was used to change the file descriptor number with this patch. Keep in mind that the file descriptor number is an indirection. What we are supposed to reference is not the file descriptor number, but the kernel's file_t that the file descriptor number references. The only exception is on vn_getf() where we are saying to the kernel "grab the file_t referenced by this file descriptor in this process". What this patch does is implement a heuristic that if the process does that, the old one is probably no longer valid. That is not strictly true, but it is the best we can do to implement the Illumos API on Linux without breaking known users.

Also, the file list can both grow and shrink. Once areleasef()/releasef() are called to close the last reference, it will be removed from the list. See vn_areleasef()/releasef_locked(), which is where it is removed from the list and then freed.

@tuxoko
Copy link
Contributor

tuxoko commented Nov 9, 2015

@ryao

You will get the correct file handle in the case that dup()/close() was used to change the file descriptor number with this patch.

But that's because you call fget(fd), not becasue of the file list. I'm not arguing your patch is wrong. I'm arguing we don't need the file list.

@ryao
Copy link
Contributor Author

ryao commented Nov 9, 2015

@tuxoko There is no way to tell if anything leaked when the module is unloaded without the list. It also potentially allows us to save memory by avoiding duplicate SPL handles for the same Linux file handle.

@tuxoko
Copy link
Contributor

tuxoko commented Nov 9, 2015

@ryao
If the list is purely for checking leaks, then we probably should just use kmemleak. It's more general and we don't need to rmmod to check the result.

@ryao
Copy link
Contributor Author

ryao commented Nov 9, 2015

@tuxoko We do this for the kmem cache code as well. I suspect that this is done because the buildbot does not use kmemleak.

@behlendorf
Copy link
Contributor

Actually I had something a little more drastic in mind. I'd like to remove all of the vnode and kobj code from the SPL and add a simplified version to the ZFS source tree. It's served it's purpose well, but frankly it's a bit of a mess and burden.

@ryao @tuxoko I've opened the following pull requests with what I consider a rough first draft of what I'm proposing. It builds but hasn't seen any real testing. If you have time to provide feedback that would be great!

#497
openzfs/zfs#4005

As for this specific bug I happened to address it in the context of these changes. What do you think of openzfs/zfs@25e56b4aa7c95c6. Basically I just dumped this terrible confusing compatibility code.

@tuxoko
Copy link
Contributor

tuxoko commented Nov 11, 2015

@behlendorf
Yes! I haven't reviewed the code yet. But in terms of getf/releasef, this whole lot of stuff just for only a few places in ZFS doesn't feel justifiable. Thanks for the work.

@behlendorf
Copy link
Contributor

As for the buildbot it just so happens that I've largely finished updating it and migrating it to ec2. The updated buildbot is a lot more flexible and I think will be far more useful to us. The entire ZoL buildbot configuration is available at documented at https://github.com/zfsonlinux/zfs-buildbot/. This way anyone can help improve it and exactly how it works is clearly visible.

@behlendorf
Copy link
Contributor

@ryao is this an issue has caused real problems or is it somethings which is possible by in practice never happens? If it's causing real harm I could merge your fix now, but if it's not I'm inclined to wait until the larger #497 cleanup can be finalized.

@ryao
Copy link
Contributor Author

ryao commented Dec 3, 2015

@behlendorf It only affects the refresh I am preparing to do on openzfs/zfs#3907, which is where this bug occurred. That pull request is the only consumer of areleasef() and is why I implemented areleasef() in the first place. The changes you propose would require that I rewrite things to use Linux APIs. That would make porting patches back and forth much more difficult. I would rather get the libzfs_core extensions finalized and sent to illumos before doing any major refactoring here.

I am working on refreshing that pull request right now as I believe it is in a state where nearly all of the outstanding comments have been addressed.

@behlendorf
Copy link
Contributor

OK, then let's get your fix applied to the SPL. That will make it easier to get any upstream changes applied. Then we'll revisit updating the code in such a way that it can be supported easily on both Illumos and Linux.

@ryao if your happy with this as the final version of this patch let me know and I'll merge it.

@ryao
Copy link
Contributor Author

ryao commented Dec 3, 2015

@behlendorf I am happy with it.

@behlendorf behlendorf closed this in 1683e75 Dec 3, 2015
@behlendorf
Copy link
Contributor

Merged as:

1683e75 Fix race between getf() and areleasef()

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants