Skip to content

Initial stable API + ClusterHQ libzfs_core extensions + various fixes #3907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

ryao
Copy link
Contributor

@ryao ryao commented Oct 10, 2015

This contains the initial stable API, the libzfs_core extensions and various fixes that we have developed at ClusterHQ for use with flocker. It also cherry-picks the proposed fix for Illumos 6267 from #3874. This was needed to fix test failures at ClusterHQ.

The current code passes the functional tests in clusterhq/pyzfs and the flocker acceptance tests. There are no known regressions in ClusterHQ's tracker and the code is believed to be production quality.

@ryao ryao force-pushed the libzfs_core-HEAD branch 3 times, most recently from 780f6f0 to 47eb5bb Compare October 10, 2015 13:51
* Destroying snapshots and bookmarks is not currently supported. Call
* lzc_destroy_snaps and lzc_destroy_bookmarks for those respectively.
*
* The only currently valud property is the boolean "defer". It makes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo "valud" > "valid"

@ryao ryao force-pushed the libzfs_core-HEAD branch 3 times, most recently from b141e16 to b23828a Compare October 10, 2015 15:35
@ryao
Copy link
Contributor Author

ryao commented Oct 10, 2015

Some minor changes have been made.

  • The buildbot revealed cstyle issues in the header files. Those are now fixed.
  • @adamtheturtle's comment about a comment containing "valud" rather than "valid" is addressed. The manpage had the same mistake (as I copy and pasted the text), which has also been fixed.
  • Some minor corrections have been made to comments on the zfs_ioc_dataset_list_next().
  • Also, zfs_stable_ioc_set_props was developed against 0.6.4 where the old-style filesystem_limit and snapshot_limit properties did not exist. This was missed when we rebased against 0.6.5/HEAD causing attempts to atomically set them to be performed in a non-atomic manner. It now properly fails when asked to set filesystem and snapshot limits atomically. This is also now documented in the man page.

@behlendorf behlendorf added the Type: Feature Feature request or new feature label Oct 13, 2015
@@ -169,6 +171,29 @@ lzc_ioctl(zfs_ioc_t ioc, const char *name,
return (error);
}

static int
lzc_ioctl(const char *cmd, const char *name, nvlist_t *source,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between source (innvl) and opts? why separate them out?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked during an earlier review; the answer was this was to let "options remain separate from actual data (e.g. a list of properties to set) so that we don't get pesky namespace collisions or inconsistencies from future extensions"

It also exists as an nvlist rather than a bitmask for said extensions. I'm still not entirely convinced this is necessary (worst case, nested lists would work).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrens My apologies for missing this comment. As @trisk said, the idea is to avoid mixing configuration from data.

@mmlb
Copy link

mmlb commented Nov 18, 2015

@ryao commit 66a2da lzc_rename seems to have broken recursive renames. Following script from https://docs.oracle.com/cd/E19082-01/817-2271/gevnp/index.html demonstrates:

truncate -s1G disk{1,2,3}.img
zpool create users $(realpath disk{1,2,3}.img)
zfs create users/home
for user in markm marks neil; do
    zfs create users/home/$user
done

zfs snapshot -r users/home@today
zfs list -t all

zfs rename -r users/home@today @yesterday
zfs list -t all

and errors with cannot rename 'users/home@marks': dataset does not exist

@ryao
Copy link
Contributor Author

ryao commented Nov 25, 2015

@mmlb Thanks for catching that. This is fixed in my local branch. I will clean it up and push it on either Thursday and Friday.

@ryao
Copy link
Contributor Author

ryao commented Dec 10, 2015

@behlendorf @ahrens @mmlb This has been refreshed. There have been several changes from the last push:

  1. Several fixes, including a fix for the lzc_rename issue that @mmlb highlighted
  2. The proposed lzc_promote() has had outnvl added (backward compatibility break with previous proposed version) to improve error handling.
  3. The logic for traversing bookmarks in the kernel side of lzc_list has moved into dmu_objset_find_dp() and dsl_bookmark_lookup_ds has been made avaliable for use with it. This makes it significantly more maintainable.
  4. lzc_list() has gained minimum depth traversal, which was needed for the next item.
  5. Several lzc_pool_* functions have been added to stabilize the boot process for / on ZFS.
  6. lzc_list_iter() has been added following feedback from Robert Mustacchi and the zfs_iter_generic() function previously used to hook into lzc_list() has been refactored to use it.
  7. The int zc_pad2 in zfs_cmd_t has been changed to uint64_t zc_real_err to store the error code before userland copying.
  8. All of the existing code has been refactored to use the new lzc_* functions, which @ahrens suggested.
  9. The kernel code for ZFS_IOC_POOL_STATS has been altered to stop processing a zfs_cmd_t member that userland never touches.
  10. Various functions have been constified for const-correctness.
  11. Robert Mustacchi pointed out that lzc_list() could violate resource restrictions. This is a problem on Illumos and other platforms. Code has been added that should create a userland thread whose kernel stack never switches to userspace so that resource constraints are observed. An attempt was made to do something similar on Linux, but Linux's kernel API makes this incredibly difficult, so it is not fully fleshed out, but I believe this is solveable. SInce Linux restricts /dev/zfs to root, this does not seem like it should be a priority.

There are a few minor outstanding things, but they should not impede review of this code:

  1. @mmlb pointed out that we are copying the full zfs_cmd_t into the kernel, which is not necessary. This is an existing issue in the libzfs_core code.
  2. @mmlb pointed out that lzc_receive() has not been converted to use the libzfs_core stable ioctl.
  3. Some minor man page fixes were suggested by @mmlb.
  4. My collegue wrote another lzc_* function, which I plan to add, but it wasn't included in Gentoo and really should not block review of these changes.
  5. The lzc_pool_* and lzc_list_iter() functions lack man pages. I plan to write these, but given how simple these functions are, they can be written asynchronously to review.
  6. The man pages need to be updated to reflect that the true error code is in errno after a lzc_* function returns while the function return code is the error after copying.
  7. Imposing resource limits on lzc_list()'s worker thread remains to be done.

An almost identical version was put into production in Gentoo Linux 1 week ago. The only difference is that lzc_list_iter() was not there.

So far, the only reported problem has been grub2-probe breaking when linked against the new libzfs.so on a system where an older kernel module is used. GRUB2 links to libzfs.so so that it can (ab)use it to read the cachefile and fallback to getting the cache file contents from the kernel if it is missing, which means it breaks every time that functionality changes in a backward incompatible way. The GRUB2 developers have requested functionality like lzc_pool_configs() in conversations about fixing this. I believe this is the last time GRUB2 will break after both this is merged and GRUB2 adopts it.

@ryao
Copy link
Contributor Author

ryao commented Dec 11, 2015

Clang's Address Sanitizer found a few issues in lzc_list_iter() and others that I had missed. I have revised lzc_list_iter() and pushed another patch for the other bits. The changes are rather small though.

@behlendorf
Copy link
Contributor

@ryao can these patches be structured such that intermediary patches in the stack don't break the build.

@ryao
Copy link
Contributor Author

ryao commented Dec 14, 2015

@behlendorf Yes. I am working on that now.

@ryao ryao force-pushed the libzfs_core-HEAD branch 7 times, most recently from a3b3606 to b0fc5d2 Compare August 27, 2016 03:01
ryao and others added 9 commits September 9, 2016 11:30
This is unused functionality that could bite us in the future.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
lzc_snaprange_space has a couple of bugs. One is that `usedp == NULL`
will cause a segmentation fault. Another is that passing snapshots from
different datasets will return ENOENT rather than EXDEV, which is
inconsistent. This corrects that to make this more predictable.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Some non-essential properties like mountpoint are lost during the
creation of a temporary pool because will try to set them on the pool's
permanent name, which does not exist inside the sPA namespace during
pool creation.

Most notably, the mountpoint property would be lost.

Signed-off-by: Richard Yao <ryao@gentoo.org>
@ryao ryao force-pushed the libzfs_core-HEAD branch 3 times, most recently from 9eb3767 to c585ca8 Compare September 9, 2016 20:09
@ryao ryao force-pushed the libzfs_core-HEAD branch 3 times, most recently from d03d464 to 8dad8bf Compare September 20, 2016 19:26
This introduces a new ZFS_IOC_LIBZFS_CORE ioctl and replaces the ioctl
calls in all but 1 function in libzfs_core.so and adds many more. It is
intended to be as stable kernel/userland interface that allows
basic userland functionality to continue working when an older userland
is used with newer kernel modules or vice versa by mistake.

At this time, it is a subset of the legacy functionality, but enough API
functions have been implemented that mismatched kernel/userland code
will not break the Gentoo's boot process (excluding zfs share and ZED,
which are excluded at this time) and ClusterHQ's flocker.

This sufficiently addresses the long standing /dev/zfs API instability
that it opens the possibility of stabilization of the ZFS packaging in
Gentoo on amd64 and other architectures, which makes this an improvement
in more ways than one.

To ease the transition to the new ABI, we include a shim library that
implements the new C library extensions using the old kernel ABI. The
--with-libzfs_core=default|all|legacy|stable configure switch determines
which is used and presently, the default is an alias for legacy that can
be changed in a future release. --with-libzfs_core=all will install
both, with libzfs_core.so.1 using the new ABI while a
libzfs_core-legacy.so.0 that can be loaded with LD_PRELOAD will be
installed containing the old one.

Signed-off-by: Richard Yao <ryao@gentoo.org>
@behlendorf behlendorf added the Status: Work in Progress Not yet ready for general review label Sep 30, 2016
@behlendorf
Copy link
Contributor

@ryao I'd like to close this PR until it's being actively worked on again. We can always open a new PR in the future.

@behlendorf behlendorf closed this Apr 28, 2017
@ryao
Copy link
Contributor Author

ryao commented Apr 28, 2017

@behlendorf Sure thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Work in Progress Not yet ready for general review Type: Feature Feature request or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants