Skip to content

Conversation

@d-e-s-o
Copy link
Contributor

@d-e-s-o d-e-s-o commented Aug 28, 2024

Switch over to using virtiofsd for sharing file system data with the
host. virtiofs is a file system designed for the needs of virtual
machines and environments. That is in contrast to 9P fs, which we
currently use for sharing data with the host, which is first and
foremost a network file system.
9P is problematic if for no other reason that it lacks proper support
for usage of the "open-unlink-fstat idiom", in which files are unlinked
and later referenced via file descriptor (see #83). virtiofs does not
have this problem.

This change replaces usage of 9P with that of virtiofs. In order to
work, virtiofs needs a user space server. The current state-of-the-art
implementation (virtiofsd) is implemented in Rust and so we interface
directly with the library. Most of this code is extracted straight from
virtiofsd, as it's a lot of boilerplate. An alternative approach is to
install the binary via distribution packages or from crates.io, but
availability (and discovery) can be a bit of a challenge. Note that this
now means that both libcap-ng as well as libseccomp need to be
installed.

I benchmarked both the current master as well as this version with a
bare-bones custom kernel:
Benchmark 1: target/release/vmtest -k bzImage-9p 'echo test'
Time (mean ± σ): 1.316 s ± 0.087 s [User: 0.462 s, System: 1.104 s]
Range (min … max): 1.232 s … 1.463 s 10 runs

Benchmark 1: target/release/vmtest -k bzImage-virtiofsd 'echo test'
Time (mean ± σ): 1.244 s ± 0.011 s [User: 0.307 s, System: 0.358 s]
Range (min … max): 1.227 s … 1.260 s 10 runs

So it seems there is a slight speed up, on average (and significantly
less system time being used). This is great, but I suspect a more
pronounced speed advantage will be visible when working with large
files, in which virtiofs is said to significantly outperform 9P
(typically >2x from what I understand, but I have not done any
benchmarks of that nature).

A few other notes:

  • we solely rely on guest level read-only mounts to enforce read-only
    state. The virtiofsd recommended way is to use read-only bind mounts
    [0], but doing so would require root.
  • we are not using DAX, because it still is still incomplete and
    apparently requires building Qemu (?) from source. In any event, it
    should not change anything functionally and be solely a performance
    improvement.

I have adjusted the configs, but because I don't have Docker handy I
can't really create those kernel. CI seems incapable of producing the
artifacts without doing a fully-blown release dance. No idea what empty
is about, really. I suspect the test failures we see are because it
lacks support?

[0] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md?ref_type=heads#faq

@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 2 times, most recently from 62c084e to 3596ee3 Compare August 29, 2024 00:01
@danielocfb
Copy link

danielocfb commented Aug 29, 2024

Can't undraft right now, but this change can probably be looked at. Please advise re: empty kernels & test failures (which I suspect are related to them as well as the images containing incompat kernels).

@danielocfb danielocfb force-pushed the topic/virtiofs branch 6 times, most recently from f382adb to be7aefd Compare August 29, 2024 22:20
@d-e-s-o d-e-s-o changed the title [RFC][prototype] Use virtiofsd for sharing file system data with host Use virtiofsd for sharing file system data with host Aug 30, 2024
@d-e-s-o d-e-s-o marked this pull request as ready for review August 30, 2024 04:46
@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 3 times, most recently from f022560 to 2d776c6 Compare August 30, 2024 13:08
@d-e-s-o
Copy link
Contributor Author

d-e-s-o commented Aug 30, 2024

Found a few more things we could remove -> decreased patch size.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop these changes, please. all the config happens in scripts/docker/config_kernel.sh

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump; drop kconfig changes please. custom config goes in 2466d90

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 3 times, most recently from e149558 to f83412b Compare September 5, 2024 15:40
Copy link
Owner

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm. but I think the non-vendored libcap-ng and libseccomp deps break the static builds:

$ RUSTFLAGS="-C target-feature=+crt-static" cargo build --target x86_64-unknown-linux-gnu
[...]
  = note: /usr/bin/ld: cannot find -lm: No such file or directory
          /usr/bin/ld: cannot find -lc: No such file or directory
          /usr/bin/ld: cannot find -lc: No such file or directory
          collect2: error: ld returned 1 exit status


error: could not compile `vmtest` (bin "vmtest") due to 1 previous error

looking at capng and libseccomp-sys, I see they support static linking:

https://github.com/slp/capng/blob/25304f0b998d7a8a44d7e3b03598d73b56e2a555/build.rs#L17
https://github.com/libseccomp-rs/libseccomp-rs/blob/7d372c4cb2b11a666c58ed07b7c07f211092c55e/libseccomp-sys/build.rs#L23

and ubuntu ships static archives for both:

root@e4ca9ec837d8:/# ls /usr/lib/x86_64-linux-gnu/ -l | grep -e "\.a$"
-rw-r--r--. 1 root root      1918 Aug  8 14:47 libBrokenLocale.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libanl.a
-rw-r--r--. 1 root root   6134904 Aug  8 14:47 libc.a
-rw-r--r--. 1 root root      5594 Aug  8 14:47 libc_nonshared.a
-rw-r--r--. 1 root root     34208 Mar 31 07:45 libcap-ng.a
-rw-r--r--. 1 root root    271404 Apr  8 16:09 libcrypt.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libdl.a
-rw-r--r--. 1 root root      1686 Mar 31 07:45 libdrop_ambient.a
-rw-r--r--. 1 root root      1210 Aug  8 14:47 libg.a
-rw-r--r--. 1 root root   2607988 Aug  8 14:47 libm-2.39.a
-rw-r--r--. 1 root root       132 Aug  8 14:47 libm.a
-rw-r--r--. 1 root root      1480 Aug  8 14:47 libmcheck.a
-rw-r--r--. 1 root root   1830076 Aug  8 14:47 libmvec.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libpthread.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libpthread_nonshared.a
-rw-r--r--. 1 root root     88100 Aug  8 14:47 libresolv.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 librt.a
-rw-r--r--. 1 root root    273376 Aug  9 02:33 libseccomp.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libutil.a

Lemme mess around with it a bit and see if we can get away with just tweaking the CI

@danielocfb
Copy link

danielocfb commented Sep 5, 2024

Lemme mess around with it a bit and see if we can get away with just tweaking the CI

Perhaps try setting LIBSECCOMP_LIB_TYPE and LIBCAPNG_LINK_TYPEto static.

Edit: Oops, sorry, I missed that you found those already. Haha.

@danobi
Copy link
Owner

danobi commented Dec 9, 2024

[reminder to delete the uid hint in #104 when this merges]

danobi added a commit to danobi/bpftrace that referenced this pull request Jan 1, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
@danobi
Copy link
Owner

danobi commented Jan 1, 2025

I think bpftrace will also need this for its vmtest in CI work: bpftrace/bpftrace#3657

danobi added a commit to danobi/bpftrace that referenced this pull request Jan 4, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 6, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 10, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 12, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 13, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 13, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 14, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 14, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 14, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 15, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 15, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 21, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 22, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 27, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 30, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 30, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to bpftrace/bpftrace that referenced this pull request Jan 31, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
Move the gen_sock() helper function into the util module, where it will
be accessible by more than just the qemu module. Also switch over to
using std::env::temp_dir() for retrieving the path to the hosts temp
directory instead of hard coding it.

Signed-off-by: Daniel Müller <deso@posteo.net>
Switch over to using virtiofsd for sharing file system data with the
host. virtiofs is a file system designed for the needs of virtual
machines and environments. That is in contrast to 9P fs, which we
currently use for sharing data with the host, which is first and
foremost a network file system.
9P is problematic if for no other reason that it lacks proper support
for usage of the "open-unlink-fstat idiom", in which files are unlinked
and later referenced via file descriptor (see danobi#83). virtiofs does not
have this problem.

This change replaces usage of 9P with that of virtiofs. In order to
work, virtiofs needs a user space server. The current state-of-the-art
implementation (virtiofsd) is implemented in Rust and so we interface
directly with the library. Most of this code is extracted straight from
virtiofsd, as it's a lot of boilerplate. An alternative approach is to
install the binary via distribution packages or from crates.io, but
availability (and discovery) can be a bit of a challenge.

I benchmarked both the current master as well as this version with a
bare-bones custom kernel:
  Benchmark 1: target/release/vmtest -k bzImage-9p 'echo test'
    Time (mean ± σ):      1.316 s ±  0.087 s    [User: 0.462 s, System: 1.104 s]
    Range (min … max):    1.232 s …  1.463 s    10 runs

  Benchmark 1: target/release/vmtest -k bzImage-virtiofsd 'echo test'
    Time (mean ± σ):      1.244 s ±  0.011 s    [User: 0.307 s, System: 0.358 s]
    Range (min … max):    1.227 s …  1.260 s    10 runs

So it seems there is a slight speed up, on average (and significantly
less system time being used). This is great, but I suspect a more
pronounced speed advantage will be visible when working with large
files, in which virtiofs is said to significantly outperform 9P
(typically >2x from what I understand, but I have not done any
benchmarks of that nature).

A few other notes:
- we solely rely on guest level read-only mounts to enforce read-only
  state. The virtiofsd recommended way is to use read-only bind mounts
  [0], but doing so would require root.
- we are not using DAX, because it still is still incomplete and
  apparently requires building Qemu (?) from source. In any event, it
  should not change anything functionally and be solely a performance
  improvement. Given that we are not regressing in terms of performance,
  this is strictly future work.

Some additional resources worth keeping around:
  - https://virtio-fs.gitlab.io/howto-boot.html
  - https://virtio-fs.gitlab.io/howto-qemu.html

[0] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md?ref_type=heads#faq

Closes: danobi#16
Closes: danobi#83

Signed-off-by: Daniel Müller <deso@posteo.net>
@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 2 times, most recently from 1f3d5f6 to 96cefd1 Compare February 12, 2025 23:09
@cccheng
Copy link

cccheng commented Nov 17, 2025

Virtiofs is expected to offer better latency performance compared to 9pfs. Is there a plan to merge this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants