Support for fuse_custom_io feature #510

Open
legezywzh opened this issue Apr 8, 2024 · 4 comments
Labels
Feature request for a feature

Comments

@legezywzh

The libfuse C library first added support for the fuse_custom_io feature in the commit below:
libfuse/libfuse@50c74e6

Traditionally, FUSE filesystem servers read FUSE requests and write FUSE responses through the /dev/fuse character device. With virtiofs, however, the FUSE requests come from the guest OS while the filesystem server runs in the host OS, so /dev/fuse does not work here.

Many popular FUSE filesystems are based on the go-fuse library. For QEMU virtiofs or DPU virtiofs, the fuse_custom_io feature lets the user implement application-defined I/O functions to fetch FUSE requests and send replies. Filesystems based on go-fuse could then be deployed on a host running QEMU, or on a DPU, with minimal modifications.

Could the go-fuse community plan to support this fuse_custom_io feature (by porting the patch above)? Thanks in advance. For my part, I haven't written any Go code yet.

@hanwen hanwen added the Feature request for a feature label Apr 8, 2024
@hanwen
Owner

hanwen commented Apr 8, 2024

This is the first time I've heard of this. Some notes to myself:

  • similar in scope to NFS, but doesn't need to support multiple simultaneous clients and can use shared memory for improved efficiency.
  • design: https://virtio-fs.gitlab.io/design.html
  • FUSE is called by virtiofsd
  • the FS cannot trust the incoming requests, as the guest OS is untrusted.

Open questions:

  • how can one test this?
  • how "untrusted" is the guest OS? The FUSE library implicitly assumes not only that requests are well-formed but also that the VFS lookups are sane. Can the guest OS trick the FS into creating a circular inode graph for example?

@legezywzh
Author

> this is the first time I hear of this. Some notes to myself:
>
>   • similar in scope to NFS, but don't need to support multiple simultaneous clients and can use shared memory for improved efficiency.
>   • design: https://virtio-fs.gitlab.io/design.html
>   • FUSE is called by virtiofsd
>   • the FS cannot trust the incoming requests, as the guest OS is untrusted.
>
> Open questions:
>
>   • how can one test this?

There is a simple test case in libfuse. It just transmits a FUSE_INIT message over application-defined I/O functions, which here use a Unix socket rather than /dev/fuse.
https://github.com/libfuse/libfuse/blob/master/example/hello_ll_uds.c
https://github.com/libfuse/libfuse/blob/master/test/test_custom_io.py

Personally, I also think it is hard to run a full test with the usual tools at hand, but the feature is useful in certain cases. I currently use it on a DPU. A DPU can offload virtiofsd or the filesystem server from the host machine to the DPU. In this case, the host machine and the DPU run different operating systems. Software on the DPU fetches virtio-fs requests from the DPU hardware and uses fuse_custom_io to define I/O functions, which we then use to interact with a filesystem server based on libfuse.

>   • how "untrusted" is the guest OS? The FUSE library implicitly assumes not only that requests are well-formed but also that the VFS lookups are sane. Can the guest OS trick the FS into creating a circular inode graph for example?

I'm not sure. virtio-fs is widely used, and it does not seem to require that the guest OS be trusted.

@hanwen
Owner

hanwen commented Sep 30, 2024

I looked at this a bit more. It is easy to make go-fuse read from a socket, but that is far from enough to make the FS accessible in QEMU or other hypervisors.

The FS device is specified in section 5.11 of the virtio spec: https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-45800011

The FUSE protocol is followed, but communication goes over so-called virtqueues, described in section 2.6. Before the FUSE protocol kicks in, there is a general negotiation that looks a bit like this in strace (tracing the Rust virtiofsd):

[pid 194180] recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1\0\0\0\1\0\0\0\0\0\0\0", iov_len=12}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 12
[pid 194180] sendmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1\0\0\0\5\0\0\0\10\0\0\0", iov_len=12}, {iov_base="\0\0\0p\1\0\0\0", iov_len=8}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 20
[pid 194180] recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\17\0\0\0\1\0\0\0\0\0\0\0", iov_len=12}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 12
[pid 194180] sendmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\17\0\0\0\5\0\0\0\10\0\0\0", iov_len=12}, {iov_base=")\204\0\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 20

Note that virtio is a general mechanism to provide device drivers (block storage, GPU, tty, crypto-random, etc.) from the host to the guest.

The opcodes are transferred through some shared memory shenanigans using virtqueues:

control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[21]}], msg_controllen=24, msg_flags=0}, 0) = 12
[pid 194180] recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\0\0\0\0\0\0\0", iov_len=8}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 8
[pid 194180] close(16)                  = 0
[pid 194180] recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\r\0\0\0\1\0\0\0\10\0\0\0", iov_len=12}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[16]}], msg_controllen=24, msg_flags=0}, 0) = 12
[pid 194180] recvmsg(11, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1\0\0\0\0\0\0\0", iov_len=8}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 8
[pid 194180] close(12)                  = 0
[pid 194180] recvmsg(11,  <unfinished ...>
[pid 194166] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=1, u64=1}}], 100, -1) = 1
[pid 194166] read(20, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 194166] write(2, "[2024-09-30T08:13:54Z DEBUG virt"..., 51[2024-09-30T08:13:54Z DEBUG virtiofsd] QUEUE_EVENT
) = 51
[pid 194166] write(2, "[2024-09-30T08:13:54Z DEBUG virt"..., 108[2024-09-30T08:13:54Z DEBUG virtiofsd::server] Received request: opcode=Init (26), inode=0, unique=2, pid=0
) = 108

Note how the Init opcode (26, which strace would print as the octal escape \32) does not appear in any recvmsg syscall.

@hanwen
Owner

hanwen commented Nov 10, 2024

I toyed with this over the last month. There is an alpha-quality virtio implementation in the virtiofs branch:

https://github.com/hanwen/go-fuse/tree/virtiofs

It uses a ProtocolServer type introduced in https://review.gerrithub.io/c/hanwen/go-fuse/+/1203483. Fleshing this out fully looks like a significant amount of work, which requires time that I currently don't have.
