Merge sock_send/sock_recv with fd_write/fd_read #4

Closed
@sunfishcode

WASI currently has two pairs of functions which are similar to each other: sock_send/sock_recv and fd_write/fd_read. This PR describes a plan for merging them, in favor of fd_write/fd_read.

Background

The reason why send and recv are separate in POSIX is that they add flags arguments. POSIX says that send and recv are equivalent to write and read when no flags are set.

WASI's sock_send doesn't currently support any flags. WASI's sock_recv supports __WASI_SOCK_RECV_PEEK and __WASI_SOCK_RECV_WAITALL, which correspond to MSG_PEEK and MSG_WAITALL in POSIX. Both of these operations could conceptually work on files; however, typical operating systems only support them on sockets.
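
For readers less familiar with the POSIX side, here is a minimal sketch of what MSG_PEEK does on a host socket. It uses only standard POSIX calls (socketpair, send, recv); the function name peek_then_recv is made up for this illustration and is not part of WASI or of any proposal.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative helper (hypothetical name): sends one byte over a local
 * stream socket pair, peeks at it with MSG_PEEK, then receives it normally.
 * Returns 1 if both calls saw the same byte, 0 if they differed,
 * and -1 if any call failed. */
int peek_then_recv(void) {
    int sv[2];
    char peeked = 0, received = 0;
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    if (send(sv[0], "A", 1, 0) == 1 &&
        /* MSG_PEEK copies the data out but leaves it queued. */
        recv(sv[1], &peeked, 1, MSG_PEEK) == 1 &&
        /* A plain recv then returns the same byte and consumes it. */
        recv(sv[1], &received, 1, 0) == 1) {
        result = (peeked == received) ? 1 : 0;
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Nothing in this call sequence is socket-specific in principle, which is the sense in which the operation could conceptually apply to files as well.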

On Linux, there is a subtle difference between recv and read: "If a zero-length datagram is pending, read(2) and recv() with a flags argument of zero provide different behavior. In this circumstance, read(2) has no effect (the datagram remains pending), while recv() consumes the pending datagram." It is possible that applications could depend on this subtle difference, but the only reference to it I've been able to find is the git commit which added this line to the man page, which describes a bug where "[...] we would end up in a busy loop when we were using read(2). Changing to recv(2) fixed the issue [...]". The recv behavior is what the code in that bug wanted, and it is the more intuitive behavior.

The cause of this subtlety is that read special-cases a 0 return value to mean the end-of-file/stream has been reached. That creates an ambiguity when reading a zero-length datagram.
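
The ambiguity can be observed from user space. The sketch below assumes a POSIX host with AF_UNIX socket pairs; the two function names are invented for this illustration. One function receives a zero-length datagram, the other receives from a stream socket whose peer has closed. On a typical Linux host both are expected to report 0, so the return value alone cannot tell the two situations apart.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: queues a zero-length datagram and returns what
 * recv() reports for it. Returns -2 if the setup itself fails. */
long recv_zero_length_datagram(void) {
    int sv[2];
    char buf[16];
    long n = -2;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -2;

    /* Sending zero bytes on a datagram socket queues an empty datagram. */
    if (send(sv[0], "", 0, 0) == 0)
        n = (long)recv(sv[1], buf, sizeof buf, 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}

/* Hypothetical helper: returns what recv() reports on a stream socket
 * after the peer has closed. Returns -2 if the setup itself fails. */
long recv_after_peer_close(void) {
    int sv[2];
    char buf[16];
    long n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -2;

    close(sv[0]);                      /* peer goes away: end of stream */
    n = (long)recv(sv[1], buf, sizeof buf, 0);

    close(sv[1]);
    return n;
}
```

A caller comparing the two results sees the same value for "an empty message arrived" and "no more data will ever arrive".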

Proposal

  • Remove sock_send and sock_recv.
  • Add __wasi_siflags_t and __wasi_riflags_t arguments to fd_write and fd_read, respectively.
  • Make fd_read return __WASI_EMSGSIZE when receiving a datagram which is larger than the provided buffer, and remove __WASI_SOCK_RECV_DATA_TRUNCATED, which is what sock_recv used in that case. WASI libc will check for this and continue to implement the POSIX API (MSG_TRUNC).
  • Make fd_read return __WASI_EEOS, a new errno code, when the end-of-file/stream is reached. This eliminates the ambiguity of the special case for 0. WASI libc will check for this and continue to implement the POSIX API with 0 being a special case.
  • Add rights for __WASI_RIGHT_FD_READ_PEEK and __WASI_RIGHT_FD_READ_WAITALL, which are required to use the __WASI_SOCK_RECV_PEEK and __WASI_SOCK_RECV_WAITALL flags, respectively. These rights would not be granted for file-based file descriptors on OSes that don't support these features on files.
  • Remove the fs_filetype field from the fdstat_t struct. This further hides unnecessary differences between sockets and files. fd_fdstat_get is otherwise an ambient authority, meaning anyone can call it on any open file descriptor. The file type is still accessible via fd_filestat_get, but that requires the __WASI_RIGHT_FD_FILESTAT_GET right.
  • That happens to leave us with no easy way to implement isatty, so add a __WASI_RIGHT_FD_ISATTY right, to indicate whether a file descriptor is known to be a terminal. This is a little unusual for a right, since it's not associated with an operation. However, this right makes it simple to implement isatty, which is used by libc to do line buffering for stdout when it's on a tty.
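
As a rough sketch of how a libc layer might sit on top of the proposed interface: everything prefixed sk_ or SK_ below is hypothetical, the numeric values are placeholders rather than values from any WASI header, and the merged fd_read is reduced to a single buffer for brevity. The sketch also assumes that fd_read still reports the number of bytes delivered when it returns the EMSGSIZE code, which the proposal does not specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder codes and bits; NOT the real WASI values. */
enum { SK_ESUCCESS = 0, SK_EMSGSIZE = 1, SK_EEOS = 2, SK_EBADF = 3 };
#define SK_RIGHT_FD_ISATTY (UINT64_C(1) << 0)

typedef uint16_t sk_errno_t;

/* Hypothetical shape of the merged fd_read (single buffer, riflags added). */
typedef sk_errno_t (*sk_fd_read_fn)(int fd, void *buf, size_t len,
                                    uint16_t riflags, size_t *nread);

/* What a POSIX-style wrapper would hand back to the application. */
typedef struct {
    long nbytes;    /* byte count, 0 at end of stream, -1 on error */
    int truncated;  /* nonzero where POSIX would report MSG_TRUNC  */
    int err;        /* sketch error code when nbytes is -1         */
} sk_read_result;

sk_read_result sk_libc_read(sk_fd_read_fn fd_read, int fd, void *buf,
                            size_t len, uint16_t riflags) {
    sk_read_result r = {0, 0, 0};
    size_t nread = 0;
    sk_errno_t e = fd_read(fd, buf, len, riflags, &nread);

    switch (e) {
    case SK_ESUCCESS:            /* includes a genuine zero-length datagram */
        r.nbytes = (long)nread;
        break;
    case SK_EEOS:                /* restore the POSIX "0 means EOF" convention */
        r.nbytes = 0;
        break;
    case SK_EMSGSIZE:            /* assumption: nread is still filled in */
        r.nbytes = (long)nread;
        r.truncated = 1;
        break;
    default:
        r.nbytes = -1;
        r.err = e;
        break;
    }
    return r;
}

/* isatty reduced to a rights check, as the last bullet suggests. */
int sk_isatty(uint64_t rights_base) {
    return (rights_base & SK_RIGHT_FD_ISATTY) != 0;
}

/* Mock hosts, used only to exercise the wrapper. */
sk_errno_t sk_mock_eos(int fd, void *buf, size_t len,
                       uint16_t riflags, size_t *nread) {
    (void)fd; (void)buf; (void)len; (void)riflags;
    *nread = 0;
    return SK_EEOS;
}

sk_errno_t sk_mock_trunc(int fd, void *buf, size_t len,
                         uint16_t riflags, size_t *nread) {
    (void)fd; (void)buf; (void)riflags;
    *nread = len;                /* pretend the buffer was filled completely */
    return SK_EMSGSIZE;
}
```

With the mocks, sk_libc_read(sk_mock_eos, 3, NULL, 0, 0) yields nbytes of 0, and sk_libc_read(sk_mock_trunc, 3, NULL, 8, 0) yields nbytes of 8 with truncated set. The POSIX-level caller keeps the familiar conventions while the WASI-level caller can distinguish the cases by error code.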

And some minor tidying:

  • Rename sock_shutdown to fd_shutdown, and make it a file descriptor operation that happens to depend on the __WASI_RIGHT_SOCK_SHUTDOWN right, which on typical implementations will only get granted for sockets. This is the last remaining sock_* function.
  • Rename __WASI_RIGHT_SOCK_SHUTDOWN to __WASI_RIGHT_FD_SHUTDOWN.
  • Rename __WASI_SOCK_RECV_PEEK and __WASI_SOCK_RECV_WAITALL to say FD_READ instead of SOCK_RECV.

Miscellaneous notes

The change to make fd_read return __WASI_EEOS on end-of-file/stream also fixes an oddity in POSIX in which many applications do an extra read call after the EOF is encountered, in order to get a 0 return from read to confirm they've actually reached the end. That said, implementations on POSIX hosts won't be able to report __WASI_EEOS until they get a 0 from read themselves, so in practice there will still be an extra read on such systems.
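
The point about POSIX hosts can be made concrete with a small sketch. The hs_ names and HS_ codes below are hypothetical placeholders, not real WASI definitions; only pipe, read, write, and close are real POSIX calls. A host-side fd_read built on read(2) can only report end-of-stream after read itself returns 0, so draining a two-byte pipe still costs two host reads.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Placeholder codes; NOT the real WASI values. */
enum { HS_ESUCCESS = 0, HS_EEOS = 1, HS_EIO = 2 };

/* Counts how many times the host's read(2) was invoked. */
static int hs_host_read_calls = 0;

/* Hypothetical host-side fd_read layered on POSIX read(2). */
int hs_fd_read(int fd, void *buf, size_t len, size_t *nread) {
    ssize_t n;

    hs_host_read_calls++;
    n = read(fd, buf, len);
    *nread = 0;

    if (n < 0)
        return HS_EIO;           /* errno mapping omitted in this sketch */
    if (n == 0 && len > 0)
        return HS_EEOS;          /* the host only learns this from a 0 return */
    *nread = (size_t)n;
    return HS_ESUCCESS;
}

/* Writes "hi" into a pipe, closes the write end, and drains the pipe
 * through hs_fd_read. Returns the number of host read(2) calls needed
 * to observe end-of-stream, or -1 if anything unexpected happened. */
int hs_drain_pipe_and_count_reads(void) {
    int p[2];
    char buf[64];
    size_t nread = 0, total = 0;
    int e;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "hi", 2) != 2) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);                 /* no more writers: EOF follows the data */

    hs_host_read_calls = 0;
    while ((e = hs_fd_read(p[0], buf, sizeof buf, &nread)) == HS_ESUCCESS)
        total += nread;

    close(p[0]);
    return (e == HS_EEOS && total == 2) ? hs_host_read_calls : -1;
}
```

Here hs_drain_pipe_and_count_reads() returns 2: one read delivers the data and a second is needed before the host can report end-of-stream. A host with native knowledge of the stream's end could, in principle, skip the second call.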
