Skip to content

std.os read/write functions + sendfile #4611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Mar 3, 2020

This merges #4131 and then does the following:

  • rework os.sendfile and add macosx support, and a fallback
    implementation for any OS.
  • fix sendto compile error
  • std.os write functions support partial writes. closes std.os.writev does not handle partial writes #3443.
  • std.os pread / pwrite functions can now return error.Unseekable.
  • std.fs.File read/write functions now have readAll/writeAll variants
    which loop to complete operations even when partial reads/writes
    happen.
  • Audit std.os read/write functions with respect to Linux returning
    EINVAL for lengths greater than 0x7fff0000.
  • std.os read/write shim functions do not unnecessarily loop. Since
    partial reads/writes are part of the API, the caller will be forced
    to loop anyway, and so that would just be code bloat.
  • Improve doc comments
  • Add a non-trivial test for std.os.sendfile
  • Fix std.os.pread on 32 bit Linux
  • Add missing SYS_sendfile bit on aarch64

terinjokes and others added 2 commits March 2, 2020 12:54
This changset adds a `sendfile(2)` syscall bindings to the linux bits
component. Where available, the `sendfile64(2)` syscall will be
transparently called.

A wrapping function has also been added to the std.os to transform
errno returns to Zig errors.

Change-Id: I86769fc4382c0771e3656e7b21137bafd99a4411
 * rework os.sendfile and add macosx support, and a fallback
   implementation for any OS.
 * fix sendto compile error
 * std.os write functions support partial writes. closes #3443.
 * std.os pread / pwrite functions can now return `error.Unseekable`.
 * std.fs.File read/write functions now have readAll/writeAll variants
   which loop to complete operations even when partial reads/writes
   happen.
 * Audit std.os read/write functions with respect to Linux returning
   EINVAL for lengths greater than 0x7fff0000.
 * std.os read/write shim functions do not unnecessarily loop. Since
   partial reads/writes are part of the API, the caller will be forced
   to loop anyway, and so that would just be code bloat.
 * Improve doc comments
 * Add a non-trivial test for std.os.sendfile
 * Fix std.os.pread on 32 bit Linux
 * Add missing SYS_sendfile bit on aarch64
@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. standard library This issue involves writing Zig code for the standard library. labels Mar 3, 2020
if (amt_read == 0) return off; // EOF
} else unreachable; // TODO https://github.com/ziglang/zig/issues/707
if (std.Target.current.os.tag == .windows) {
// TODO does Windows have a way to read an io vector?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks- I will open an issue

}
pub fn preadv(fd: fd_t, iov: []const iovec, offset: u64) PReadError!usize {
const have_pread_but_not_preadv = switch (std.Target.current.os.tag) {
.windows, .macosx, .ios, .watchos, .tvos => true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

windows has NtReadFileScatter

/// If `iov.len` is larger than will fit in a `u31`, a partial write will occur.
pub fn writev(fd: fd_t, iov: []const iovec_const) WriteError!usize {
if (std.Target.current.os.tag == .windows) {
// TODO does Windows have a way to write an io vector?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

rw: {
var buf: [8 * 4096]u8 = undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use a LinearFifo here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It lets you gracefully handle partial reads and partial writes (which helps in both speed, number of syscalls and correctness)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is intentionally doing exactly one read/write and returning a partial write. The burden of proof is on you to show why LinearFifo would be better than this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry, I thought there was a loop here.
That'll come in the fallback path for copy_file_range later :)

Copy link
Member Author

@andrewrk andrewrk Mar 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even copy_file_range can return a partial write, so I'm not sure that's true- the loopy code is in std.fs.File readAll() and writeAll() related functions. And std.io.OutStream.write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... we do need a loop here. A partial read (non 0) should be followed by another read right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, any looping is redundant since the function can return partial writes, and so the callsite must do looping or itself return partials. In fact, it's actually providing more power and information to return on the first partial read or partial write, since that exposes how the underlying mechanism works. As an example, this is exploited by command line input reading a line from the terminal.

@@ -82,6 +82,7 @@ pub const SYS_pread64 = 67;
pub const SYS_pwrite64 = 68;
pub const SYS_preadv = 69;
pub const SYS_pwritev = 70;
pub const SYS_sendfile = 71;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

o.o how odd that this was missed!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... i made some confused noises out loud when I saw that, lol

const amt_read = try pread(in_fd, buf[0..adjusted_count], in_offset);
if (amt_read == 0) {
if (count == 0) {
// We have detected EOF from `in_fd`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should still write 0 bytes here.

This would have an effect if e.g. the previous "write" was a sendto without MSG_EOR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's value in considering this. Can you make that a separate proposal? I think all the read/write functions would need to be audited if this were the policy.

@andrewrk
Copy link
Member Author

andrewrk commented Mar 3, 2020

🤦‍♂️ CI is failing because / in branch name

@daurnimator
Copy link
Contributor

Will also fail on freebsd due to upstream bug (see my other pr)

@andrewrk
Copy link
Member Author

andrewrk commented Mar 3, 2020

=> #4612

@andrewrk andrewrk closed this Mar 3, 2020
@andrewrk andrewrk deleted the terinjokes-patches/sendfile branch March 3, 2020 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. enhancement Solving this issue will likely involve adding new logic or components to the codebase. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std.os.writev does not handle partial writes
3 participants