Skip to content

Add support for sendmsg(2), recvmsg(2), and cmsg(4) #179

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 28, 2015

Conversation

geofft
Copy link
Contributor

@geofft geofft commented Aug 27, 2015

The best specification for control message layout appears to be RFC 2292, section 4, despite this not being a wire protocol. These definitions have also been checked against glibc 2.19 <bits/socket.h> and Linux 4.0 <linux/socket.h>, and tested on Debian 8.1 and FreeBSD 10.2 x86_64.

The API differs a bit from the cmsg(4) API for type-safety reasons (and also because the cmsg(4) API is terrible). See test/sys/test_socket.rs for an example.

Only supports SCM_RIGHTS at the moment.

Fixes #88.

@geofft
Copy link
Contributor Author

geofft commented Aug 27, 2015

I'm pretty open to review of the approach here; there are a couple of different ways you could go for exposing safe wrappers on cmsg(4), which is mostly complicated 'cause ancillary data is essentially an array of variable-length control messages. I've chosen not to directly expose struct msghdr or a safe equivalent and instead just fold things into the parameters or return type of sendmsg / recvmsg directly, because struct msghdr is pretty annoying with both lifetimes and mutability. (And I assume it dates from ABIs with limits on how many parameters a function could have.) I'd also appreciate review for safety, though I'm pretty confident the exposed interface is safe.

The function copy_bytes is generic enough to live somewhere else. (I sort of wish the stdlib had something like it, to be honest.)

It looks like at least some tests are failing because Rust 1.0.0 is stricter about Sized... if compatibility with 1.0 is important, I can try to rework things. I'm running 1.1.0 locally and it's fine.

@carllerche
Copy link
Contributor

I'm OK w/ bumping to 1.1.0. Feel free to update your PR w/ the .travis.yml change.

@carllerche
Copy link
Contributor

Oh oh.. still not passing CI :)

@geofft
Copy link
Contributor Author

geofft commented Sep 18, 2015

Fixed and rebased on top of master. I was using size_t instead of the conditionally-defined type_of_cmsg_len in cmsg_align(). I also dropped the test check_validity_of_cmsg_align, since it's testing the wrong thing (alignment of cmsghdr, not of its data field).

I wish we had a better way to test these sorts of things, but for an API this gnarly (mostly C preprocessor macros), running it and seeing if it works may be the best option.

The best specification for control message layout appears to be
[RFC 2292, section 4](https://tools.ietf.org/html/rfc2292#section-4),
despite this not being a wire protocol. These definitions have also been
checked against glibc 2.19 <bits/socket.h> and Linux 4.0
<linux/socket.h>, and tested on Debian 8.1 and FreeBSD 10.2 x86_64.

The API differs a bit from the cmsg(4) API for type-safety reasons (and
also because the cmsg(4) API is terrible). See test/sys/test_socket.rs
for an example.

Only supports SCM_RIGHTS at the moment.

Fixes nix-rust#88.
@carllerche carllerche merged commit 6e27e16 into nix-rust:master Sep 28, 2015
@carllerche
Copy link
Contributor

Thanks!

@carllerche
Copy link
Contributor

I had to revert (5948204) this. It looks like the commit caused some memory problems (made evident by jemalloc hanging in some cases).

I ran valgrind, here is the relevant output:

==18310== Syscall param sendmsg(msg.msg_control) points to uninitialised byte(s)
==18310==    at 0x23F3AE: sendmsg (in /usr/lib/system/libsystem_kernel.dylib)
==18310==    by 0x1000093A5: sys::test_socket::test_scm_rights::h648755a0a38bb4523na (<std macros>:86)
==18310==    by 0x10006A9C0: boxed::F.FnBox$LT$A$GT$::call_box::h1438562417738739817 (in target/debug/test-f87014de17f13022)
==18310==    by 0x10006DC1F: boxed::F.FnBox$LT$A$GT$::call_box::h12722119255609577984 (in target/debug/test-f87014de17f13022)
==18310==    by 0x10006B227: rt::unwind::try::try_fn::h4141482984882382853 (in target/debug/test-f87014de17f13022)
==18310==    by 0x1000A2CF8: rust_try_inner (in target/debug/test-f87014de17f13022)
==18310==    by 0x10009F9FD: sys::thread::Thread::new::thread_start::hefa4f0e4c64c9336Yjv (in target/debug/test-f87014de17f13022)
==18310==    by 0xD6771: _pthread_start (in /usr/lib/system/libsystem_c.dylib)
==18310==    by 0xC31A0: thread_start (in /usr/lib/system/libsystem_c.dylib)
==18310==  Address 0x3c1d00c is 12 bytes inside a block of size 16 alloc'd
==18310==    at 0x1000A595D: je_mallocx (jemalloc.c:1560)
==18310==    by 0x10002BD39: heap::imp::allocate::he09e536dcaf3072cJha (heap.rs:244)
==18310==    by 0x10002BC91: heap::allocate::h0eb62c8e5e9e5dbbLca (heap.rs:32)
==18310==    by 0x10002EC1C: vec::Vec$LT$T$GT$::with_capacity::h16089182580315345158 (vec.rs:222)
==18310==    by 0x10002DEDD: sys::socket::sendmsg::h79748e1166d9c6996bg (mod.rs:295)
==18310==    by 0x1000093A5: sys::test_socket::test_scm_rights::h648755a0a38bb4523na (<std macros>:86)
==18310==    by 0x10006A9C0: boxed::F.FnBox$LT$A$GT$::call_box::h1438562417738739817 (in target/debug/test-f87014de17f13022)
==18310==    by 0x10006DC1F: boxed::F.FnBox$LT$A$GT$::call_box::h12722119255609577984 (in target/debug/test-f87014de17f13022)
==18310==    by 0x10006B227: rt::unwind::try::try_fn::h4141482984882382853 (in target/debug/test-f87014de17f13022)
==18310==    by 0x1000A2CF8: rust_try_inner (in target/debug/test-f87014de17f13022)
==18310==    by 0x10009F9FD: sys::thread::Thread::new::thread_start::hefa4f0e4c64c9336Yjv (in target/debug/test-f87014de17f13022)
==18310==    by 0xD6771: _pthread_start (in /usr/lib/system/libsystem_c.dylib)
==18310==

@geofft
Copy link
Contributor Author

geofft commented Sep 28, 2015

Bizarre! I can reproduce this in valgrind, let me see what I did wrong.

@geofft
Copy link
Contributor Author

geofft commented Sep 30, 2015

@carllerche, are you easily able to test/reproduce this this? Can you try cherry-picking geofft/nix-rust@1574408 and seeing if it avoids the problem you see? (My OS X access at the moment is limited, so if you can reproduce both the valgrind problem and the hang reliably, that would be helpful to me.)

It does clear up a similar valgrind warning on Linux, but I think there were two mistakes in this PR (both fixed in that commit), and I think Linux was only running into one of them.

@geofft
Copy link
Contributor Author

geofft commented Oct 6, 2015

I got an OS X valgrind and tested the patch, and it in fact eliminates the valgrind error. Resubmitted as #197.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants