Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File descriptor leak in context constructor #2632

Closed
hoditohod opened this issue Jul 26, 2017 · 2 comments · Fixed by #2636
Closed

File descriptor leak in context constructor #2632

hoditohod opened this issue Jul 26, 2017 · 2 comments · Fixed by #2636

Comments

@hoditohod
Copy link
Contributor

Hi libzmq guys!

Issue

I have a test that spawns a few threads, creates a zmq context in each of them, does some boring stuff, then destroys the context and joins the threads. The test application leaks a few fd's to /dev/urandom in each iteration.

Env

Ubuntu 16.04 64bit
libzmq 4.2.2 + cppzmq

Root cause

The libzmq build I use has the ZMQ_USE_TWEETNACL undefined, so randombytes() directly reads /dev/urandom storing the fd in global static variable here without any protection against concurrent execution. When calling the context constructor concurrently from multiple threads, the scoped_lock_t here is useless because the corresponding mutex is a class member (each instance has one mutex), and it doesn't prevent multiple concurrent context instances calling randombytes().

As a result multiple threads open the file but only one close is issued.

@bluca
Copy link
Member

bluca commented Jul 26, 2017

Thanks for the report, I can see the issue. We'll have to lock&refcount I guess.

In the meanwhile, you can use 4.2.2 built with libsodium with packages for 16.04 from OBS: http://download.opensuse.org/repositories/network:/messaging:/zeromq:/release-stable/xUbuntu_16.04/

bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: check for thread-local storage support in the compiler and
if available use it for the global file descriptor variable used to
open /dev/urandom in the tweetnacl code.
if TLS is not available fallback to wrapping randombytes and
randombytes_close into critical sections using a pthread mutex, to
forcefully serialize open and close.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: check for thread-local storage support in the compiler and
if available use it for the global file descriptor variable used to
open /dev/urandom in the tweetnacl code.
if TLS is not available fallback to wrapping randombytes and
randombytes_close into critical sections using a pthread mutex, to
forcefully serialize open and close.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 27, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 28, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
bluca added a commit to bluca/libzmq that referenced this issue Jul 28, 2017
Solution: add a crypto [de-]initialiser, refcounted and serialised
through critical sections.
This is necessary as utility APIs such as zmq_curve_keypair also
call into the sodium/tweetnacl libraries and need the initialisation
outside of the zmq context.
Also the libsodium documentation explicitly says that sodium_init
must not be called concurrently from multiple threads, which could
have happened until now. Also the randombytes_close function does
not appear to be thread safe either.
This change guarantees that the library is initialised only once at
any given time across the whole program.
Fixes zeromq#2632
@bluca
Copy link
Member

bluca commented Jul 28, 2017

@hoditohod please try again from latest master

bluca added a commit to bluca/libzmq that referenced this issue Mar 19, 2018
Solution: restrict it only to the original issue zeromq#2632, Tweetnacl on
*NIX when using /dev/urandom, ie: without the new Linux getrandom()
syscall.

Existing applications might use atexit to register cleanup functions
(like CZMQ does), and the current change as-is imposes an ordering
that did not exist before - the context MUST be created BEFORE
registering the cleanup with atexit. This is a backward incompatible
change that is reported to cause aborts in some applications.

Although libsodium's documentation says that its initialisation APIs
is not thread-safe, nobody has ever reported an issue with it, so
avoiding the global init/deinit in the libsodium case is the less
risky option we have.

Tweetnacl users on Windows and on Linux with getrandom (glibc 2.25 and
Linux kernel 3.17) are not affected by the original issue.

Fixes zeromq#2991
bluca added a commit to bluca/libzmq that referenced this issue Mar 19, 2018
Solution: restrict it only to the original issue zeromq#2632, Tweetnacl on
*NIX when using /dev/urandom, ie: without the new Linux getrandom()
syscall.

Existing applications might use atexit to register cleanup functions
(like CZMQ does), and the current change as-is imposes an ordering
that did not exist before - the context MUST be created BEFORE
registering the cleanup with atexit. This is a backward incompatible
change that is reported to cause aborts in some applications.

Although libsodium's documentation says that its initialisation APIs
is not thread-safe, nobody has ever reported an issue with it, so
avoiding the global init/deinit in the libsodium case is the less
risky option we have.

Tweetnacl users on Windows and on Linux with getrandom (glibc 2.25 and
Linux kernel 3.17) are not affected by the original issue.

Fixes zeromq#2991
bluca added a commit to bluca/libzmq that referenced this issue Mar 19, 2018
Solution: restrict it only to the original issue zeromq#2632, Tweetnacl on
*NIX when using /dev/urandom, ie: without the new Linux getrandom()
syscall.

Existing applications might use atexit to register cleanup functions
(like CZMQ does), and the current change as-is imposes an ordering
that did not exist before - the context MUST be created BEFORE
registering the cleanup with atexit. This is a backward incompatible
change that is reported to cause aborts in some applications.

Although libsodium's documentation says that its initialisation APIs
is not thread-safe, nobody has ever reported an issue with it, so
avoiding the global init/deinit in the libsodium case is the less
risky option we have.

Tweetnacl users on Windows and on Linux with getrandom (glibc 2.25 and
Linux kernel 3.17) are not affected by the original issue.

Fixes zeromq#2991
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants