Skip to content

WIP: OS Random number generator interface #7619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

jpf91
Copy link
Contributor

@jpf91 jpf91 commented Sep 6, 2020

Note: This PR is based on #7618 and should be merged only after that PR.

An interface to use the operating system's secure random number generator has been added to std.random. The OS RNG is accessed using the osRNDGen global instance and adheres to the std.random RNG interface.

import std.random;
auto secureValue = osRNDGen.dice(70, 20, 10);

This is required to allow generating secure UUIDs in std.uuid. The std.uuid documentation has been updated in this PR to show how to use the secure RNG interface.

Open Issues:

  • How to handle the static initialization required for e.g. the windows RNG? I think we generally want to avoid shared static this and it also does not work for this module anyway (Cyclic dependency between module constructors/destructors of std.random and std.parallelism). In C/C++/GCC druntime, I'd just do lazy initialization guarded by pthread_once or using a statically initialized mutex. But afaik, both options are not available in phobos? How do you guys usually handle initialization of shared resources in phobos?
  • How to do the /dev/urandom implementation? I guess we don't want to reinvent std.stdio.file in Phobos, but this means that the interface can't be nogc, as File is not nogc.... We really need std.io/iopipe in phobos...
  • I've not included the linux getrandom version. Properly versioning this for old kernel versions may be some work and the file based version is required for FreeBSD anyway.

@dlang-bot
Copy link
Contributor

dlang-bot commented Sep 6, 2020

Thanks for your pull request and interest in making D better, @jpf91! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#7619"

@jpf91 jpf91 changed the title WIP: WIP: OS Random generator interface Sep 6, 2020
@jpf91 jpf91 force-pushed the sysRandom branch 4 times, most recently from c1b2bc9 to a94d8c6 Compare September 6, 2020 11:29
@jpf91 jpf91 changed the title WIP: OS Random generator interface WIP: OS Random number generator interface Sep 6, 2020
@wilzbach
Copy link
Member

wilzbach commented Sep 6, 2020

How do you guys usually handle initialization of shared resources in phobos?

I've not included the linux getrandom version. Properly versioning this for old kernel versions may be some work and the file based version is required for FreeBSD anyway.

Here's my initial getrandom implementation for mir: libmir/mir-random#13
It has been changed a bit since, but it boils down to two units.

  1. Check for kernel support
private enum GET_RANDOM {
	UNINTIALIZED,
	NOT_AVAILABLE,
	AVAILABLE,
}

// getrandom was introduced in Linux 3.17
private __gshared GET_RANDOM hasGetRandom = GET_RANDOM.UNINTIALIZED;

import core.sys.posix.sys.utsname : utsname;

// druntime isn't properly annotated
private extern(C) int uname(utsname* __name) @nogc nothrow;

// checks whether the Linux kernel supports getRandom by looking at the
// reported version
private auto initHasGetRandom() @nogc @trusted nothrow
{
	import core.stdc.string : strtok;
	import core.stdc.stdlib : atoi;

	utsname uts;
	uname(&uts);
	char* p = uts.release.ptr;

	// poor man's version check
	auto token = strtok(p, ".");
	int major = atoi(token);
	if (major  > 3)
		return true;

	if (major == 3)
	{
		token = strtok(p, ".");
		if (atoi(token) >= 17)
			return true;
	}

	return false;
}

...

with(GET_RANDOM)
{
	// Linux >= 3.17 has getRandom
	if (hasGetRandom == AVAILABLE)
		return genRandomImplSysBlocking(ptr, len);
	else
		return genRandomImplFileBlocking(ptr, len);
}
  1. Call the kernel
private extern(C) int syscall(size_t ident, size_t n, size_t arg1, size_t arg2) @nogc nothrow;

/*
 * Flags for getrandom(2)
 *
 * GRND_NONBLOCK    Don't block and return EAGAIN instead
 * GRND_RANDOM      Use the /dev/random pool instead of /dev/urandom
 */
private enum GRND_NONBLOCK = 0x0001;
private enum GRND_RANDOM = 0x0002;

version (X86_64)
	private enum GETRANDOM = 318;
version (X86)
	private enum GETRANDOM = 355;

/*
	http://man7.org/linux/man-pages/man2/getrandom.2.html
	If the urandom source has been initialized, reads of up to 256 bytes
	will always return as many bytes as requested and will not be
	interrupted by signals.  No such guarantees apply for larger buffer
	sizes.
*/
private ptrdiff_t genRandomImplSysBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
	while (len > 0)
	{
		auto res = syscall(GETRANDOM, cast(size_t) ptr, len, 0);
		if (res >= 0)
		{
			len -= res;
			ptr += res;
		}
		else
		{
			return res;
		}
	}
	return 0;
}

/*
*   If the GRND_NONBLOCK flag is set, then
*   getrandom() does not block in these cases, but instead
*   immediately returns -1 with errno set to EAGAIN.
*/
private ptrdiff_t genRandomImplSysNonBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
	return syscall(GETRANDOM, cast(size_t) ptr, len, GRND_NONBLOCK);
}

How to do the /dev/urandom implementation? I guess we don't want to reinvent std.stdio.file in Phobos, but this means that the interface can't be nogc, as File is not nogc.... We really need std.io/iopipe in phobos...

Why not use fopen and friends as done for Mir?

import core.stdc.stdio : fclose, feof, ferror, fopen, fread;
alias IOType = typeof(fopen("a", "b"));
private __gshared IOType fdRandom;
private __gshared IOType fdURandom;

///
extern(C) shared static ~this()
{
	if (fdRandom !is null)
		fdRandom.fclose;

	if (fdURandom !is null)
		fdURandom.fclose;
}

/* The /dev/random device is a legacy interface which dates back to a
   time where the cryptographic primitives used in the implementation of
   /dev/urandom were not widely trusted.  It will return random bytes
   only within the estimated number of bits of fresh noise in the
   entropy pool, blocking if necessary.  /dev/random is suitable for
   applications that need high quality randomness, and can afford
   indeterminate delays.
   When the entropy pool is empty, reads from /dev/random will block
   until additional environmental noise is gathered.
*/
private ptrdiff_t genRandomImplFileBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
	if (fdRandom is null)
	{
		fdRandom = fopen("/dev/random", "r");
		if (fdRandom is null)
			return -1;
	}

	while (len > 0)
	{
		auto res = fread(ptr, 1, len, fdRandom);
		len -= res;
		ptr += res;
		// check for possible permanent errors
		if (len != 0)
		{
			if (fdRandom.ferror)
				return -1;

			if (fdRandom.feof)
				return -1;
		}
	}

	return 0;
}

/**
   When read, the /dev/urandom device returns random bytes using a
   pseudorandom number generator seeded from the entropy pool.  Reads
   from this device do not block (i.e., the CPU is not yielded), but can
   incur an appreciable delay when requesting large amounts of data.
   When read during early boot time, /dev/urandom may return data prior
   to the entropy pool being initialized.
*/
private ptrdiff_t genRandomImplFileNonBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
	if (fdURandom is null)
	{
		fdURandom = fopen("/dev/urandom", "r");
		if (fdURandom is null)
			return -1;
	}

	auto res = fread(ptr, 1, len, fdURandom);
	// check for possible errors
	if (res != len)
	{
		if (fdURandom.ferror)
			return -1;

		if (fdURandom.feof)
			return -1;
	}
	return res;
}

@wilzbach
Copy link
Member

wilzbach commented Sep 6, 2020

Note: This PR is based on #7618 and should be merged only after that PR.

Shouldn't it be the other way around? Merge the secure random number generator before issuing a deprecation (as otherwise there's no way to handle the deprecation for the user).

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

Depends on how long it takes to get this into a mergeable state, as implementing these interfaces is more work than simply notifying the user that currently there is no way to securely generate UUIDs in phobos.

randomUUID was never meant to provide UUIDs for cryptographic use and delaying the warning if we cannot get this ready in time is a bad idea.

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

I guess using the C fopen API should be ok, even though I think a recent phobos design goal was to move away from C library apis. But I guess we can still replace this with std.io, if we ever get a new std.io in phobos.

@wilzbach
Copy link
Member

wilzbach commented Sep 6, 2020

But I guess we can still replace this with std.io, if we ever get a new std.io in phobos.

Partly OT: Phobos is in maintenance mode only, so it's much more likely to have a Phobos v2.

than simply notifying the user that currently there is no way to securely generate UUIDs in phobos.

It doesn't help the user if there's no good actionable path and at least with all other deprecations we had the guideline that they should only be triggered if there's a replacement or good migration path.

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 6, 2020

I rewrote the windows code to just initialize a new handle in every thread. According to some stackexchange discussions, this handle is just a pointer to some memory block, so there should be no real overhead when creating multiple of these handles.

I also updated the file code to use fopen & co. The code still uses exceptions for error reporting, so it can't be 100% nogc (maybe with statically allocaed exceptions) / nothrow. But I'm not sure what would be a good, idiomatic alternative.

Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original Mir implementation was only intended to provide a an unpredictableSeed for initialization. If we use this as an engine, more bytes should be fetched per syscall.

std/random.d Outdated

void popFront() @trusted nothrow @nogc
{
if (!CryptGenRandom(_hProvider, front.sizeof, cast(PBYTE) &front))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@jpf91 jpf91 Sep 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean keeping it in a local buffer? I think we can't really return more data, as far as I know the random API expects front to be an integer type.

But what kind of size would be best here? The MSDN docs don't seem to provide that information?

BTW, that's the reason I feel uncomfortable doing all this crypto stuff, you have to double check every single detail and you might still miss something: For example, I'm not sure if we can/should just keep data which is used for cryptographic purposes in "unprotected" buffers (e.g. buffers which can be swapped out). But I guess it's the same for front already and the current RND generator API / range API doesn't allow more safe interfaces (where we only write to a user-specified destination).

Copy link
Contributor Author

@jpf91 jpf91 Sep 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, caching anything locally in the struct also means that everything returned by osRNDGen needs to be somehow by reference, to ensure we never return duplicated data. It could return by reference, but then it's not possible to store it in a local variable.... RefCounted seems a bit overkill. Does Unique work well enough for such cases nowadays? Last time I tried, Unique ranges were often not working in many parts of phobos...

std/random.d Outdated

void popFront() @safe nothrow @nogc
{
arc4random_buf(&front, front.sizeof);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fetch more bytes here:

https://linux.die.net/man/3/arc4random_buf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea how much makes sense here? 64, 128 byte?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should benchmark this, but https://github.com/tcort/arc4random uses 256 by default and that's what my gut feeling suggested as well.

std/random.d Outdated

void popFront() @trusted nothrow
{
auto res = fread(&front, 1, front.sizeof, _file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, more bytes could be fetched.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C IO functions buffer internally though? So I guess the only benefit may be some inlining advantages, but is this worth the increased memory requirements by double buffering?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpf91 is right, fread is using FILE *, and so is buffering already.

std/random.d Outdated
Returns:
A singleton instance of the operating system secure random number generator
+/
@property ref OSRandom osRNDGen() @safe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about defining a buffer size?

Suggested change
@property ref OSRandom osRNDGen() @safe
@property ref OSRandom osRNDGen(size_t bufferSize = 256) @safe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good idea. Though this lazy initialization is weird then, as the parameter will be evaluated only on the first call. There's also the problem that the initialization can actually cause more / different errors than fetching the data. So splitting these calls would make sense, but having initialization functions is not really phobos style...

Alternatively, we could not provide a default instance at all, but I think for this use case, we really want a thread-wide default.

std/random.d Outdated
}

// TLS data is not destructed anyway right now...
version (none)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really an argument to use version (none)?
At least without version (none) the code in the destructor will get checked for errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have to make the osRNDGen return value reference counted then, otherwise the destructor would be called multiple times because of copies. And given that the value is never destructed anyway, this seems like an unnecessary complication. Do RefCounted!T ranges even work properly in all cases?

I could also remove the destructor completely. Only if phobos is used as a shared library (e.g. you load a D plugin into a C host), then you would expect these resources to be properly freed. But it doesn't seem likely destruction of static values will be implemented anytime soon...

@wilzbach
Copy link
Member

wilzbach commented Sep 6, 2020

The code still uses exceptions for error reporting, so it can't be 100% nogc (maybe with statically allocaed exceptions) / nothrow. But I'm not sure what would be a good, idiomatic alternative.

tl;dr: if you want this to be @nogc, at the moment you really only have the options you mentioned:

@jpf91
Copy link
Contributor Author

jpf91 commented Sep 12, 2020

Nice, I didn't know that static Exception implementation is that simple nowadays. Back when I used it in druntime, it needed lots of manual hacks 😄

I'll use that then: In theory on linux, /dev/urandom may not exist in some cases. For example in a chroot, or if udev did not create the file properly.

However, this brings me back to the lazy initialization in osRNDGen: If you wanted to handle such an exception, you'd have to do that everywhere osRNDGen is used. Maybe it's better to split initialization and force users to call it (no global accessible variable). Repeated calls could return a cached value. So the implementation is basically the same, but users are then expected to do this:

try
    auto rng = initOSRNG();
catch ()...
    
auto id1 = rng.randomUUID();
auto id2 = rng.randomUUID();

Instead of this:

try
    auto id1 = osRNDGen.randomUUID();
catch ()...
try
    auto id2 = osRNDGen.randomUUID();
catch ()...

@cym13
Copy link

cym13 commented Sep 12, 2020

FWIW I'm not convinced that adding caching is a good idea here. It certainly makes things faster but it also introduces complexity in something that benefits from being kept simple IMHO. Different pieces of code should focus on different things, and performance should not be the focus on this one: security should. I would rather keep the code free from any unnecessary concern and let end-users cache on their end if they feel the need to.

I think people would expect security over performance in this case and keeping things simple, just exposing the system's RNG to the user without increasing the amount of possible bugs, is the best way to meet that expectation.

An interface to use the operating system's secure random number
generator has been added to `std.random`. The OS RNG is accessed using
the `osRNDGen` global instance and adheres to the `std.random` RNG
interface.
@jpf91 jpf91 force-pushed the sysRandom branch 2 times, most recently from 6bcea66 to ac154c0 Compare September 14, 2020 07:51
@euantorano
Copy link
Contributor

You mention FreeBSD uses /dev/urandom, so I just wanted to draw attention to FreeBSD's getrandom(2). This has been available since FreeBSD 12.

There is also getentropy(3) which is based on getrandom and also appeared in FreeBSD 12.

On Windows, pulling in the CryptAPI is usually advised against unless it's being used elsewhere. Instead, it's generally advisable to use RtlGenRandom instead, which has the added bonus of having a far simpler API. See for example this post.

As a minor note, libsodium is a pretty good reference point.

@RazvanN7
Copy link
Collaborator

@jpf91 What are your plans regarding this PR?

@RazvanN7
Copy link
Collaborator

RazvanN7 commented Jan 7, 2022

ping @jpf91

@jpf91
Copy link
Contributor Author

jpf91 commented Jan 8, 2022

Hi @RazvanN7,

I unfortunately don't really have the time to properly finish this right now. And I think as this involves some API design work and especially API design work regarding critical security aspects, it'd be a bad idea to rush this.

I'll just close this PR for now, but if anyone wants to implement OS RNGs in phobos, feel free to use this as a base to build on. (Also it might make more sense to this as part of Phobos V2?)

@jpf91 jpf91 closed this Jan 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants