-
-
Notifications
You must be signed in to change notification settings - Fork 739
WIP: OS Random number generator interface #7619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for your pull request and interest in making D better, @jpf91! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "master + phobos#7619" |
c1b2bc9
to
a94d8c6
Compare
Here's my initial getrandom implementation for mir: libmir/mir-random#13
private enum GET_RANDOM {
UNINTIALIZED,
NOT_AVAILABLE,
AVAILABLE,
}
// getrandom was introduced in Linux 3.17
private __gshared GET_RANDOM hasGetRandom = GET_RANDOM.UNINTIALIZED;
import core.sys.posix.sys.utsname : utsname;
// druntime isn't properly annotated
private extern(C) int uname(utsname* __name) @nogc nothrow;
// checks whether the Linux kernel supports getRandom by looking at the
// reported version
private auto initHasGetRandom() @nogc @trusted nothrow
{
import core.stdc.string : strtok;
import core.stdc.stdlib : atoi;
utsname uts;
uname(&uts);
char* p = uts.release.ptr;
// poor man's version check
auto token = strtok(p, ".");
int major = atoi(token);
if (major > 3)
return true;
if (major == 3)
{
token = strtok(p, ".");
if (atoi(token) >= 17)
return true;
}
return false;
} ...
private extern(C) int syscall(size_t ident, size_t n, size_t arg1, size_t arg2) @nogc nothrow;
/*
* Flags for getrandom(2)
*
* GRND_NONBLOCK Don't block and return EAGAIN instead
* GRND_RANDOM Use the /dev/random pool instead of /dev/urandom
*/
private enum GRND_NONBLOCK = 0x0001;
private enum GRND_RANDOM = 0x0002;
version (X86_64)
private enum GETRANDOM = 318;
version (X86)
private enum GETRANDOM = 355;
/*
http://man7.org/linux/man-pages/man2/getrandom.2.html
If the urandom source has been initialized, reads of up to 256 bytes
will always return as many bytes as requested and will not be
interrupted by signals. No such guarantees apply for larger buffer
sizes.
*/
private ptrdiff_t genRandomImplSysBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
while (len > 0)
{
auto res = syscall(GETRANDOM, cast(size_t) ptr, len, 0);
if (res >= 0)
{
len -= res;
ptr += res;
}
else
{
return res;
}
}
return 0;
}
/*
* If the GRND_NONBLOCK flag is set, then
* getrandom() does not block in these cases, but instead
* immediately returns -1 with errno set to EAGAIN.
*/
private ptrdiff_t genRandomImplSysNonBlocking(void* ptr, size_t len) @nogc @trusted nothrow
{
return syscall(GETRANDOM, cast(size_t) ptr, len, GRND_NONBLOCK);
}
Why not use
|
Shouldn't it be the other way around? Merge the secure random number generator before issuing a deprecation (as otherwise there's no way to handle the deprecation for the user). |
Depends on how long it takes to get this into a mergeable state, as implementing these interfaces is more work than simply notifying the user that currently there is no way to securely generate UUIDs in phobos. randomUUID was never meant to provide UUIDs for cryptographic use and delaying the warning if we cannot get this ready in time is a bad idea. |
I guess using the C fopen API should be ok, even though I think a recent phobos design goal was to move away from C library apis. But I guess we can still replace this with |
Partly OT: Phobos is in maintenance mode only, so it's much more likely to have a Phobos v2.
It doesn't help the user if there's no good actionable path and at least with all other deprecations we had the guideline that they should only be triggered if there's a replacement or good migration path. |
I rewrote the windows code to just initialize a new handle in every thread. According to some stackexchange discussions, this handle is just a pointer to some memory block, so there should be no real overhead when creating multiple of these handles. I also updated the file code to use fopen & co. The code still uses exceptions for error reporting, so it can't be 100% nogc (maybe with statically allocaed exceptions) / nothrow. But I'm not sure what would be a good, idiomatic alternative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original Mir implementation was only intended to provide a an unpredictableSeed
for initialization. If we use this as an engine, more bytes should be fetched per syscall.
std/random.d
Outdated
|
||
void popFront() @trusted nothrow @nogc | ||
{ | ||
if (!CryptGenRandom(_hProvider, front.sizeof, cast(PBYTE) &front)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we fetch more bytes?
https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptgenrandom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean keeping it in a local buffer? I think we can't really return more data, as far as I know the random API expects front to be an integer type.
But what kind of size would be best here? The MSDN docs don't seem to provide that information?
BTW, that's the reason I feel uncomfortable doing all this crypto stuff, you have to double check every single detail and you might still miss something: For example, I'm not sure if we can/should just keep data which is used for cryptographic purposes in "unprotected" buffers (e.g. buffers which can be swapped out). But I guess it's the same for front already and the current RND generator API / range API doesn't allow more safe interfaces (where we only write to a user-specified destination).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, caching anything locally in the struct also means that everything returned by osRNDGen
needs to be somehow by reference, to ensure we never return duplicated data. It could return by reference, but then it's not possible to store it in a local variable.... RefCounted
seems a bit overkill. Does Unique work well enough for such cases nowadays? Last time I tried, Unique ranges were often not working in many parts of phobos...
std/random.d
Outdated
|
||
void popFront() @safe nothrow @nogc | ||
{ | ||
arc4random_buf(&front, front.sizeof); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can fetch more bytes here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea how much makes sense here? 64, 128 byte?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should benchmark this, but https://github.com/tcort/arc4random uses 256
by default and that's what my gut feeling suggested as well.
std/random.d
Outdated
|
||
void popFront() @trusted nothrow | ||
{ | ||
auto res = fread(&front, 1, front.sizeof, _file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, more bytes could be fetched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The C IO functions buffer internally though? So I guess the only benefit may be some inlining advantages, but is this worth the increased memory requirements by double buffering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpf91 is right, fread
is using FILE *
, and so is buffering already.
std/random.d
Outdated
Returns: | ||
A singleton instance of the operating system secure random number generator | ||
+/ | ||
@property ref OSRandom osRNDGen() @safe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about defining a buffer size?
@property ref OSRandom osRNDGen() @safe | |
@property ref OSRandom osRNDGen(size_t bufferSize = 256) @safe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good idea. Though this lazy initialization is weird then, as the parameter will be evaluated only on the first call. There's also the problem that the initialization can actually cause more / different errors than fetching the data. So splitting these calls would make sense, but having initialization functions is not really phobos style...
Alternatively, we could not provide a default instance at all, but I think for this use case, we really want a thread-wide default.
std/random.d
Outdated
} | ||
|
||
// TLS data is not destructed anyway right now... | ||
version (none) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really an argument to use version (none)
?
At least without version (none)
the code in the destructor will get checked for errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to make the osRNDGen
return value reference counted then, otherwise the destructor would be called multiple times because of copies. And given that the value is never destructed anyway, this seems like an unnecessary complication. Do RefCounted!T ranges even work properly in all cases?
I could also remove the destructor completely. Only if phobos is used as a shared library (e.g. you load a D plugin into a C host), then you would expect these resources to be properly freed. But it doesn't seem likely destruction of static values will be implemented anytime soon...
tl;dr: if you want this to be
|
Nice, I didn't know that static Exception implementation is that simple nowadays. Back when I used it in druntime, it needed lots of manual hacks 😄 I'll use that then: In theory on linux, However, this brings me back to the lazy initialization in try
auto rng = initOSRNG();
catch ()...
auto id1 = rng.randomUUID();
auto id2 = rng.randomUUID(); Instead of this: try
auto id1 = osRNDGen.randomUUID();
catch ()...
try
auto id2 = osRNDGen.randomUUID();
catch ()... |
FWIW I'm not convinced that adding caching is a good idea here. It certainly makes things faster but it also introduces complexity in something that benefits from being kept simple IMHO. Different pieces of code should focus on different things, and performance should not be the focus on this one: security should. I would rather keep the code free from any unnecessary concern and let end-users cache on their end if they feel the need to. I think people would expect security over performance in this case and keeping things simple, just exposing the system's RNG to the user without increasing the amount of possible bugs, is the best way to meet that expectation. |
An interface to use the operating system's secure random number generator has been added to `std.random`. The OS RNG is accessed using the `osRNDGen` global instance and adheres to the `std.random` RNG interface.
6bcea66
to
ac154c0
Compare
You mention FreeBSD uses There is also On Windows, pulling in the CryptAPI is usually advised against unless it's being used elsewhere. Instead, it's generally advisable to use As a minor note, |
@jpf91 What are your plans regarding this PR? |
ping @jpf91 |
Hi @RazvanN7, I unfortunately don't really have the time to properly finish this right now. And I think as this involves some API design work and especially API design work regarding critical security aspects, it'd be a bad idea to rush this. I'll just close this PR for now, but if anyone wants to implement OS RNGs in phobos, feel free to use this as a base to build on. (Also it might make more sense to this as part of Phobos V2?) |
Note: This PR is based on #7618 and should be merged only after that PR.
An interface to use the operating system's secure random number generator has been added to
std.random
. The OS RNG is accessed using theosRNDGen
global instance and adheres to thestd.random
RNG interface.This is required to allow generating secure UUIDs in
std.uuid
. Thestd.uuid
documentation has been updated in this PR to show how to use the secure RNG interface.Open Issues:
shared static this
and it also does not work for this module anyway (Cyclic dependency between module constructors/destructors of std.random and std.parallelism
). In C/C++/GCC druntime, I'd just do lazy initialization guarded bypthread_once
or using a statically initialized mutex. But afaik, both options are not available in phobos? How do you guys usually handle initialization of shared resources in phobos?/dev/urandom
implementation? I guess we don't want to reinvent std.stdio.file in Phobos, but this means that the interface can't be nogc, asFile
is not nogc.... We really needstd.io/iopipe
in phobos...