Commit 6801bf1

Improve documentation (#135)
Signed-off-by: Tom Kaitchuck <Tom.Kaitchuck@gmail.com>
1 parent 8174160 commit 6801bf1

File tree

3 files changed

+164
-37
lines changed

README.md

Lines changed: 11 additions & 9 deletions
@@ -53,18 +53,20 @@ map.insert(56, 78);
5353
The aHash package has the following flags:
5454
* `std`: This enables features which require the standard library. (On by default) This includes providing the utility classes `AHashMap` and `AHashSet`.
5555
* `serde`: Enables `serde` support for the utility classes `AHashMap` and `AHashSet`.
56-
* `compile-time-rng`: Whenever possible aHash will seed hashers with random numbers using the [getrandom](https://github.com/rust-random/getrandom) crate.
57-
This is possible for OS targets which provide a source of randomness. (see the [full list](https://docs.rs/getrandom/0.2.0/getrandom/#supported-targets).)
58-
For OS targets without access to a random number generator, `compile-time-rng` provides an alternative.
56+
* `runtime-rng`: Hashers obtain their seed from randomness provided by the operating system. (On by default)
57+
This is done using the [getrandom](https://github.com/rust-random/getrandom) crate.
58+
* `compile-time-rng`: For OS targets without access to a random number generator, `compile-time-rng` provides an alternative.
5959
If `getrandom` is unavailable and `compile-time-rng` is enabled, aHash will generate random numbers at compile time and embed them in the binary.
6060
This allows for DOS resistance even if there is no random number generator available at runtime (assuming the compiled binary is not public).
61-
This makes the binary non-deterministic, unless `getrandom` is available for the target in which case the flag does nothing.
62-
(If non-determinism is a problem see [constrandom's documentation](https://github.com/tkaitchuck/constrandom#deterministic-builds))
61+
This makes the binary non-deterministic. (If non-determinism is a problem see [constrandom's documentation](https://github.com/tkaitchuck/constrandom#deterministic-builds))
6362

64-
**NOTE:** If `getrandom` is unavailable and `compile-time-rng` is disabled aHash will fall back on using the numeric
65-
value of memory addresses as a source of randomness. This is somewhat strong if ALSR is turned on (it is by default)
66-
but for embedded platforms this will result in weak keys. As a result, it is recommended to use `compile-time-rng` anytime
67-
random numbers will not be available at runtime.
63+
If both `runtime-rng` and `compile-time-rng` are enabled, `runtime-rng` will take precedence and `compile-time-rng` will do nothing.
64+
65+
**NOTE:** If both `runtime-rng` and `compile-time-rng` are disabled, a source of randomness may be provided by the application on startup
66+
using the [ahash::random_state::set_random_source](https://docs.rs/ahash/latest/ahash/random_state/fn.set_random_source.html) method.
67+
If neither flag is set and this is not done, aHash will fall back on using the numeric value of memory addresses as a source of randomness.
68+
This is somewhat strong if ASLR is turned on (it is by default) but for embedded platforms this will result in weak keys.
69+
As a result, it is recommended to use `compile-time-rng` anytime random numbers will not be available at runtime.
6870

6971
## Comparison with other hashers
7072
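
The precedence among the seed sources described in this README hunk can be sketched as a small decision function. This is only a model, not aHash's code: `SeedSource` and `effective_source` are hypothetical names, and checking an application-installed source first is an assumption drawn from the `RandomSource` docs later in this same commit, which describe the built-in sources as the defaults.

```rust
/// Hypothetical labels for the seed sources the README describes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SeedSource {
    /// A source installed by the application at startup.
    Application,
    /// Randomness from the operating system (`runtime-rng`).
    Os,
    /// Random numbers embedded in the binary at build time (`compile-time-rng`).
    CompileTime,
    /// Numeric value of memory addresses; weak on embedded targets.
    MemoryAddress,
}

/// Models which source ends up being used for a given configuration.
/// Assumption: an application-installed source is consulted first.
fn effective_source(
    runtime_rng: bool,
    compile_time_rng: bool,
    app_source_installed: bool,
) -> SeedSource {
    if app_source_installed {
        SeedSource::Application
    } else if runtime_rng {
        // `runtime-rng` takes precedence over `compile-time-rng`.
        SeedSource::Os
    } else if compile_time_rng {
        SeedSource::CompileTime
    } else {
        // Neither flag set and nothing installed: the weak fallback.
        SeedSource::MemoryAddress
    }
}
```

Under this model, a target with no runtime randomness and no `compile-time-rng` lands on `SeedSource::MemoryAddress`, which is the case the NOTE above warns about.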

src/lib.rs

Lines changed: 47 additions & 20 deletions
@@ -1,18 +1,24 @@
1-
//! AHash is a hashing algorithm is intended to be a high performance, (hardware specific), keyed hash function.
2-
//! This can be seen as a DOS resistant alternative to `FxHash`, or a fast equivalent to `SipHash`.
3-
//! It provides a high speed hash algorithm, but where the result is not predictable without knowing a Key.
4-
//! This allows it to be used in a `HashMap` without allowing for the possibility that an malicious user can
1+
//! AHash is a high performance keyed hash function.
2+
//!
3+
//! It is a DOS resistant alternative to `FxHash` or a faster alternative to `SipHash`.
4+
//!
5+
//! It quickly provides a high quality hash where the result is not predictable without knowing the Key.
6+
//! AHash works with `HashMap` to hash keys, but without allowing for the possibility that a malicious user can
57
//! induce a collision.
68
//!
79
//! # How aHash works
810
//!
9-
//! aHash uses the hardware AES instruction on x86 processors to provide a keyed hash function.
10-
//! aHash is not a cryptographically secure hash.
11+
//! When it is available aHash uses the hardware AES instructions to provide a keyed hash function.
12+
//! When it is not, aHash falls back on a slightly slower alternative algorithm.
1113
//!
14+
//! AHash does not have a fixed standard for its output. This allows it to improve over time.
15+
//! But this also means that different computers or computers using different versions of ahash will observe different
16+
//! hash values.
1217
#![cfg_attr(
1318
feature = "std",
1419
doc = r##"
15-
# Example
20+
# Usage
21+
AHash is a drop in replacement for the default implementation of the Hasher trait. To construct a HashMap using aHash as its hasher do the following:
1622
```
1723
use ahash::{AHasher, RandomState};
1824
use std::collections::HashMap;
@@ -25,25 +31,46 @@ map.insert(12, 34);
2531
#![cfg_attr(
2632
feature = "std",
2733
doc = r##"
28-
For convenience, both new-type wrappers and type aliases are provided. The new type wrappers are called called `AHashMap` and `AHashSet`. These do the same thing with slightly less typing.
29-
The type aliases are called `ahash::HashMap`, `ahash::HashSet` are also provided and alias the
30-
std::[HashMap] and std::[HashSet]. Why are there two options? The wrappers are convenient but
31-
can't be used where a generic `std::collection::HashMap<K, V, S>` is required.
34+
For convenience, both new-type wrappers and type aliases are provided.
3235
36+
The new type wrappers are called `AHashMap` and `AHashSet`.
37+
These do the same thing with slightly less typing. (For convenience `From`, `Into`, and `Deref` are provided).
3338
```
3439
use ahash::AHashMap;
3540
36-
let mut map: AHashMap<i32, i32> = AHashMap::with_capacity(4);
41+
let mut map: AHashMap<i32, i32> = AHashMap::new();
3742
map.insert(12, 34);
38-
map.insert(56, 78);
39-
// There are also type aliases provieded together with some extension traits to make
40-
// it more of a drop in replacement for the std::HashMap/HashSet
41-
use ahash::{HashMapExt, HashSetExt}; // Used to get with_capacity()
42-
let mut map = ahash::HashMap::with_capacity(10);
43+
```
44+
45+
For even less typing and better interop with existing libraries which require a `std::collections::HashMap` (such as rayon),
46+
the type aliases [HashMap], [HashSet] are provided. These alias the `std::HashMap` and `std::HashSet` using aHash as the hasher.
47+
48+
```
49+
use ahash::{HashMap, HashMapExt};
50+
51+
let mut map: HashMap<i32, i32> = HashMap::new();
4352
map.insert(12, 34);
44-
let mut set = ahash::HashSet::with_capacity(10);
45-
set.insert(10);
4653
```
54+
Note the import of [HashMapExt]. This is needed for the constructor.
55+
56+
# Directly hashing
57+
58+
Hashers can also be instantiated with `RandomState`. For example:
59+
```
60+
use std::hash::BuildHasher;
61+
use ahash::RandomState;
62+
63+
let hash_builder = RandomState::with_seed(42);
64+
let hash = hash_builder.hash_one("Some Data");
65+
```
66+
### Randomness
67+
68+
To ensure that each map has a unique set of keys aHash needs a source of randomness.
69+
Normally this is just obtained from the OS. (Or via the `compile-time-rng` flag)
70+
71+
If for some reason (such as fuzzing) an application wishes to supply all random seeds manually, this can be done via:
72+
[random_state::set_random_source].
73+
4774
"##
4875
)]
4976
#![deny(clippy::correctness, clippy::complexity, clippy::perf)]
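
The remark in this hunk that `HashMapExt` "is needed for the constructor" follows from how std defines its map: the inherent `HashMap::new` exists only for maps that use std's own `RandomState`, so an alias with any other hasher has to get `new` from a trait. The sketch below reproduces that pattern with std types only. `FixedState`, `FixedMap`, `MapExt` and `build_map` are hypothetical names, and std's `DefaultHasher` stands in for aHash; this is not the crate's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Stand-in for a non-default hasher state (aHash plays this role in the docs above).
type FixedState = BuildHasherDefault<DefaultHasher>;

// A type alias over std's HashMap, so it can be passed anywhere a
// `std::collections::HashMap<K, V, S>` is accepted.
type FixedMap<K, V> = HashMap<K, V, FixedState>;

/// Hypothetical extension trait that supplies the missing constructor.
trait MapExt {
    fn new() -> Self;
}

impl<K, V> MapExt for FixedMap<K, V> {
    fn new() -> Self {
        // `with_hasher` is available for every hasher state, unlike `HashMap::new`.
        HashMap::with_hasher(FixedState::default())
    }
}

/// Builds a small map through the trait-provided constructor.
fn build_map() -> FixedMap<i32, i32> {
    // Resolves to `MapExt::new` because std's inherent `new` does not apply here.
    let mut map: FixedMap<i32, i32> = FixedMap::new();
    map.insert(12, 34);
    map
}
```

Without the trait in scope, `FixedMap::new()` would fail to compile, which mirrors why the example above imports `HashMapExt` alongside the alias.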
@@ -157,7 +184,7 @@ where
157184
/// [AHasher]s in order to hash the keys of the map.
158185
///
159186
/// Generally it is preferable to use [RandomState] instead, so that different
160-
/// hashmaps will have different keys. However if fixed keys are desireable this
187+
/// hashmaps will have different keys. However if fixed keys are desirable this
161188
/// may be used instead.
162189
///
163190
/// # Example

src/random_state.rs

Lines changed: 106 additions & 8 deletions
@@ -118,6 +118,12 @@ cfg_if::cfg_if! {
118118
}
119119
/// A supplier of Randomness used for different hashers.
120120
/// See [set_random_source].
121+
///
122+
/// If [set_random_source] is not called, aHash will default to the best available source of randomness.
123+
/// In order this is:
124+
/// 1. OS provided random number generator (available if the `runtime-rng` flag is enabled which it is by default)
125+
/// 2. Strong compile time random numbers used to permute a static "counter". (available if `compile-time-rng` is enabled. __Enabling this is recommended if `runtime-rng` is not possible__)
126+
/// 3. A static counter that adds the memory address of each [RandomState] created permuted with fixed constants. (Similar to above but with fixed keys)
121127
pub trait RandomSource {
122128
fn gen_hasher_seed(&self) -> usize;
123129
}
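
To make the shape of a seed supplier concrete, here is a minimal sketch of a deterministic, counter-based source, the kind an application might want when fuzzing. It is written against a local stand-in trait (`SeedSupplier`) that copies the single method shown in the hunk above. `CounterSource`, `starting_at` and the stride constant are hypothetical, and the signature of `set_random_source` is not shown in this diff, so the sketch does not call it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Local stand-in with the same method as the `RandomSource` trait in the hunk above.
trait SeedSupplier {
    fn gen_hasher_seed(&self) -> usize;
}

/// Hypothetical deterministic source: hands out a different value on every call.
struct CounterSource {
    next: AtomicUsize,
}

impl CounterSource {
    fn starting_at(start: usize) -> Self {
        CounterSource {
            next: AtomicUsize::new(start),
        }
    }
}

impl SeedSupplier for CounterSource {
    fn gen_hasher_seed(&self) -> usize {
        // Illustrative odd stride: an odd step visits every usize before repeating.
        const STRIDE: usize = 0x9E37_79B9;
        // `fetch_add` returns the previous value and wraps around on overflow.
        self.next.fetch_add(STRIDE, Ordering::Relaxed)
    }
}
```

A source like this yields reproducible seeds, so it gives no DOS resistance; it only illustrates the interface.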
@@ -195,6 +201,16 @@ cfg_if::cfg_if! {
195201
/// [Hasher]: std::hash::Hasher
196202
/// [BuildHasher]: std::hash::BuildHasher
197203
/// [HashMap]: std::collections::HashMap
204+
///
205+
/// There are multiple constructors; each is documented in more detail below:
206+
///
207+
/// | Constructor | Dynamically random? | Seed |
208+
/// |---------------|---------------------|------|
209+
/// |`new` | Each instance unique|_[RandomSource]_|
210+
/// |`generate_with`| Each instance unique|`u64` x 4 + static counter|
211+
/// |`with_seed` | Fixed per process |`u64` + static random number|
212+
/// |`with_seeds` | Fixed |`u64` x 4|
213+
///
198214
#[derive(Clone)]
199215
pub struct RandomState {
200216
pub(crate) k0: u64,
@@ -210,16 +226,26 @@ impl fmt::Debug for RandomState {
210226
}
211227

212228
impl RandomState {
213-
/// Use randomly generated keys
229+
230+
/// Create a new `RandomState` `BuildHasher` using random keys.
231+
///
232+
/// (Each instance will have a unique set of keys).
214233
#[inline]
215234
pub fn new() -> RandomState {
216235
let src = get_src();
217236
let fixed = get_fixed_seeds();
218237
Self::from_keys(&fixed[0], &fixed[1], src.gen_hasher_seed())
219238
}
220239

221-
/// Allows for supplying seeds, but each time it is called the resulting state will be different.
222-
/// This is done using a static counter, so it can safely be used with a fixed keys.
240+
/// Create a new `RandomState` `BuildHasher` based on the provided seeds, but in such a way
241+
/// that each time it is called the resulting state will be different and of high quality.
242+
/// This allows fixed constant or poor quality seeds to be provided without the problem of different
243+
/// `BuildHasher`s being identical or weak.
244+
///
245+
/// This is done via permuting the provided values with the value of a static counter and memory address.
246+
/// (This makes this method somewhat more expensive than `with_seeds` below which does not do this).
247+
///
248+
/// The provided values (k0-k3) do not need to be of high quality but they should not all be the same value.
223249
#[inline]
224250
pub fn generate_with(k0: u64, k1: u64, k2: u64, k3: u64) -> RandomState {
225251
let src = get_src();
@@ -252,7 +278,11 @@ impl RandomState {
252278
RandomState { k0, k1, k2, k3 }
253279
}
254280

255-
/// Allows for explicitly setting a seed to used.
281+
/// Build a `RandomState` from a single key. The provided key does not need to be of high quality,
282+
/// but all `RandomState`s created from the same key will produce identical hashers.
283+
/// (In contrast to `generate_with` above)
284+
///
285+
/// This allows for explicitly setting the seed to be used.
256286
///
257287
/// Note: This method does not require the provided seed to be strong.
258288
#[inline]
@@ -262,9 +292,13 @@ impl RandomState {
262292
}
263293

264294
/// Allows for explicitly setting the seeds to used.
295+
/// All `RandomState`s created with the same set of keys will produce identical hashers.
296+
/// (In contrast to `generate_with` above)
265297
///
266-
/// Note: This method is robust against 0s being passed for one or more of the parameters
267-
/// or the same value being passed for more than one parameter.
298+
/// Note: If DOS resistance is desired one of these should be a decent quality random number.
299+
/// If 4 high quality random numbers are not cheaply available, this method is robust against 0s being passed for
300+
/// one or more of the parameters or the same value being passed for more than one parameter.
301+
/// It is recommended to pass numbers in order from highest to lowest quality (if there is any difference).
268302
#[inline]
269303
pub const fn with_seeds(k0: u64, k1: u64, k2: u64, k3: u64) -> RandomState {
270304
RandomState {
@@ -275,7 +309,36 @@ impl RandomState {
275309
}
276310
}
277311

278-
/// Calculates the hash of a single value.
312+
/// Calculates the hash of a single value. This provides a more convenient (and faster) way to obtain a hash:
313+
/// For example:
314+
#[cfg_attr(
315+
feature = "std",
316+
doc = r##" # Examples
317+
```
318+
use std::hash::BuildHasher;
319+
use ahash::RandomState;
320+
321+
let hash_builder = RandomState::new();
322+
let hash = hash_builder.hash_one("Some Data");
323+
```
324+
"##
325+
)]
326+
/// This is similar to:
327+
#[cfg_attr(
328+
feature = "std",
329+
doc = r##" # Examples
330+
```
331+
use std::hash::{BuildHasher, Hash, Hasher};
332+
use ahash::RandomState;
333+
334+
let hash_builder = RandomState::new();
335+
let mut hasher = hash_builder.build_hasher();
336+
"Some Data".hash(&mut hasher);
337+
let hash = hasher.finish();
338+
```
339+
"##
340+
)]
341+
/// (Note that these two ways to get a hash may not produce the same value for the same data)
279342
///
280343
/// This is intended as a convenience for code which *consumes* hashes, such
281344
/// as the implementation of a hash table or in unit tests that check
@@ -295,6 +358,11 @@ impl RandomState {
295358
}
296359
}
297360

361+
/// Creates an instance of RandomState using keys obtained from the random number generator.
362+
/// Each instance created in this way will have a unique set of keys. (But the resulting instance
363+
/// can be used to create many hashers each of which will have the same keys.)
364+
///
365+
/// This is the same as [RandomState::new()]
298366
impl Default for RandomState {
299367
#[inline]
300368
fn default() -> Self {
@@ -341,7 +409,37 @@ impl BuildHasher for RandomState {
341409
AHasher::from_random_state(self)
342410
}
343411

344-
/// Calculates the hash of a single value.
412+
413+
/// Calculates the hash of a single value. This provides a more convenient (and faster) way to obtain a hash:
414+
/// For example:
415+
#[cfg_attr(
416+
feature = "std",
417+
doc = r##" # Examples
418+
```
419+
use std::hash::BuildHasher;
420+
use ahash::RandomState;
421+
422+
let hash_builder = RandomState::new();
423+
let hash = hash_builder.hash_one("Some Data");
424+
```
425+
"##
426+
)]
427+
/// This is similar to:
428+
#[cfg_attr(
429+
feature = "std",
430+
doc = r##" # Examples
431+
```
432+
use std::hash::{BuildHasher, Hash, Hasher};
433+
use ahash::RandomState;
434+
435+
let hash_builder = RandomState::new();
436+
let mut hasher = hash_builder.build_hasher();
437+
"Some Data".hash(&mut hasher);
438+
let hash = hasher.finish();
439+
```
440+
"##
441+
)]
442+
/// (Note that these two ways to get a hash may not produce the same value for the same data)
345443
///
346444
/// This is intended as a convenience for code which *consumes* hashes, such
347445
/// as the implementation of a hash table or in unit tests that check
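
The two ways of obtaining a hash contrasted in the doc comments above can be compared using std types alone. The sketch below uses std's `BuildHasherDefault<DefaultHasher>` as a fixed-key stand-in, not aHash, and the names `StdFixedState`, `hash_by_hand` and `hash_in_one_call` are hypothetical. For this std builder both paths agree, because the trait's default `hash_one` performs exactly the long-form steps. As the docs above point out, aHash does not make that promise for its own `RandomState`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Stand-in builder with fixed keys (std's default hasher, not aHash).
type StdFixedState = BuildHasherDefault<DefaultHasher>;

/// The long form: build a hasher, feed it the value, read the result.
fn hash_by_hand<S: BuildHasher, T: Hash>(builder: &S, value: T) -> u64 {
    let mut hasher = builder.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// The short form, using the `hash_one` method of std's `BuildHasher` trait.
fn hash_in_one_call<S: BuildHasher, T: Hash>(builder: &S, value: T) -> u64 {
    builder.hash_one(value)
}
```

Because `StdFixedState` has no per-instance keys, two separately constructed builders also hash the same input to the same value, which is the "Fixed" behaviour in the constructor table rather than the "Each instance unique" one.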
