string-interner uses hashbrown with its default-hasher feature, which causes it to use foldhash's foldhash::fast::RandomState to hash strings.
As the name implies, RandomState relies on a one-time source of randomness (stack address, allocation, current time, etc.) to initialize a global variable which is used to create all subsequent hashes; this means that hashes will vary from run to run, which gives a small measure of DoS resistance.
Normally this is not observable when using string-interner. But when string-interner is combined with a runtime code loading approach such as hot-lib-reloader, it will break:
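As a rough illustration of that property, here is a sketch using only the standard library. Note the assumption: std's `RandomState` is a different hasher from foldhash's and is seeded differently, so this is an analogy rather than a reproduction. The helper names `random_pair` and `fixed_pair` are made up for this example.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault};

/// Hash `s` with two independently created, randomly seeded states.
/// (std's RandomState is only a stand-in for foldhash::fast::RandomState.)
fn random_pair(s: &str) -> (u64, u64) {
    let a = RandomState::new();
    let b = RandomState::new();
    (a.hash_one(s), b.hash_one(s))
}

/// Hash `s` with two independently created fixed-state hashers.
/// (A stand-in for a deterministic hasher such as foldhash::fast::FixedState.)
fn fixed_pair(s: &str) -> (u64, u64) {
    let a = BuildHasherDefault::<DefaultHasher>::default();
    let b = BuildHasherDefault::<DefaultHasher>::default();
    (a.hash_one(s), b.hash_one(s))
}
```

The two values from `fixed_pair("hello")` always agree, while the two from `random_pair("hello")` are expected to differ because each state carries its own seed.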
- Create a string-interner and intern some strings via get_or_intern()
- foldhash will set its global variable and calculate hashes which string-interner's hashbrown hashmap will use.
- Load (or reload) some code that uses string-interner
- In that loaded code, try to get() a string that has already been interned
- foldhash will again set its global variable (because the (re)loaded code has its own copy of that global variable) and will return different hashes, which means that string-interner will fail to find the string that has already been interned.
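The steps above can be modeled with a toy interner. This is a sketch of the failure mode only, not string-interner's real internals. Everything in it is hypothetical: `ToyInterner`, `seeded_hash`, and the two seed constants stand in for the interner, foldhash's globally seeded hashing, and the separate copies of the global seed held by the host and by the reloaded library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a hasher whose output depends on a global seed.
fn seeded_hash(global_seed: u64, s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    global_seed.hash(&mut h);
    s.hash(&mut h);
    h.finish()
}

/// Toy interner: strings live in a Vec, and a table maps hash -> symbols.
struct ToyInterner {
    strings: Vec<String>,
    table: BTreeMap<u64, Vec<usize>>,
}

impl ToyInterner {
    fn new() -> Self {
        ToyInterner { strings: Vec::new(), table: BTreeMap::new() }
    }

    /// Look up `s` using whichever seed the calling code sees.
    fn get(&self, seed: u64, s: &str) -> Option<usize> {
        let candidates = self.table.get(&seeded_hash(seed, s))?;
        candidates.iter().copied().find(|&i| self.strings[i] == s)
    }

    /// Return the existing symbol for `s`, or store it under the caller's seed.
    fn get_or_intern(&mut self, seed: u64, s: &str) -> usize {
        if let Some(sym) = self.get(seed, s) {
            return sym;
        }
        let sym = self.strings.len();
        self.strings.push(s.to_owned());
        self.table.entry(seeded_hash(seed, s)).or_default().push(sym);
        sym
    }
}

/// Intern from the "host", then look up from the host and from "reloaded" code.
fn demo_lookup() -> (Option<usize>, Option<usize>) {
    const HOST_SEED: u64 = 1; // the host's copy of the global seed
    const RELOADED_SEED: u64 = 2; // the reloaded library's own copy
    let mut interner = ToyInterner::new();
    interner.get_or_intern(HOST_SEED, "hello");
    (
        interner.get(HOST_SEED, "hello"),
        interner.get(RELOADED_SEED, "hello"),
    )
}
```

In this model the host-side lookup finds the symbol, while the lookup made with the other seed probes a different hash and misses, even though the string is stored in the same interner.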
The workaround/fix is straightforward:
```rust
pub type Symbol = string_interner::symbol::SymbolU32;

// non-deterministic hashing, broken with hot code reloading
pub type AllSymbols = string_interner::StringInterner<
    string_interner::backend::BucketBackend<Symbol>
>;
// (implicitly using foldhash::fast::RandomState as the hasher)

// deterministic hashing, works with hot code reloading
pub type AllSymbols = string_interner::StringInterner<
    string_interner::backend::BucketBackend<Symbol>,
    foldhash::fast::FixedState,
>;
```

but actually figuring out what is going wrong is a bit painful.
Now, one could argue that this is a docs issue upstream: if hashbrown had documented that the hashes it uses are non-deterministic when its default-hasher feature is enabled, I might have figured the issue out a little faster.
But I think there's also a case to be made that foldhash::fast::FixedState (or some other deterministic hasher) is the right default for string-interner, rather than foldhash::fast::RandomState: the minimal DoS resistance you get from RandomState does not seem to be relevant for string interning.