-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for #1989 Replace modulo with fastmodulo in HashTable #55467
Conversation
This commit updates HashTable to use FastMod for faster hashes in 64bit.
Update to HashTable.cs and ThrowHelper.cs to use ThrowHelper where appropriate in HashTable.
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
Fixes renaming issues caused by bucketNumber.
Tagging subscribers to this area: @eiriktsarpalis Issue DetailsThis pull request contains two commits. First commit Replaces modulo with FastMod. The second commit updates thrown exceptions to use ThrowHelper methods.
|
@@ -730,6 +757,9 @@ private void rehash(int newsize) | |||
// 2) Protect against an OutOfMemoryException while allocating this | |||
// new bucket[]. | |||
bucket[] newBuckets = new bucket[newsize]; | |||
#if TARGET_64BIT | |||
_fastModMultiplier = HashHelpers.GetFastModMultiplier((uint)newsize); | |||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not safe. One of the key reasons code still uses Hashtable is its threading guarantees, which enable any number of reads to execute concurrently with one writer. By updating the _fastModMultiplifer here, concurrent readers could start to fail.
This corrects unsafe changes to _fastModMultiplier during rehashing.
@@ -258,7 +262,7 @@ public Hashtable(int capacity) : this(capacity, 1.0f) | |||
public Hashtable(int capacity, float loadFactor) | |||
{ | |||
if (capacity < 0) | |||
throw new ArgumentOutOfRangeException(nameof(capacity), SR.ArgumentOutOfRange_NeedNonNegNum); | |||
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with keeping nameof
as the argument?
@@ -267,13 +271,16 @@ public Hashtable(int capacity, float loadFactor) | |||
|
|||
double rawsize = capacity / _loadFactor; | |||
if (rawsize > int.MaxValue) | |||
throw new ArgumentException(SR.Arg_HTCapacityOverflow, nameof(capacity)); | |||
ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_HTCapacityOverflow, ExceptionArgument.capacity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would prefer it if we kept these enhancements in a separate PR from the fastmod changes.
Do we generally accept these types of enhancements for non-generic collections? I would assume we keep them around primarily for back-compat reasons so any change should require substantial justification. |
In general, that's true. That said, Hashtable is already much more expensive than Dictionary, and it's not clear to me that a) this will meaningfully move the needle on its performance, and b) that even if it does on microbenchmarks, it would meaningfully move the needle on the higher-level APIs that still need to consume it and for which perf could matter. @ErhanAtesoglu, thanks for your contribution here. Can you share an example of a type in the core libraries that still uses Hashtable and where this change measurably helps improve perf of that type? |
It would also be good if you could share microbenchmark results for HashTable too. Please see the dotnet/performance repo which has tests and excellent instructions for getting numbers with and without a change like this. |
With respect to whether to take changes to the legacy collections, another consideration is how much existing code uses them. We can easily gather data on that from Nuget if we wish. I'm going to guess it's considerable in which case I think we'd take a change if it have significant perf at minimal risk/churn. |
FWIW here are the instructions on how to benchmark your own branch: https://github.com/dotnet/performance/blob/main/src/benchmarks/micro/README.md#private-runtime-builds |
Just my quick thoughts. After testing it through the gauntlet I noticed I made some errors along the way. First I was using fastmod on longs which is not possible. Second with the uints I'm still getting test errors on what should be safe uses of FastMod. I have two acceptable lines where the replacement doesn't break any of the tests, but since similar lines do break the tests I'm not confident of these changes. My gut feeling says, leave HashTable alone. |
Ok. Thanks for your efforts here. I'm going to close this. |
This pull request contains two commits. First commit Replaces modulo with FastMod. The second commit updates thrown exceptions to use ThrowHelper methods.