Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaced some Dictionaries and Hashtables with Hashsets #4177

Conversation

ThomasGoulet73
Copy link
Contributor

I replaced some Dictionaries and Hashtables that only use the key as a way to filter out duplicates. I replaced them with HashSet to improve performance.

I used the following code to benchmark it :
https://gist.github.com/ThomasGoulet73/99559a6cd49a04dc70a413e9668eee22

Here are the results (First row is 10 items, second is 1000 and third is 100000) :

Dictionary vs HashSet:

Method enumerable Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
Dictionary Syste(...)ring] [48] 742.2 ns 14.71 ns 13.76 ns 1.00 0.00 0.2270 0.0010 - 1.39 KB
Dictionary Syste(...)ring] [48] 60,340.7 ns 231.42 ns 216.47 ns 81.32 1.42 18.8599 4.6997 - 116.13 KB
Dictionary Syste(...)ring] [48] 6,990,405.3 ns 114,642.24 ns 101,627.36 ns 9,405.37 234.34 796.8750 750.0000 742.1875 10303.12 KB
HashSet Syste(...)ring] [48] 558.8 ns 1.55 ns 1.45 ns 0.75 0.01 0.1907 - - 1.17 KB
HashSet Syste(...)ring] [48] 47,343.1 ns 173.14 ns 161.96 ns 63.81 1.22 14.2822 2.8076 - 87.76 KB
HashSet Syste(...)ring] [48] 5,982,708.5 ns 80,465.49 ns 75,267.47 ns 8,062.82 161.97 703.1250 648.4375 648.4375 7945.69 KB

Hashtable vs HashSet:

Method enumerable Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
HashHashtable Syste(...)ring] [48] 680.6 ns 3.88 ns 3.44 ns 1.00 0.00 0.1326 - - 832 B
HashHashtable Syste(...)ring] [48] 94,554.4 ns 477.01 ns 372.42 ns 138.79 0.76 13.7939 3.4180 - 87472 B
HashHashtable Syste(...)ring] [48] 13,670,183.4 ns 145,574.93 ns 129,048.39 ns 20,086.21 188.35 640.6250 640.6250 640.6250 7244564 B
HashSet Syste(...)ring] [48] 379.2 ns 2.95 ns 2.76 ns 0.56 0.00 0.1287 0.0005 - 808 B
HashSet Syste(...)ring] [48] 36,610.4 ns 152.59 ns 142.74 ns 53.79 0.40 11.5967 2.8687 - 73200 B
HashSet Syste(...)ring] [48] 4,631,696.9 ns 63,645.40 ns 53,146.79 ns 6,801.30 90.29 562.5000 523.4375 523.4375 6038279 B

I think the perf gain is good enough to consider this PR.

Thank you!

@ThomasGoulet73 ThomasGoulet73 requested a review from a team as a code owner February 12, 2021 23:14
@ghost ghost added the PR metadata: Label to tag PRs, to facilitate with triage label Feb 12, 2021
@ghost ghost requested review from fabiant3, ryalanms and SamBent February 12, 2021 23:14
@ThomasGoulet73
Copy link
Contributor Author

Hey @ryalanms, the CI failed and it doesn't seem to be related to my changes. When you have a few minutes, could you run it again ? I don't think external contributors have the permissions to do it.

Thank you!

@ryalanms
Copy link
Member

ryalanms commented Feb 16, 2021

@ThomasGoulet73: Yes, I'll fix this today. A VS version update on the build pool machines can cause a compiler version mismatch with the internal WPF build. The internal WPF build needs to be triggered.

@ryalanms
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ryalanms ryalanms added Performance Performance related issue Queue for test pass labels Feb 17, 2021
Base automatically changed from master to main March 17, 2021 17:38
@ryalanms
Copy link
Member

This and @wjk's changes had to be tested separately from the others in the community test pass.

@ryalanms ryalanms merged commit 52dbcf7 into dotnet:main Mar 27, 2021
@ThomasGoulet73
Copy link
Contributor Author

Thank you @ryalanms!

@ThomasGoulet73 ThomasGoulet73 deleted the replace-dictionary-and-hashtable-to-hashset branch April 8, 2021 22:01
@ghost ghost locked as resolved and limited conversation to collaborators Apr 9, 2022
@ghost ghost assigned ThomasGoulet73 May 9, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Performance Performance related issue PR metadata: Label to tag PRs, to facilitate with triage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants