Specialize Dictionary<,> for string key #116045

EgorBo · 2025-05-27T22:05:21Z

A quick search on grep.app reveals 243 574 exact, case-sensitive matches for Dictionary<. Of these, 172 763 (about 71%) use String as the TKey type; many of the remaining dictionaries use value-types as keys.

Presumably, in most of these cases the default (ordinal) comparer is used, and per @GrabYourPitchforks dictionary rehashing when we switch to the randomized comparer is extremely rare. That suggests it would be worthwhile to specialize Dictionary<string, TValue> for the common ordinal-comparer scenario, a sort of hand-made PGO or generic specialization for ref types.

Under the hood, FindValue on a string-keyed dictionary is always a shared generic method (_Canon). The JIT is unable to devirtualize any call inside it (There was an issue logged about this, but I couldn’t locate it.)

The check I’m proposing is lightweight and should speed up lookups by roughly 50% when using the OrdinalComparer, while incurring only about a 5% slowdown for other reference-keyed dictionaries. With some extra efforts we can, in theory, limit the overhead to String keys only with help of IsKnownConstant(typeof(TKey)) && typeof(TKey) == typeof(string), but my attempt to do so wasn't pretty.

@stephentoub @jkotas @kg - is this something we think is worth doing or the regression is not acceptable?

Bonus idea: constant fold "string literal key".GetNonRandomizedHashCode() in JIT.

EgorBo · 2025-05-27T22:05:56Z

@EgorBot -amd -arm

using System;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public class Bench
{
    static Dictionary<string, int> Dictionary_OC = Enumerable.Range(0, 1000)
        .ToDictionary(i => i.ToString(), i => i);

    static Dictionary<string, int> Dictionary_OCIC = Enumerable.Range(0, 1000)
        .ToDictionary(i => i.ToString(), i => i, StringComparer.OrdinalIgnoreCase);

    [Benchmark]
    [Arguments("1")]
    [Arguments("999")]
    public int Lookup_best(string key) => Dictionary_OC[key];

    [Benchmark]
    [Arguments("1000000")]
    public bool Contains_best(string key) => Dictionary_OC.ContainsKey(key);


    [Benchmark]
    [Arguments("1")]
    [Arguments("999")]
    public int Lookup_worst(string key) => Dictionary_OCIC[key];

    [Benchmark]
    [Arguments("1000000")]
    public bool Contains_worst(string key) => Dictionary_OCIC.ContainsKey(key);
}

dotnet-policy-service · 2025-05-27T22:13:15Z

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

stephentoub · 2025-05-27T22:14:07Z

while incurring only about a 5% slowdown for other reference-keyed dictionaries.

While I'm certainly excited at the idea of a significant improvement for the 70% case, a 5% regression for other reference-type keys seems problematically large. We also have other similar types, like HashSet, ConcurrentDictionary, etc., and it'd be unfortunate if this was constrained to only Dictionary.

kg · 2025-05-27T22:18:29Z

while incurring only about a 5% slowdown for other reference-keyed dictionaries.

While I'm certainly excited at the idea of a significant improvement for the 70% case, a 5% regression for other reference-type keys seems problematically large. We also have other similar types, like HashSet, ConcurrentDictionary, etc., and it'd be unfortunate if this was constrained to only Dictionary.

Agreed. I love the improvement for string keys but 5% for everything else feels like a hard sell. I guess for struct-typed keys we won't be paying the cost, right, since the JIT specializes code for those instead of using Canon? That would mean users of record structs and tuples neither benefit from faster string key lookups nor are penalized by this. Users of record classes would pay the penalty but not benefit, yeah?

Does this add overhead for custom comparers or would those be a way to bypass the overhead here?

EgorBo · 2025-05-27T22:24:04Z

5% regression for other reference-type keys seems problematically large.

Does this add overhead for custom comparers or would those be a way to bypass the overhead here?

So with some acrobatics in JIT, I can limit the overhead to String TKey only (when comparer is not the default one e.g. OrdinalComparerIgnoreCase), presumably it will be even less than 5% (e.g. 3-4%) due to slow HashCode/Equals anyway.

Would that be acceptable?

danmoseley · 2025-05-27T22:38:59Z

What is the remaining cost for the non default string comparer - just the check for it?

But, I'd be inclined to say this is worth it based on the counter factual argument: if we already had such performance we would not reverse it to get a few % in the non default case.

Btw can grep.app be used to guess at how common default comparer is?

jkotas · 2025-05-27T22:40:51Z

speed up lookups by roughly 50%

Are the key lengths used by your benchmark representative of typical real-world use case?

My guess is that the string lengths for typical real-world use cases are quite a bit longer. We may want to use XxHash3 for non-randomized comparer to optimize those cases (see #98838 for the discussion - IIRC, it was motivated by analyzing real workload).

EgorBo · 2025-05-27T23:05:52Z

What is the remaining cost for the non default string comparer - just the check for it?

Yep, presumably it's just two asm instructions (on x64 at least)

Btw can grep.app be used to guess at how common default comparer is?

Not sure, I guess CodeQL might be a better option here

Are the key lengths used by EgorBot/runtime-utils#360 (comment) representative of typical real-world use case?

@jkotas, Can't easily tell, my analysis on some 1P implied they're rather small than large (e.g. not more than 50 chars).
From quick experiments, it seems XxHash32 shines only on large inputs, e.g. check this bechmark. The cutoff is around 40-50 chars (it's a lot lower if we compare against the randomized Marvin)

That's why I tried to optimize the shared generics/virtual call overhead rather than the hash function. Although, I think it won't hurt to re-use some parts of XxHash3 in corelib for long inputs, I've made an attempt a while ago: EgorBo@8ab3135

kg · 2025-05-27T23:23:56Z

5% regression for other reference-type keys seems problematically large.

Does this add overhead for custom comparers or would those be a way to bypass the overhead here?

So with some acrobatics in JIT, I can limit the overhead to String TKey only (when comparer is not the default one e.g. OrdinalComparerIgnoreCase), presumably it will be even less than 5% (e.g. 3-4%) due to slow HashCode/Equals anyway.

Would that be acceptable?

IMO if it's string tkey only that sounds like a reasonable tradeoff. People who care about perf there are using the ordinal comparer, and if they're using something else it's probably for business requirements (case insensitive lookups, etc) where the requirements are more important than being blazing fast.

I do think it's probably worth doing tests with longer string keys like jkotas pointed out though. I would expect the average key length is much higher than 4. Xml tag names, people's full names, filesystem paths, stringified enum values, language names, etc.

stephentoub · 2025-05-28T00:09:23Z

But, I'd be inclined to say this is worth it based on the counter factual argument

There's a difference between making tradeoffs in something that's brand new and making tradeoffs in something that folks have been building on top of for > 20 years.

danmoseley · 2025-05-28T00:21:21Z

But, I'd be inclined to say this is worth it based on the counter factual argument

There's a difference between making tradeoffs in something that's brand new and making tradeoffs in something that folks have been building on top of for > 20 years.

Indeed, a few % could put someone out of SLA, that's why i was only inclined..

EgorBo · 2025-05-28T00:38:24Z

Alternative solution is to inline FindValue when its callsite is not a shared generic (e.g. when indexer is inlined). Hence, FindValue(_Canon) becomes specialized as is.

Instead of AggressiveInlining, we can hack in JIT to be more precise (and check profile weight).
I've kicked off a benchmark here (with long strings): EgorBot/runtime-utils#360 (comment)

AndyAyersMS · 2025-05-28T01:02:43Z

Under the hood, FindValue on a string-keyed dictionary is always a shared generic method (_Canon). The JIT is unable to devirtualize any call inside it (There was an issue logged about this, but I couldn’t locate it.)

Feel free to open a new issue, I think this is the shared-shared case where we don't even add probes today. See eg #108913 (comment) .

It's on my list of things to try and squeeze into .NET 10.

EgorBo · 2025-05-28T02:31:16Z

Feel free to open a new issue, I think this is the shared-shared case where we don't even add probes today. See eg #108913 (comment) .

It's on my list of things to try and squeeze into .NET 10.

Yep, it is that one. I've just filed #116055

EgorBo · 2025-05-28T02:36:32Z

Looks like inlining may already happen naturally with help of PGO even today, I've filed a small BDN-centric improvement for the inliner #116054 that significantly improved lookup in the benchmark: EgorBot/runtime-utils#362

stephentoub · 2025-05-28T02:42:04Z

src/libraries/System.Private.CoreLib/src/System/Collections/Generic/Dictionary.cs

@@ -394,6 +394,7 @@ public virtual void GetObjectData(SerializationInfo info, StreamingContext conte
            }
        }

+        [MethodImpl(MethodImplOptions.AggressiveInlining)]


Won't this frequently result in this whole method being inlined into consumer call sites of methods like Dictionary.ContainsKey? That seems bad.

Yeah, I was not suggesting this as the real fix as I said in #116045 (comment), just wanted to run benchmarks against it, but then I noticed that we already can inline it even without this attribute. I still have a small idea I want to try here (as a workaround for #116055)

github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 27, 2025

dotnet-policy-service bot assigned EgorBo May 27, 2025

EgorBot mentioned this pull request May 27, 2025

Benchmarks for #116045 (EgorBo) EgorBot/runtime-utils#360

Open

jkotas added area-System.Collections and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels May 27, 2025

Mark FindValue as aggressive-inlining

3c7b67a

EgorBo force-pushed the faster-findvalue-string branch from 3332a2a to 3c7b67a Compare May 28, 2025 00:34

stephentoub reviewed May 28, 2025

View reviewed changes

build-analysis bot mentioned this pull request May 28, 2025

System.Net.Quic.Tests.MsQuicTests.WriteTests failed with "System.Net.Quic.QuicException : The connection timed out from inactivity." #105177

Open

Specialize Dictionary<,> for string key #116045

Are you sure you want to change the base?

Specialize Dictionary<,> for string key #116045

Uh oh!

Conversation

EgorBo commented May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EgorBo commented May 27, 2025

Uh oh!

dotnet-policy-service bot commented May 27, 2025

Uh oh!

stephentoub commented May 27, 2025

Uh oh!

kg commented May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EgorBo commented May 27, 2025

Uh oh!

danmoseley commented May 27, 2025

Uh oh!

jkotas commented May 27, 2025

Uh oh!

EgorBo commented May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kg commented May 27, 2025

Uh oh!

stephentoub commented May 28, 2025

Uh oh!

danmoseley commented May 28, 2025

Uh oh!

EgorBo commented May 28, 2025

Uh oh!

AndyAyersMS commented May 28, 2025

Uh oh!

EgorBo commented May 28, 2025

Uh oh!

EgorBo commented May 28, 2025

Uh oh!

stephentoub May 28, 2025

Choose a reason for hiding this comment

Uh oh!

EgorBo May 28, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

EgorBo commented May 27, 2025 •

edited

Loading

kg commented May 27, 2025 •

edited

Loading

EgorBo commented May 27, 2025 •

edited

Loading