Add IAlternateEqualityComparer<ReadOnlySpan<byte>, string> support to StringComparer #106147

Open

Description

Motivation

This benchmark seems to suggest that building a string-based IAlternateEqualityComparer for ReadOnlySpan<byte> keys can have substantial performance and usability benefits over manually converting the key to an intermediate ReadOnlySpan<char>.

We should enhance StringComparer with an IAlternateEqualityComparer<ReadOnlySpan<byte>, string> implementation. I see a couple of potential approaches we could follow:

Approach 1: Factory method in System.Text.Encoding or UTF8Encoding

public partial class Encoding
{
    public virtual EncodedStringComparer GetEncodedStringComparer(StringComparer stringComparer);
}

// Essentially just an intersection type for the two interfaces
public abstract class EncodedStringComparer : IEqualityComparer<string>,
    IAlternateEqualityComparer<ReadOnlySpan<byte>, string>
{
    // Interface implementations
}

Which can then be used as follows:

EncodedStringComparer comparer = Encoding.UTF8.GetEncodedStringComparer(StringComparer.OrdinalIgnoreCase);
Dictionary<string, int> dictionary = new(comparer);
dictionary.GetAlternateLookup<ReadOnlySpan<byte>>(); // Success

Approach 2: hardcoding UTF-8 equality comparison into StringComparer

As the title says, we could assume that UTF-8 is the encoding most people will end up using for ReadOnlySpan<byte> keys and bake it into StringComparer directly:

Dictionary<string, int> dictionary = new(StringComparer.Ordinal);
dictionary.GetAlternateLookup<ReadOnlySpan<byte>>(); // UTF-8 semantics whether you like it or not
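
For reference, here is a rough sketch of what users have to write by hand today to get either behavior, assuming .NET 9 and C# 11 or later. `Utf8OrdinalComparer` and `Utf8LookupDemo` are hypothetical names used only for illustration and are not part of the proposal. Note that this version decodes the bytes into a pooled char buffer on every call, so it only shows the interface shape; a real implementation would presumably hash and compare the UTF-8 bytes directly to get the performance benefit.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not a proposed API): ordinal string comparer that also
// accepts ReadOnlySpan<byte> keys, interpreting them as UTF-8 text.
public sealed class Utf8OrdinalComparer :
    IEqualityComparer<string>,
    IAlternateEqualityComparer<ReadOnlySpan<byte>, string>
{
    public bool Equals(string? x, string? y) => string.Equals(x, y);

    // Hash the chars with the span-based overload so that string keys and
    // decoded UTF-8 keys go through exactly the same hashing routine.
    public int GetHashCode(string obj) => string.GetHashCode(obj.AsSpan());

    public bool Equals(ReadOnlySpan<byte> alternate, string other)
    {
        char[] buffer = ArrayPool<char>.Shared.Rent(Encoding.UTF8.GetMaxCharCount(alternate.Length));
        try
        {
            // Decode to UTF-16, then compare ordinally against the stored key.
            int written = Encoding.UTF8.GetChars(alternate, buffer);
            return buffer.AsSpan(0, written).SequenceEqual(other.AsSpan());
        }
        finally
        {
            ArrayPool<char>.Shared.Return(buffer);
        }
    }

    public int GetHashCode(ReadOnlySpan<byte> alternate)
    {
        char[] buffer = ArrayPool<char>.Shared.Rent(Encoding.UTF8.GetMaxCharCount(alternate.Length));
        try
        {
            int written = Encoding.UTF8.GetChars(alternate, buffer);
            return string.GetHashCode(buffer.AsSpan(0, written));
        }
        finally
        {
            ArrayPool<char>.Shared.Return(buffer);
        }
    }

    // Called by the collection when an alternate key has to be materialized
    // as a real string key (for example when adding through the lookup).
    public string Create(ReadOnlySpan<byte> alternate) => Encoding.UTF8.GetString(alternate);
}

public static class Utf8LookupDemo
{
    // Looks up a UTF-8 encoded key without the caller creating a string.
    public static bool TryFind(ReadOnlySpan<byte> utf8Key, out int value)
    {
        Dictionary<string, int> dictionary = new(new Utf8OrdinalComparer())
        {
            ["content-length"] = 42,
        };

        var lookup = dictionary.GetAlternateLookup<ReadOnlySpan<byte>>();
        return lookup.TryGetValue(utf8Key, out value);
    }
}
```

With this in place, `Utf8LookupDemo.TryFind("content-length"u8, out int value)` goes through the alternate lookup rather than allocating a string for the key. Either approach above would let users drop the hand-written comparer entirely.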

cc @stephentoub @davidfowl
