Add IAlternateEqualityComparer<ReadOnlySpan<byte>, string> support to StringComparer #106147

Open
@eiriktsarpalis

Motivation

This benchmark seems to suggest that building a string-based IAlternateEqualityComparer for ReadOnlySpan<byte> keys can have substantial performance and usability benefits over manually converting the key to an intermediate ReadOnlySpan<char>.
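For context, the manual conversion mentioned above might look roughly like the following sketch. The helper name `Utf8LookupHelper.TryGetValueUtf8` and the 256-char stack threshold are assumptions made for illustration; they are not taken from the benchmark. It relies on the `ReadOnlySpan<char>` alternate lookup that `StringComparer.Ordinal` already supports in .NET 9.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of any proposal): transcodes a UTF-8 key
// to UTF-16 in a temporary buffer, then uses the existing
// ReadOnlySpan<char> alternate lookup.
public static class Utf8LookupHelper
{
    public static bool TryGetValueUtf8(
        Dictionary<string, int>.AlternateLookup<ReadOnlySpan<char>> lookup,
        ReadOnlySpan<byte> utf8Key,
        out int value)
    {
        // Upper bound on the number of UTF-16 code units the key can decode to.
        int maxChars = Encoding.UTF8.GetMaxCharCount(utf8Key.Length);

        // Small keys use the stack; larger ones rent from the shared pool.
        char[]? rented = null;
        Span<char> buffer = maxChars <= 256
            ? stackalloc char[256]
            : (rented = ArrayPool<char>.Shared.Rent(maxChars));
        try
        {
            int written = Encoding.UTF8.GetChars(utf8Key, buffer);
            return lookup.TryGetValue(buffer[..written], out value);
        }
        finally
        {
            if (rented is not null)
            {
                ArrayPool<char>.Shared.Return(rented);
            }
        }
    }
}
```

Every caller has to carry this buffer management around, which is the usability cost a built-in `ReadOnlySpan<byte>` comparer would remove.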

We should enhance StringComparer with an IAlternateEqualityComparer<ReadOnlySpan<byte>, string> implementation. I see a couple of potential approaches we could follow:

Approach 1: Factory method in System.Text.Encoding or UTF8Encoding

public partial class Encoding
{
    public virtual EncodedStringComparer GetEncodedStringComparer(StringComparer stringComparer);
}

// Essentially just an intersection type for the two interfaces
public abstract class EncodedStringComparer : IEqualityComparer<string>,
    IAlternateEqualityComparer<ReadOnlySpan<byte>, string>
{
    // Interface implementations
}

Which can then be used as follows:

EncodedStringComparer comparer = Encoding.UTF8.GetEncodedStringComparer(StringComparer.OrdinalIgnoreCase);
Dictionary<string, int> dictionary = new(comparer);
dictionary.GetAlternateLookup<ReadOnlySpan<byte>>(); // Success
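To make the contract behind such a comparer concrete, here is a minimal sketch of what an ordinal, UTF-8-only implementation could look like. The type name `Utf8OrdinalStringComparer` is hypothetical and the code is not the proposed `EncodedStringComparer`; it only shows that both interfaces can be satisfied without an intermediate char buffer, as long as the string and byte paths hash the same sequence of UTF-16 code units. It assumes that `Rune.DecodeFromUtf8` and `Encoding.UTF8.GetString` substitute U+FFFD for invalid input in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: ordinal comparison only, UTF-8 only.
public sealed class Utf8OrdinalStringComparer :
    IEqualityComparer<string>,
    IAlternateEqualityComparer<ReadOnlySpan<byte>, string>
{
    public bool Equals(string? x, string? y) =>
        string.Equals(x, y, StringComparison.Ordinal);

    // Hash the UTF-16 code units one by one so the byte path can match it.
    public int GetHashCode(string obj)
    {
        var hash = new HashCode();
        foreach (char c in obj) hash.Add(c);
        return hash.ToHashCode();
    }

    // Decode one scalar value at a time and compare against the string.
    public bool Equals(ReadOnlySpan<byte> alternate, string other)
    {
        Span<char> utf16 = stackalloc char[2];
        int i = 0;
        while (!alternate.IsEmpty)
        {
            // Invalid sequences decode to U+FFFD and still consume bytes.
            Rune.DecodeFromUtf8(alternate, out Rune rune, out int consumed);
            alternate = alternate.Slice(consumed);
            int written = rune.EncodeToUtf16(utf16);
            if (other.Length - i < written) return false;
            for (int j = 0; j < written; j++)
            {
                if (other[i + j] != utf16[j]) return false;
            }
            i += written;
        }
        return i == other.Length;
    }

    // Must produce the same hash as GetHashCode(Create(alternate)).
    public int GetHashCode(ReadOnlySpan<byte> alternate)
    {
        var hash = new HashCode();
        Span<char> utf16 = stackalloc char[2];
        while (!alternate.IsEmpty)
        {
            Rune.DecodeFromUtf8(alternate, out Rune rune, out int consumed);
            alternate = alternate.Slice(consumed);
            int written = rune.EncodeToUtf16(utf16);
            for (int j = 0; j < written; j++) hash.Add(utf16[j]);
        }
        return hash.ToHashCode();
    }

    // Materializes a string key when the dictionary needs to store one.
    public string Create(ReadOnlySpan<byte> alternate) =>
        Encoding.UTF8.GetString(alternate);
}
```

A dictionary constructed with `new Utf8OrdinalStringComparer()` can then call `GetAlternateLookup<ReadOnlySpan<byte>>()` and pass a `u8` literal to `TryGetValue`. A culture-aware or case-insensitive variant would need considerably more work than this sketch suggests.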

Approach 2: hardcoding UTF-8 equality comparison into StringComparer

As the heading says: we could simply assume that UTF-8 is the encoding most people will use for ReadOnlySpan<byte> keys and bake it directly into StringComparer:

Dictionary<string, int> dictionary = new(StringComparer.Ordinal);
dictionary.GetAlternateLookup<ReadOnlySpan<byte>>(); // UTF-8 semantics whether you like it or not

cc @stephentoub @davidfowl
