Skip to content

Analyzer for BCL to prevent exposing non-struct GetEnumerator #95250

Open

Description

When collection types are added to the BCL, they usually implement IEnumerable<T> and expose a public GetEnumerator method. The return type of this method is important.

The expected pattern is to expose a public GetEnumerator, which the compiler will use when doing foreach, that returns a struct implementing IEnumerator<T>. This allows the collection to be enumerated with foreach without allocating/boxing an IEnumerator<T>. To satisfy the IEnumerable<T> implementation on the collection, explicit IEnumerable<T>.GetEnumerator() implementations are declared that can simply return a boxed struct enumerator.

Motivation

The API surface is not allowed to change in a breaking manner after being released, therefore it is crucial to design APIs in a sustainable and performant matter. An example that slipped through in .NET 6 is JsonObject.GetEnumerator which was declared in an unusual matter:

public IEnumerator<KeyValuePair<string, JsonNode?>> GetEnumerator();

This method is now forced to allocate, and the JIT also has a harder time optimizing it since it is an interface. It is unfortunate considering that the implementation was simply returning a List<T>.Enumerator, which could instead have been wrapped in a dedicated JsonObject.Enumerator struct.
It would not have been wise to expose List<T>.Enumerator directly because that would expose an implementation detail that should stay internal, which is also a point of consideration for the analyzer.

There is a bit of future-proof API in JsonObject though. The class implements .Keys and .Values of IDictionary<TKey, TValue> as explicitly implemented interface methods, so it's possible to expose dedicated classes in the future (in the spirit of Dictionary<TKey, TValue>.Key/ValueCollection), which in turn can expose struct enumerators which allows efficient enumeration of keys or values directly instead of key-value pairs.

Example

Here is an example of how JsonObject.GetEnumerator() could have been exposed differently:

- public IEnumerator<KeyValuePair<string, JsonNode?>> GetEnumerator() => Dictionary.GetEnumerator();

+ public Enumerator GetEnumerator() => new Enumerator(this);
+ IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator(); // calls the public method
+ IEnumerator IEnumerable.GetEnumerator => GetEnumerator(); // calls the public method
public struct Enumerator : IEnumerator<KeyValuePair<string, JsonNode?>>
{
    private List<KeyValuePair<string, JsonNode?>>.Enumerator _enumerator;
    
    public Enumerator(JsonObject obj) => _enumerator = obj.Dictionary.GetEnumerator();
    
    public KeyValuePair<string, JsonNode?> Current => _enumerator.Current;
    object IEnumerator.Current => Current; // calls the public getter
    
    public void Dispose() => _enumerator.Dispose();
    public bool MoveNext() => _enumerator.MoveNext();

    // Reset is hidden/explicitly implemented on List<T>.Enumerator
    public void Reset() => ((IEnumerator<KeyValuePair<string, JsonNode?>>) _enumerator).Reset();
}

Conclusion

To disallow these kinds of issues to slip into a release, I propose introducing an analyzer for the BCL to:

  • detect declarations of public IEnumerator<T> GetEnumerator() methods. It should hint into exposing a struct enumerator instead due to the aforementioned reasons, or an explicit interface implementation so a public GetEnumerator could be exposed later.

  • detect types implementing IDictionary<TKey, TValue> that expose public .Keys or .Values as interfaces instead of dedicated nested classes. It should hint into exposing a dedicated class, or an explicit interface implementation so public Keys/Values could be exposed later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions