API Proposal: Add type to avoid boxing .NET intrinsic types #28882

@JeremyKuhne

Background and Motivation

Currently there is no way to pass around a heterogeneous set of .NET value types without boxing them into objects or creating a custom wrapper struct. To facilitate low-allocation exchange of value types, we should provide a struct that can carry the most common value types without boxing and can still be stored within other types (including arrays) or on the heap when needed.

ASP.NET and Azure SDK have both expressed a need for this functionality for scenarios such as logging.

The following is an evolved proposal based on feedback from various sources. The original proposal is included below. The key changes are a smaller type, alignment with object semantics, support for any object, and more focused non-boxing support.

Proposed API

public readonly struct Value
{
    public Value(object? value);
    public Value(byte value);
    public Value(byte? value);
    public Value(sbyte value);
    public Value(sbyte? value);
    public Value(bool value);
    public Value(bool? value);
    public Value(char value);
    public Value(char? value);
    public Value(short value);
    public Value(short? value);
    public Value(int value);
    public Value(int? value);
    public Value(long value);
    public Value(long? value);
    public Value(ushort value);
    public Value(ushort? value);
    public Value(uint value);
    public Value(uint? value);
    public Value(ulong value);
    public Value(ulong? value);
    public Value(float value);
    public Value(float? value);
    public Value(double value);
    public Value(double? value);
    public Value(DateTimeOffset value);         // Boxes values that don't fall between the years 1800 and 2250
    public Value(DateTimeOffset? value);        // Boxes values that don't fall between the years 1800 and 2250
    public Value(DateTime value);
    public Value(DateTime? value);
    public Value(ArraySegment<byte> segment);
    public Value(ArraySegment<char> segment);
    // No decimal as it always boxes


    public Type? Type { get; }                      // Type or null if the Value represents null
    public static Value Create<T>(T value);
    public unsafe bool TryGetValue<T>(out T value); // Fastest extraction
    public T As<T>();                               // Throws InvalidCastException if not supported

    // For each type of constructor except `object`:
    public static implicit operator Value(int value) => new(value);
    public static explicit operator int(in Value value) => value.As<int>();
}

Fully working prototype

Usage Examples

public static void Foo(Value value)
{
    Type? type = value.Type;
    if (type == typeof(int))
    {
        int @int = value.As<int>();

        // Any casts that would work with object work with Value

        int? nullable = value.As<int?>();

        object o = value.As<object>();
    }

    if (value.TryGetValue(out long @long))
    {
        // TryGetValue follows the same casting rules as "As"
    }

    // Enums are not boxed if passed through the Create method
    Value dayValue = Value.Create(DayOfWeek.Friday);

    // Does not box (until Now is > 2250)
    Value localTime = DateTimeOffset.Now;
    localTime = Value.Create(DateTimeOffset.Now);
    localTime = new(DateTimeOffset.Now);

    // ArraySegment<char> and ArraySegment<byte> are supported
    Value segment = new ArraySegment<byte>(new byte[2]);

    // Any type can go into a Value, however. Unsupported types box as they do with object.
    Value otherSegment = new(new ArraySegment<int>(new int[1]));
}
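
The comments above say that `As` and `TryGetValue` follow the casting rules of `object`. As a reminder of what those rules are, here is a small sketch that uses plain `object` only (it does not depend on `Value`). `ObjectCastRules` and its members are hypothetical names chosen for illustration.

```csharp
using System;

public static class ObjectCastRules
{
    // Boxing a nullable yields either a boxed T or a null reference,
    // never a boxed Nullable<T>.
    public static Type? BoxedType(int? value)
    {
        object? boxed = value;
        return boxed?.GetType();
    }

    // Unboxing requires the exact underlying type (or its nullable form),
    // so a boxed int cannot be unboxed as a long.
    public static bool CanUnboxAsLong(object boxed)
    {
        try
        {
            _ = (long)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BoxedType(42));            // System.Int32
        Console.WriteLine(BoxedType(null) is null);  // True

        object boxedInt = 42;
        int? asNullable = (int?)boxedInt;            // allowed: int -> int?
        Console.WriteLine(asNullable);               // 42
        Console.WriteLine(CanUnboxAsLong(boxedInt)); // False
    }
}
```

If `Value` mirrors these rules as described, a `Value` holding an `int` would hand back an `int` or `int?`, while asking for a `long` would fail (an exception from `As`, `false` from `TryGetValue`).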

Details

Goals

  • Takes any value
  • Can be stored anywhere
  • Follows object semantics (can't box a nullable, for example)
  • Type is 128 bits on 64-bit platforms
  • Internal data is opaque
  • Does not box intrinsics (outside of decimal)
  • Does not box nullable intrinsics
  • Does not box DateTime or most DateTimeOffset values (local times between 1800 and 2250 are supported)
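
To make the size and opacity goals concrete, here is a minimal sketch of one way a struct could fit in 128 bits on a 64-bit runtime: one object reference plus 8 bytes of raw payload, with private marker objects in the reference slot identifying which primitive the payload holds. This is an assumption about a possible layout, not a description of the prototype; `TinyValue`, `TryGetInt32`, and the marker fields are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical sketch; not the proposed Value type.
public readonly struct TinyValue
{
    // Private marker objects tag which primitive is stored in _bits.
    private static readonly object s_int32 = new();
    private static readonly object s_double = new();

    private readonly object? _object; // a real object, a marker, or null
    private readonly long _bits;      // raw payload for unboxed primitives

    public TinyValue(object? value) { _object = value; _bits = 0; }
    public TinyValue(int value) { _object = s_int32; _bits = value; }
    public TinyValue(double value)
    {
        _object = s_double;
        _bits = BitConverter.DoubleToInt64Bits(value);
    }

    // Reports the stored type, or null when the value represents null.
    public Type? Type =>
        ReferenceEquals(_object, s_int32) ? typeof(int)
        : ReferenceEquals(_object, s_double) ? typeof(double)
        : _object?.GetType();

    public bool TryGetInt32(out int value)
    {
        // Unboxed path: the marker says _bits holds an int.
        if (ReferenceEquals(_object, s_int32)) { value = (int)_bits; return true; }
        // Boxed path: an int that arrived through the object constructor.
        if (_object is int boxed) { value = boxed; return true; }
        value = default;
        return false;
    }
}

public static class TinyValueDemo
{
    public static void Main()
    {
        var v = new TinyValue(42);
        Console.WriteLine(v.Type);                              // System.Int32
        Console.WriteLine(v.TryGetInt32(out int i) && i == 42); // True
        Console.WriteLine(Unsafe.SizeOf<TinyValue>());          // 16 on a 64-bit runtime
    }
}
```

Because the markers are private, callers cannot tell how a value is stored, which keeps the internal data opaque while leaving room to change the encoding later.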

Other benefits

  • As is, it can be implemented out-of-box

Other Possible Names

  • Variant
  • ValueObject
  • ???

Original Proposal

Currently there is no way to pass around a heterogeneous set of .NET value types without boxing them into objects or creating a custom wrapper struct. To facilitate low allocation exchange of value types we should provide a struct that allows passing the information without heap allocations. The canonical example of where this would be useful is in `String.Format`.

Related proposals and sample PRs

Goals

  1. Support intrinsic value types (int, float, etc.)
  2. Support most common value types used in formatting (DateTime)
  3. Have high performance
  4. Balance struct size against type usage frequency
  5. Facilitate "raw" removal of value type data (if you want to force a cast to int, fine)
  6. Provide a mechanism for passing a small collection of Variants via the stack
  7. Allow all types by falling back to boxing
  8. Support low allocation interpolated strings

Non Goals

  1. Support all value types without boxing
  2. Make it work as well on .NET Framework as it does on Core (presuming it's possible in the final design)

Nice to Have

  1. Usable on .NET Framework (currently is)

General Approach

Variant is a struct that contains an object pointer and a "union" struct that allows stashing arbitrary blittable value types (i.e. types that satisfy `where T : unmanaged`) that fit within a specific size constraint.
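
A minimal sketch of that approach, assuming an explicit-layout union limited to two primitives and a small type tag. All names here (`SketchVariant`, `SketchUnion`, `SketchType`) are hypothetical and chosen for illustration; the real type would cover many more cases.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical names throughout; a sketch of the idea, not the proposed type.
public enum SketchType { Object, Int32, Double }

// All fields share offset 0, so the struct is as large as its largest member.
[StructLayout(LayoutKind.Explicit)]
internal struct SketchUnion
{
    [FieldOffset(0)] public int Int32;
    [FieldOffset(0)] public double Double;
}

public readonly struct SketchVariant
{
    private readonly object? _object;
    private readonly SketchUnion _union;
    public readonly SketchType Type;

    public SketchVariant(int value)
    {
        _object = null;
        _union = new SketchUnion { Int32 = value };
        Type = SketchType.Int32;
    }

    public SketchVariant(double value)
    {
        _object = null;
        _union = new SketchUnion { Double = value };
        Type = SketchType.Double;
    }

    public SketchVariant(object? value)
    {
        _object = value;
        _union = default;
        Type = SketchType.Object;
    }

    public bool TryGetValue<T>(out T value) where T : unmanaged
    {
        bool match = (typeof(T) == typeof(int) && Type == SketchType.Int32)
                  || (typeof(T) == typeof(double) && Type == SketchType.Double);
        // Reinterpret the start of the union as T only when the tag matches.
        value = match ? Unsafe.As<SketchUnion, T>(ref Unsafe.AsRef(in _union)) : default;
        return match;
    }

    // "Hard cast": reads the bits without checking the tag.
    public static explicit operator int(in SketchVariant variant) => variant._union.Int32;
}

public static class SketchVariantDemo
{
    public static void Main()
    {
        var v = new SketchVariant(42);
        Console.WriteLine(v.TryGetValue(out int i) && i == 42); // True
        Console.WriteLine(v.TryGetValue(out double d));         // False
        Console.WriteLine((int)v);                              // 42
    }
}
```

The checked path (`TryGetValue`) consults the tag, while the explicit operator reads whatever bits are in the union. Restricting the union to a fixed set of primitive members is what keeps the unchecked read from doing anything worse than returning a meaningless number.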

Sample Usage

// Consuming method
public void Foo(ReadOnlySpan<Variant> data)
{
     foreach (Variant item in data)
     {
         switch (item.Type)
         {
             case VariantType.Int32:
                 // ...
                 break;
         }
     }
}

// Calling method
public void Bar()
{
     var data = Variant.Create(42, true, "Wow");
     Foo(data.ToSpan());

     // Only needed if running on .NET Framework
     data.KeepAlive();
}

Surface Area

namespace System
{
    /// <summary>
    /// <see cref="Variant"/> is a wrapper that avoids boxing common value types.
    /// </summary>
    public readonly struct Variant
    {
        public readonly VariantType Type;

        /// <summary>
        /// Get the value as an object if the value is stored as an object.
        /// </summary>
        /// <param name="value">The value, if an object, or null.</param>
        /// <returns>True if the value is actually an object.</returns>
        public bool TryGetValue(out object value);

        /// <summary>
        /// Get the value as the requested type <typeparamref name="T"/> if actually stored as that type.
        /// </summary>
        /// <param name="value">The value if stored as (T), or default.</param>
        /// <returns>True if the <see cref="Variant"/> is of the requested type.</returns>
        public unsafe bool TryGetValue<T>(out T value) where T : unmanaged;

        // We have explicit constructors for each of the supported types for performance
        // and to restrict Variant to "safe" types. Allowing any struct that would fit
        // into the Union would expose users to issues where bad struct state could cause
        // hard failures like buffer overruns etc.
        public Variant(bool value);
        public Variant(byte value); 
        public Variant(sbyte value);
        public Variant(short value);
        public Variant(ushort value);
        public Variant(int value);
        public Variant(uint value);
        public Variant(long value);
        public Variant(ulong value);
        public Variant(float value);
        public Variant(double value);
        public Variant(decimal value);
        public Variant(DateTime value);
        public Variant(DateTimeOffset value);
        public Variant(Guid value);
        public Variant(object value);

        /// <summary>
        /// Get the value as an object, boxing if necessary.
        /// </summary>
        public object Box();

        // Idea is that you can cast to whatever supported type you want if you're explicit.
        // Worst case is you get default or nonsense values.

        public static explicit operator bool(in Variant variant);
        public static explicit operator byte(in Variant variant);
        public static explicit operator char(in Variant variant);
        public static explicit operator DateTime(in Variant variant);
        public static explicit operator DateTimeOffset(in Variant variant);
        public static explicit operator decimal(in Variant variant);
        public static explicit operator double(in Variant variant);
        public static explicit operator Guid(in Variant variant);
        public static explicit operator short(in Variant variant);
        public static explicit operator int(in Variant variant);
        public static explicit operator long(in Variant variant);
        public static explicit operator sbyte(in Variant variant);
        public static explicit operator float(in Variant variant);
        public static explicit operator TimeSpan(in Variant variant);
        public static explicit operator ushort(in Variant variant);
        public static explicit operator uint(in Variant variant);
        public static explicit operator ulong(in Variant variant);

        public static implicit operator Variant(bool value);
        public static implicit operator Variant(byte value);
        public static implicit operator Variant(char value);
        public static implicit operator Variant(DateTime value);
        public static implicit operator Variant(DateTimeOffset value);
        public static implicit operator Variant(decimal value);
        public static implicit operator Variant(double value);
        public static implicit operator Variant(Guid value);
        public static implicit operator Variant(short value);
        public static implicit operator Variant(int value);
        public static implicit operator Variant(long value);
        public static implicit operator Variant(sbyte value);
        public static implicit operator Variant(float value);
        public static implicit operator Variant(TimeSpan value);
        public static implicit operator Variant(ushort value);
        public static implicit operator Variant(uint value);
        public static implicit operator Variant(ulong value);

        // Common object types
        public static implicit operator Variant(string value);

        public static Variant Create(in Variant variant) => variant;
        public static Variant2 Create(in Variant first, in Variant second) => new Variant2(in first, in second);
        public static Variant3 Create(in Variant first, in Variant second, in Variant third) => new Variant3(in first, in second, in third);
    }

    // Here we could use values where we leverage bit flags to categorize quickly (such as integer values, floating point, etc.)
    public enum VariantType
    {
        Object,
        Byte,
        SByte,
        Char,
        Boolean,
        Int16,
        UInt16,
        Int32,
        UInt32,
        Int64,
        UInt64,
        DateTime,
        DateTimeOffset,
        TimeSpan,
        Single,
        Double,
        Decimal,
        Guid
    }

    // This is an "advanced" pattern we can use to create stack based spans of Variant. Would also create at least a Variant3.
    public readonly struct Variant2
    {
        public readonly Variant First;
        public readonly Variant Second;

        public Variant2(in Variant first, in Variant second);

        // This is for keeping objects rooted on .NET Framework once turned into a Span (similar to GC.KeepAlive(), but avoiding boxing).
        [MethodImpl(MethodImplOptions.NoInlining)]        
        public void KeepAlive();

        public ReadOnlySpan<Variant> ToSpan();
    }
}
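
The `ToSpan` pattern on `Variant2` can be sketched with a simpler stand-in. The example below assumes a runtime that provides `MemoryMarshal.CreateReadOnlySpan` (.NET Core 2.1 or later) and uses two `long` fields in place of two `Variant` fields so that the sequential field layout is not in question. `LongPair` and `LongPairDemo` are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical stand-in for Variant2, holding two longs instead of two Variants.
public readonly struct LongPair
{
    public readonly long First;
    public readonly long Second;

    public LongPair(long first, long second)
    {
        First = first;
        Second = second;
    }

    // Views the two adjacent fields as a span of length 2. The span points
    // into this struct's storage, so the struct must outlive the span.
    public ReadOnlySpan<long> ToSpan() =>
        MemoryMarshal.CreateReadOnlySpan(ref Unsafe.AsRef(in First), 2);
}

public static class LongPairDemo
{
    public static long Sum(ReadOnlySpan<long> items)
    {
        long total = 0;
        foreach (long item in items)
        {
            total += item;
        }
        return total;
    }

    public static void Main()
    {
        var pair = new LongPair(40, 2);        // lives in this stack frame
        Console.WriteLine(Sum(pair.ToSpan())); // 42
    }
}
```

The compiler does not track the lifetime of a span created this way, so the caller is responsible for keeping the struct alive while the span is in use. With object references inside the elements, as in `Variant`, that is presumably the concern the `KeepAlive` method addresses on .NET Framework.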

FAQ

Why "Variant"?

  • It does perform a function "similar" to OLE/COM Variant so the term "fits". Other name suggestions are welcome.

Why isn't Variant a ref struct?

  • Primarily because you can't create a Span of ref structs.
  • We also want to give the ability to store arrays of these on the heap when needed.

What about variadic argument support (__arglist, ArgIterator, etc.)?

  • Short answer: not sufficient. Referred to as "Vararg" in the CLI specification, the current implementation is primarily for C++/CLI. It isn't supported on Core yet and would require significant investment to support the scenarios here reliably and to support non-Windows environments. This would push any solution based on it a long way out and may make down-level support impossible.

What about TypedReference and __makeref, etc.?

  • TypedReference is a ref struct (see above). Variant gives us more implementation flexibility, doesn't rely on undocumented keywords, and is actually faster. (In a simple test of wrapping/unwrapping an int, it is roughly 10-12% faster, depending on inlining.)

Why not support anything that fits?

  • We could in theory, but there would be safety concerns with getting the data back out. To support high performance usage we want to allow hard casts of value data.

How about enums?

  • This one may be worth it and is technically doable. Still investigating...
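
As a rough illustration of why enums are technically doable, the sketch below copies an enum's underlying bytes into a `long` and back without boxing, using `Unsafe.SizeOf` and `Unsafe.As`. `EnumBits` is a hypothetical helper, not part of the proposal, and a real implementation would also need to record the enum's type alongside the bits.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper; EnumBits is not part of the proposal.
public static class EnumBits
{
    // Copies the enum's underlying bytes into a long without boxing.
    public static long ToBits<T>(T value) where T : struct, Enum
    {
        switch (Unsafe.SizeOf<T>())
        {
            case 1: return Unsafe.As<T, byte>(ref value);
            case 2: return Unsafe.As<T, ushort>(ref value);
            case 4: return Unsafe.As<T, uint>(ref value);
            default: return Unsafe.As<T, long>(ref value);
        }
    }

    // Writes the low bytes of 'bits' back into an enum of type T.
    public static T FromBits<T>(long bits) where T : struct, Enum
    {
        T result = default;
        switch (Unsafe.SizeOf<T>())
        {
            case 1: Unsafe.As<T, byte>(ref result) = unchecked((byte)bits); break;
            case 2: Unsafe.As<T, ushort>(ref result) = unchecked((ushort)bits); break;
            case 4: Unsafe.As<T, uint>(ref result) = unchecked((uint)bits); break;
            default: Unsafe.As<T, long>(ref result) = bits; break;
        }
        return result;
    }
}

public static class EnumBitsDemo
{
    public static void Main()
    {
        long bits = EnumBits.ToBits(DayOfWeek.Friday);
        Console.WriteLine(bits);                                // 5
        Console.WriteLine(EnumBits.FromBits<DayOfWeek>(bits));  // Friday
    }
}
```

Since every enum's underlying type is an integer of 1, 2, 4, or 8 bytes, the payload always fits in the same 8 bytes used for the other primitives.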

cc: @jaredpar, @vancem, @danmosemsft, @jkotas, @davidwrighton, @stephentoub

Labels: api-needs-work (API needs work before it is approved, it is NOT ready for implementation), area-System.Runtime