[Proposal]: ReadOnlySpan initialization from static data #5295

@stephentoub

ReadOnlySpan initialization from static data

  • Proposed
  • Prototype: Not Started
  • Implementation: Not Started
  • Specification: Not Started

Summary

Provide a syntax for initializing a ReadOnlySpan<T> from constant data and with guaranteed zero allocation.

Motivation

dotnet/roslyn#24621 added compiler support that translates:

ReadOnlySpan<byte> data = new byte[] { const, values };

into non-allocating code that blits the binary data into the assembly data section and creates a span that points directly to that data. The same optimization is done for:

static ReadOnlySpan<byte> Data => new byte[] { const, values };

We now rely on this throughout dotnet/runtime and elsewhere, as it provides a very efficient means of accessing a collection of constant values with minimal overhead, in a form whose consumption the JIT can optimize very well.

However, there are multiple problems with this:

  1. The optimization only applies to byte-sized primitive T values, namely byte, sbyte, and bool. Specify any other type, and you fall off a massive cliff, as code you were hoping to be allocation-free now allocates a new array on every access.
  2. It's easy to accidentally fall off a similar cliff if at least one of the values turns out to be non-const (or becomes non-const), e.g. if a const value referred to in the initialization is changed elsewhere from a const to a static readonly.
  3. The syntax is confusing, as it looks like it's allocating, and PRs that optimize code from:
private static readonly byte[] s_bytes = new byte[] { ... };

to

private static ReadOnlySpan<byte> Bytes => new byte[] { ... };

are often met with confusion and misinformation.
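
To make the contrast in item 3 concrete, here is a minimal sketch of the two forms side by side. The type and member names (Tables, s_bytes, Bytes, SumArray, SumSpan) are made up for illustration. Both members produce the same values; the difference is that the field holds a heap-allocated array, while the property, for byte-sized constant data, is expected to be lowered to a span over data stored in the assembly, as described above.

```csharp
using System;

static class Tables
{
    // Allocates one array during type initialization and keeps it alive on the heap.
    private static readonly byte[] s_bytes = new byte[] { 1, 2, 3, 4 };

    // Looks like it allocates on every access, but for byte-sized element types
    // with constant values the compiler is expected to emit the data into the
    // assembly and return a span that points directly at it.
    private static ReadOnlySpan<byte> Bytes => new byte[] { 1, 2, 3, 4 };

    public static int SumArray()
    {
        int sum = 0;
        foreach (byte b in s_bytes) sum += b;
        return sum;
    }

    public static int SumSpan()
    {
        int sum = 0;
        foreach (byte b in Bytes) sum += b;
        return sum;
    }
}
```

Nothing in the source text of the Bytes property tells a reader or reviewer which of the two behaviors they will get, which is the source of the confusion described above.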

Detailed design

Add dedicated syntax for creating spans without allocating that:

  1. Doesn't visually look like it's allocating (i.e. avoid use of 'new').
  2. Provides validation with errors if the data isn't provably constant.

As the following syntax fails to compile today:

ReadOnlySpan<byte> data = { 1, 2, 3 };

and

static ReadOnlySpan<byte> Data => { 1, 2, 3 };

they could be co-opted for this purpose.

Introducing this syntax without new T[] doesn't prevent the compiler from continuing to apply the optimization when new T[] is used, but it would guarantee a non-allocating implementation when new T[] is omitted.

This could also tie in with params Span<T>: the syntax for the local variant could blit the data into the assembly if possible, or else fall back to the same implementation it would use for a params Span<T> method argument (assuming that the params syntax doesn't itself fall back to heap allocation).

Implementation-wise, the compiler would use RVA statics whenever possible and fall back to static readonly arrays otherwise.
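
As a rough illustration of the static readonly array fallback, the following hand-written code approximates what such a lowering might look like for a property that, under the proposed syntax, would be written as static ReadOnlySpan<int> Values => { 10, 20, 30 };. This is only a sketch under assumptions: the names Lowered and s_cachedValues are hypothetical, and the actual generated code is not specified by this proposal.

```csharp
using System;

static class Lowered
{
    // Hypothetical stand-in for a compiler-generated cache field:
    // allocated once, then reused by every access.
    private static readonly int[] s_cachedValues = { 10, 20, 30 };

    // Each access wraps the same cached array in a span, so no
    // per-access allocation occurs even without RVA static data.
    public static ReadOnlySpan<int> Values => s_cachedValues;
}
```

With this shape the one-time array allocation remains, but repeated accesses such as Lowered.Values[1] do not allocate.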

Drawbacks

TBD

Alternatives

TBD

Unresolved questions

Related to this, we have prototyped, in both the runtime and the C# compiler, support for extending this optimization to more than just byte-sized primitives. The difficulty with other primitives is endianness, which can be addressed in a manner similar to array initialization: the runtime exposes a helper that either returns the original pointer or sets up a cached copy of the data with its bytes reversed to match the current endianness. There are also multiple fallback code generation options for when that API isn't available. Such improvements to the compiler are related to, but separate from, the improvements to the language syntax for this existing optimization.
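
The endianness handling described here can be sketched in user code. The following is not the prototyped runtime helper; it is an assumption-laden illustration of the idea, with hypothetical names (EndianAwareData, Blob, s_swapped, Values). It assumes the embedded data is stored little-endian, reinterprets it in place on little-endian machines, and otherwise builds and caches a byte-swapped copy once.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

static class EndianAwareData
{
    // Stand-in for the blob embedded in the assembly: the int32 values
    // 1, 2, 3 stored in little-endian byte order (an assumption of this sketch).
    private static ReadOnlySpan<byte> Blob => new byte[]
    {
        1, 0, 0, 0,
        2, 0, 0, 0,
        3, 0, 0, 0,
    };

    // Cache used only on big-endian machines.
    private static int[] s_swapped;

    public static ReadOnlySpan<int> Values
    {
        get
        {
            if (BitConverter.IsLittleEndian)
            {
                // Byte order already matches: reinterpret the data in place.
                // A real implementation would also need to guarantee alignment.
                return MemoryMarshal.Cast<byte, int>(Blob);
            }

            int[] cached = s_swapped;
            if (cached == null)
            {
                // Byte order differs: decode each element once and cache the copy.
                // A race here is benign, since every thread computes the same data.
                ReadOnlySpan<byte> blob = Blob;
                cached = new int[blob.Length / sizeof(int)];
                for (int i = 0; i < cached.Length; i++)
                {
                    cached[i] = BinaryPrimitives.ReadInt32LittleEndian(blob.Slice(i * sizeof(int)));
                }
                s_swapped = cached;
            }
            return cached;
        }
    }
}
```

On either kind of machine, EndianAwareData.Values yields the elements 1, 2, 3; only the big-endian path pays for a one-time copy.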

Design meetings

cc: @jaredpar
