Description
Problem statement
In high-throughput situations it's often desirable to minimise garbage. You often see advice like "don't use LINQ in hot code because it allocates a lot". One reason for this is that query comprehensions are translated to the query pattern using anonymous objects which live on the heap.
var q =
    from x in xs
    from y in ys
    from z in zs
    select x + y + z;
// translates to...
var q = xs
    .SelectMany(x => ys, (x, y) => new { x, y }) // new { x, y } is a reference type that lives on the heap
    .SelectMany(dummy => zs, (dummy, z) => dummy.x + dummy.y + z);
(Of course, in practice dummy will be a transparent identifier.) While dummy will often be short-lived and won't survive the nursery, if your goal is to minimise garbage it's still preferable to avoid allocating it altogether.
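To make the heap-versus-value distinction concrete, here is a minimal sketch that checks whether a type is a value type. The class and method names are illustrative only, and the tuple literal assumes C# 7 or later.

```csharp
using System;

public static class AnonymousTypeCheck
{
    // Reports whether the static type of the argument is a value type.
    public static bool IsValueType<T>(T value) => typeof(T).IsValueType;

    public static bool AnonymousObjectIsValueType()
    {
        var anon = new { x = 1, y = 2 };   // compiler-generated class
        return IsValueType(anon);
    }

    public static bool TupleIsValueType()
    {
        var tuple = (x: 1, y: 2);          // System.ValueTuple<int, int>
        return IsValueType(tuple);
    }
}
```

AnonymousObjectIsValueType() returns false because anonymous types are compiled to classes, while TupleIsValueType() returns true.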
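You can achieve this by writing your query manually and storing intermediate variables in a custom value type. The example below allocates no intermediate object per (x, y) pair, so the garbage attributable to dummy disappears (the iterators and enumerators created by SelectMany itself remain):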
var q = xs
    .SelectMany(x => ys, (x, y) => new MyStruct(x, y))
    .SelectMany(dummy => zs, (dummy, z) => dummy.x + dummy.y + z);
// or
var q =
    from dummy in (
        from x in xs
        from y in ys
        select new MyStruct(x, y)
    )
    from z in zs
    select dummy.x + dummy.y + z;
struct MyStruct
{
    public int x { get; }
    public int y { get; }
    public MyStruct(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}
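One way to check the difference is to compare the bytes allocated while enumerating each form of the query. The sketch below is a rough measurement, not a benchmark: Pair, MeasureBytes, AnonymousBytes and StructBytes are hypothetical names, and it assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (.NET Core 3.0+ or .NET Framework 4.8).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllocationComparison
{
    // Hypothetical stand-in for MyStruct above.
    public struct Pair
    {
        public int X { get; }
        public int Y { get; }
        public Pair(int x, int y) { X = x; Y = y; }
    }

    // Builds the query, enumerates it fully, and returns the number of
    // bytes allocated on the current thread while doing so.
    public static long MeasureBytes(Func<IEnumerable<int>> makeQuery)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        foreach (var item in makeQuery()) { }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Intermediate pairs are anonymous objects (one heap allocation per pair).
    public static long AnonymousBytes(int[] xs, int[] ys, int[] zs) =>
        MeasureBytes(() => xs
            .SelectMany(x => ys, (x, y) => new { x, y })
            .SelectMany(p => zs, (p, z) => p.x + p.y + z));

    // Intermediate pairs are structs (no per-pair heap allocation).
    public static long StructBytes(int[] xs, int[] ys, int[] zs) =>
        MeasureBytes(() => xs
            .SelectMany(x => ys, (x, y) => new Pair(x, y))
            .SelectMany(p => zs, (p, z) => p.X + p.Y + z));
}
```

Both versions still pay for the enumerators that SelectMany requests from ys and zs, so the struct version is not allocation-free; the gap between the two numbers should grow with the number of (x, y) pairs.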
When your query is long or complicated this translation gets rather tedious rather quickly (although C# 7's new ValueTuple certainly eases some of the pain). I'd like to be able to use the nice original query syntax but be confident that it won't allocate a lot at run time.
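For comparison, the ValueTuple variant of the manual rewrite might look like the sketch below. The wrapper class and method are hypothetical, and the element names are written out explicitly so the code does not depend on the name inference added in C# 7.1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueTupleQuery
{
    // Hypothetical helper: xs, ys and zs play the same roles as in the queries above.
    public static IEnumerable<int> Sum3(
        IEnumerable<int> xs, IEnumerable<int> ys, IEnumerable<int> zs)
    {
        return
            from pair in (
                from x in xs
                from y in ys
                select (x: x, y: y)   // System.ValueTuple<int, int>, a struct
            )
            from z in zs
            select pair.x + pair.y + z;
    }
}
```

This removes the need to declare MyStruct by hand, but the nested query still has to be written manually for every intermediate range variable.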
Proposed solution
My proposal is to (optionally) translate the original query into one which looks like the manually-written version, by generating MyStruct at compile time, much like how anonymous objects already work.
It's not always desirable to use value types - it can be expensive to copy large value types around, and existing query providers may not understand expressions that don't use anonymous objects. So I propose having this behaviour disabled by default. Users can enable the value-type translation on a per-method level using an attribute:
[StructQueries] // all query comprehensions in this method will use an anonymous value type for their intermediate identifiers
public void MyMethod()
{
    var q =
        from x in xs
        from y in ys
        from z in zs
        select x + y + z;
}
// translates to...
[StructQueries]
public void MyMethod()
{
    var q = xs
        .SelectMany(x => ys, (x, y) => new <>AnonymousStruct0(x, y))
        .SelectMany(dummy => zs, (dummy, z) => dummy.x + dummy.y + z);
}
[CompilerGenerated]
struct <>AnonymousStruct0
{
    public int x { get; }
    public int y { get; }
    public <>AnonymousStruct0(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}