Vectors
FlatBuffers supports vectors (lists) through the general syntax of:

```
table SomeTable
{
    SomeVector:[SomeType];
}
```
FlatSharp supports the following types for vectors:

- `IList<T>` / `IReadOnlyList<T>`
- `T[]`
- `Memory<byte>` / `ReadOnlyMemory<byte>`
This page attempts to provide some detail on when it is appropriate to choose each type of vector. The general guidance, however, is to use `IList<T>` and `IReadOnlyList<T>`, which provide a balance between performance and the principle of least surprise.
However, there are always exceptions:

- You need key/value lookups. In this case, refer to Indexed Vectors.
- You need to map a chunk of the input buffer as raw bytes without copying (perhaps a nested FlatBuffer or a large vector). In this case, use a `Memory<byte>` vector, which will generally point to a location in the input buffer without any copies.
- The vectors are known to be small and/or fast access is very important. In these cases, an array may be useful.
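As a sketch of how these choices surface in a schema: the `fs_vector` attribute selects the CLR type for a vector. The `"Memory"` value appears later on this page; the table name and the `"Array"` value below are assumptions for illustration.

```
table Telemetry
{
    // Default: exposed as IList<T> / IReadOnlyList<T>.
    Samples:[int];

    // Hypothetical: request an array-typed vector for fast indexed access.
    Weights:[float] (fs_vector:"Array");

    // Raw bytes exposed as Memory<byte>, pointing into the input buffer.
    Blob:[ubyte] (fs_vector:"Memory");
}
```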
The discussion on the rest of this page is closely related to the concept of Deserialization Modes. The reader is assumed to have digested the contents of that article before reading the rest of this one.
By virtue of being interfaces, `IList<T>` and `IReadOnlyList<T>` give FlatSharp lots of flexibility to satisfy the requested deserialization option in a non-surprising way. FlatSharp provides an implementation of `IList<T>` that sits directly on top of an `IInputBuffer` and allows lazy index-based access.
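Under `Lazy` mode this has a visible consequence: each index access instantiates a fresh element. A sketch, assuming a generated FlatSharp table `SomeTable` with a vector property `Items` (both names are hypothetical):

```csharp
// Parse with a Lazy-configured serializer.
var parsed = FlatBufferSerializer.Parse<SomeTable>(buffer);

var first = parsed.Items[0];
var again = parsed.Items[0];

// In Lazy mode each access created a new T(), so these are distinct objects:
bool same = object.ReferenceEquals(first, again); // false under Lazy
```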
| Deserialization Mode | Behavior | Actual Type |
|---|---|---|
| `Lazy` | Elements are instantiated on demand. A new `T()` is created for each element accessed. | `FlatBufferList<T>` |
| `PropertyCache` | Elements are instantiated on demand. A new `T()` is created for each element accessed. This is to avoid large array allocations. | `FlatBufferList<T>` |
| `VectorCache` | A new vector is allocated and initialized the first time the vector property is accessed. The elements of the vector access their data according to `VectorCache` rules. | `ReadOnlyCollection<T>` |
| `VectorCacheMutable` | Same as `VectorCache`. | `List<T>` |
| `Greedy` | A new vector is allocated and recursively initialized at deserialization time. The elements of the vector are greedily initialized. | `ReadOnlyCollection<T>` |
| `GreedyMutable` | Same as `Greedy`. | `List<T>` |
Lists are great choices for nearly all scenarios, match developer expectations about deserialization behavior, and are fast enough for most cases. Unless the buffer contains binary vectors or small vectors that need very fast access, it is recommended to use lists.
Arrays are the simplest vector type. However, arrays do not allow FlatSharp to be lazy about initialization, since they cannot be subclassed. This leads to sometimes-surprising behavior when using arrays. Deserialization of an array is always "greedy" with respect to the array itself, though the array's elements still obey the Deserialization Mode's rules about how lazy or greedy to be.
| Deserialization Mode | Behavior |
|---|---|
| `Lazy` | A new array is allocated and fully initialized each time the vector property is accessed. The elements of the array access their data lazily. |
| `PropertyCache` | A new array is allocated and initialized the first time the vector property is accessed. The elements of the array access their data according to `PropertyCache` rules. |
| `VectorCache` | A new array is allocated and initialized the first time the vector property is accessed. The elements of the array access their data according to `VectorCache` rules. |
| `Greedy` | A new array is allocated and recursively initialized at deserialization time. The elements of the array are greedily initialized. |
When to consider arrays:

- You need the fastest possible access to a vector of data. The CLR overhead for accessing array members is nearly nothing. Be sure to measure the overhead of `IList` first.
- When using `Greedy` deserialization. The vector is going to be allocated no matter what in this mode, so arrays can make sense.
When not to use arrays:

- You are using the `Lazy` deserialization option. This will force a new array allocation each time the vector is accessed through the table.
- The vectors are very large. Arrays must be allocated all at once and can't be initialized lazily like `IList`.
Other notes:

Arrays in .NET are mutable. However, any changes will not be reflected back into the source buffer. When using Deserialization Modes other than `Lazy`, array mutations will be visible to other readers of the same parsed object (but are not written to the buffer).
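The per-access allocation under `Lazy` can be observed directly. A sketch, again assuming hypothetical generated types `SomeTable` / `Items`, where `Items` is declared as an array vector (`T[]`):

```csharp
// Parse with a Lazy-configured serializer.
var parsed = FlatBufferSerializer.Parse<SomeTable>(buffer);

// Under Lazy, each property access allocates and fills a brand-new array:
bool sameArray = object.ReferenceEquals(parsed.Items, parsed.Items); // false

// Under VectorCache or Greedy, the array is created once and cached,
// so the same reference is returned on each access.
```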
FlatSharp exposes a final kind of vector: `Memory<byte>` and `ReadOnlyMemory<byte>`. These two are special because they allow returning a reference into the `IInputBuffer` used to deserialize the original object.
```
table Packet {
    Source:string;
    Destination:string;
    MessageKind:string;
    NestedFlatBuffer:[ubyte] (fs_vector:"Memory");
}
```
When accessing `packet.NestedFlatBuffer`, the `Memory<byte>` that comes back will reference into the original input buffer.
```csharp
byte[] message = ...;
var parsed = FlatBufferSerializer.Parse<Packet>(message);

// This points into the original "message" array. No copies necessary!
var payload = FlatBufferSerializer.Parse<SomePayload>(parsed.NestedFlatBuffer);
```
When to consider `Memory<byte>`:

Your vector carries large binary payloads, such as files, compressed data, images, or nested FlatBuffers, that you wish to avoid copying. `Memory<byte>` provides the ultimate in efficiency by pointing into the original buffer.
When to avoid `Memory<byte>`:

Because `Memory<byte>` is a pointer into the original input buffer, any modifications made through it will be written back to that buffer. This can be great for some scenarios, but may lead to erroneous behavior if the developer is unaware of this quirk. Using `Greedy` deserialization will create a copy of the input data instead.
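This write-through behavior is simply how `Memory<byte>` works in .NET. A self-contained sketch, with no FlatSharp involved, shows a `Memory<byte>` view sharing storage with its backing array:

```csharp
using System;

byte[] buffer = { 1, 2, 3, 4 };

// AsMemory creates a view over 'buffer', not a copy.
Memory<byte> view = buffer.AsMemory(1, 2);

// Writing through the view mutates the original array.
view.Span[0] = 42;

Console.WriteLine(buffer[1]); // prints 42
```

A `Memory<byte>` returned by a lazily parsed FlatSharp object behaves the same way with respect to the input buffer.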