## Description
### Background and motivation
Currently, AI libraries in the .NET ecosystem (e.g., OpenAI, Azure AI Search) use `ReadOnlyMemory<float>` to represent embedding vectors. However, embeddings can use narrower types such as `int8`, `int16`, or `float16`, which consume less memory and so provide both cost and performance benefits. This proposal introduces a versatile container for embeddings that can hold various scalar types, enabling more efficient memory usage and broader interoperability among services (e.g., retrieving vectors from OpenAI and storing them in a vector database such as Azure AI Search).
### API Proposal

```csharp
// package: ?
namespace System.Numerics; // another option: System.AI

public abstract class EmbeddingVector
{
    public virtual EmbeddingVector<T> To<T>();
    public static EmbeddingVector FromJson(ReadOnlyMemory<byte> utf8Json);
    public static EmbeddingVector FromBase64(ReadOnlyMemory<byte> utf8Base64);
    public static EmbeddingVector<T> FromScalars<T>(ReadOnlyMemory<T> scalars);

    // possible additions:
    // public string ModelName { get; protected set; }
    // public string Precision { get; protected set; }
    // public abstract int Length { get; }

    public abstract void Write(Stream stream, string format);
}

public sealed class EmbeddingVector<T> : EmbeddingVector
{
    public EmbeddingVector(ReadOnlyMemory<T> scalars);
    public ReadOnlyMemory<T> Scalars { get; }
}
```
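To illustrate how `FromBase64` could decode the payloads providers send, here is a minimal sketch (not part of the proposed API surface) that assumes the provider encodes scalars as little-endian IEEE 754 singles; the helper name `DecodeBase64Floats` is hypothetical:

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Runtime.InteropServices;

// Illustrative sketch only: decodes a UTF-8 base64 payload into floats,
// assuming a little-endian encoding on the wire.
static ReadOnlyMemory<float> DecodeBase64Floats(ReadOnlySpan<byte> utf8Base64)
{
    byte[] bytes = new byte[Base64.GetMaxDecodedFromUtf8Length(utf8Base64.Length)];
    OperationStatus status = Base64.DecodeFromUtf8(utf8Base64, bytes, out _, out int written);
    if (status != OperationStatus.Done)
        throw new FormatException("Invalid base64 payload.");

    float[] scalars = new float[written / sizeof(float)];
    // On a little-endian host this is a straight reinterpretation;
    // a real implementation would need to byte-swap on big-endian hosts.
    MemoryMarshal.Cast<byte, float>(bytes.AsSpan(0, written)).CopyTo(scalars);
    return scalars;
}
```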
### API Usage

```csharp
EmbeddingVector vector = EmbeddingVector.FromJson("[-0.0026168018,-0.024089903,0.03355637]"u8.ToArray());
EmbeddingVector<float> floats = vector.To<float>();

foreach (float scalar in floats.Scalars.Span)
{
    Console.WriteLine(scalar);
}
```
Here's how we can use it with OpenAI, which returns a base64-encoded string:

```csharp
EmbeddingClient client = new("text-embedding-ada-002", Environment.GetEnvironmentVariable("OPENAI-API-KEY"));
ClientResult<Embedding> response = client.GenerateEmbedding("Top hotel in town");
```
And here's how we can use it with Azure AI Search, which returns a JSON array:

```csharp
// Get embedding from OpenAI
EmbeddingClient client = new("text-embedding-ada-002", Environment.GetEnvironmentVariable("OPENAI-API-KEY"));
Embedding embedding = client.GenerateEmbedding("Top hotel in town");
EmbeddingVector vector = embedding.Vector;

// Call Azure AI Search, passing in the vector
Uri endpoint = new(Environment.GetEnvironmentVariable("SEARCH_ENDPOINT"));
AzureKeyCredential credential = new(Environment.GetEnvironmentVariable("SEARCH_API_KEY"));
SearchClient searchClient = new(endpoint, "mysearchindex", credential);

Response<SearchResults<Hotel>> response = searchClient.Search<Hotel>(
    new SearchOptions
    {
        VectorSearch = new()
        {
            Queries = { new VectorizedQuery(vector) { KNearestNeighborsCount = 3, Fields = { "DescriptionVector" } } }
        }
    });
```
For end-to-end working examples, please see: EmbeddingType/Program.cs
### Alternative Designs

_No response_
### Risks

_No response_
### Discussion Points
- Should `FromBase64` take a parameter specifying endianness? (OpenAI reportedly sends floats as little-endian.)
- Which data types, in addition to `float` and `Half`, should we support? `Int16`? `Byte`, `SByte`?
- Do we want a specialized vector of bits?
- Do we want to add support for slicing?
- Do we like the name `EmbeddingVector`? Do we like the namespace?
- Which package does this ship in?
- Do we want to add a `ModelName` property? `Length`? `Precision`? Any others?
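On the endianness question, decoding could be made independent of the host's byte order by reading scalar-by-scalar with `BinaryPrimitives`. A minimal sketch, assuming a little-endian wire format (the helper name is hypothetical):

```csharp
using System;
using System.Buffers.Binary;

// Sketch: decode little-endian floats regardless of the host's endianness.
static float[] ReadLittleEndianFloats(ReadOnlySpan<byte> bytes)
{
    float[] result = new float[bytes.Length / sizeof(float)];
    for (int i = 0; i < result.Length; i++)
        result[i] = BinaryPrimitives.ReadSingleLittleEndian(bytes.Slice(i * sizeof(float)));
    return result;
}
```

An endianness parameter on `FromBase64` would then just select between `ReadSingleLittleEndian` and `ReadSingleBigEndian`.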