vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference the metadata.
Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.
- Boolean
- String
- Number
The technical limit of a metadata field associated with a vector is 1GB. In practice you should keep metadata fields as small as possible to maximize performance.
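As a rough illustration of keeping metadata small and JSON-friendly, the sketch below checks a metadata dict before it is stored. The helper name `metadata_json_size` is hypothetical and not part of vecs. It assumes values are scalars or lists of scalars, and it measures the JSON text size, which only approximates what the database stores as binary JSON.

```python
import json

# Assumption: values are JSON scalars, or lists of scalars for array fields.
SCALAR_TYPES = (bool, int, float, str)


def metadata_json_size(metadata: dict) -> int:
    """Validate a metadata dict and return the byte size of its JSON text."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
        # Treat a list as an array field and check each element.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if not isinstance(item, SCALAR_TYPES):
                raise TypeError(
                    f"metadata field {key!r} holds unsupported value {item!r}"
                )
    # allow_nan=False rejects NaN and Infinity, which are not valid JSON numbers.
    return len(json.dumps(metadata, allow_nan=False).encode("utf-8"))
```

A caller could compare the returned size against a budget of its own choosing and reject or trim oversized records before inserting them.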
The metadata query language is based loosely on MongoDB's selectors.
`vecs` currently supports a subset of those operators.
Comparison operators compare a provided value with a value stored in a metadata field of the vector store.
Operator | Description |
---|---|
$eq | Matches values that are equal to a specified value |
$ne | Matches values that are not equal to a specified value |
$gt | Matches values that are greater than a specified value |
$gte | Matches values that are greater than or equal to a specified value |
$lt | Matches values that are less than a specified value |
$lte | Matches values that are less than or equal to a specified value |
$in | Matches values that are contained in a list of specified scalar values |
$contains | Matches values where a scalar is contained within an array metadata field |
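
To make the semantics in the table concrete, here is a minimal in-memory sketch that evaluates a single comparison clause against one metadata dict. It is not how vecs evaluates filters (those run in the database). The function name `matches_comparison` is hypothetical, and treating a missing field as "no match" is an assumption.

```python
import operator

# Each operator receives (stored_value, provided_value).
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    # $in: the stored scalar appears in the provided list.
    "$in": lambda stored, provided: stored in provided,
    # $contains: the provided scalar appears in the stored array field.
    "$contains": lambda stored, provided: isinstance(stored, list)
    and provided in stored,
}


def matches_comparison(metadata: dict, clause: dict) -> bool:
    """Evaluate one clause of the form {field: {operator: value}}."""
    ((field, condition),) = clause.items()
    ((op, provided),) = condition.items()
    if field not in metadata:
        return False  # assumption: a missing field never matches
    return COMPARATORS[op](metadata[field], provided)


record = {"year": 2020, "gross": 4200.0, "tags": ["important", "q3"]}
print(matches_comparison(record, {"year": {"$eq": 2020}}))
print(matches_comparison(record, {"tags": {"$contains": "important"}}))
```

Note the direction of the two membership operators: `$in` takes a list in the filter and a scalar in the metadata, while `$contains` takes a scalar in the filter and an array in the metadata.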
Logical operators compose other operators, and can be nested.
Operator | Description |
---|---|
$and | Joins query clauses with a logical AND; returns all documents that match the conditions of every clause. |
$or | Joins query clauses with a logical OR; returns all documents that match the conditions of at least one clause. |
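
Because logical operators nest, a filter is effectively a small tree. The sketch below walks that tree and checks that every operator belongs to the subset listed in the tables above. `validate_filter` is a hypothetical client-side helper, not a vecs function, and the exact shape rules it enforces (one key per clause, a list under `$and` and `$or`) are assumptions drawn from the examples in this document.

```python
COMPARISON_OPS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$contains"}
LOGICAL_OPS = {"$and", "$or"}


def validate_filter(node) -> None:
    """Raise ValueError if the filter uses an unsupported operator or shape."""
    if not isinstance(node, dict) or len(node) != 1:
        raise ValueError("each clause must be a dict with exactly one key")
    ((key, value),) = node.items()
    if not isinstance(key, str):
        raise ValueError("clause keys must be strings")

    if key in LOGICAL_OPS:
        if not isinstance(value, list):
            raise ValueError(f"{key} expects a list of clauses")
        for child in value:
            validate_filter(child)  # recurse into nested clauses
        return

    if key.startswith("$"):
        raise ValueError(f"unsupported logical operator: {key}")

    # Otherwise the key is a metadata field name: {field: {operator: value}}.
    if not isinstance(value, dict) or len(value) != 1:
        raise ValueError(f"field {key!r} needs exactly one comparison operator")
    (op,) = value.keys()
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported comparison operator: {op}")


# A nested filter: year is 2020, OR (gross >= 5000.0 AND tags contains "important").
validate_filter(
    {
        "$or": [
            {"year": {"$eq": 2020}},
            {
                "$and": [
                    {"gross": {"$gte": 5000.0}},
                    {"tags": {"$contains": "important"}},
                ]
            },
        ]
    }
)
```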
For best performance, use scalar key-value pairs for metadata and prefer `$eq`, `$and` and `$or` filters where possible.
Those variants are most consistently able to make use of indexes.
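One way to stay on the recommended operators is to build filters through a small helper that only emits `$eq` clauses joined by `$and`. The helper name `eq_filter` is hypothetical and shown only as a sketch.

```python
def eq_filter(**fields) -> dict:
    """Build a filter that requires every given field to equal its value."""
    if not fields:
        raise ValueError("at least one field is required")
    clauses = [{name: {"$eq": value}} for name, value in fields.items()]
    # A single condition does not need an $and wrapper.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


print(eq_filter(year=2020))
# {'year': {'$eq': 2020}}
print(eq_filter(year=2020, is_priority_customer=True))
# {'$and': [{'year': {'$eq': 2020}}, {'is_priority_customer': {'$eq': True}}]}
```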
`year` equals 2020
{"year": {"$eq": 2020}}
`year` equals 2020 or `gross` greater than or equal to 5000.0
{
"$or": [
{"year": {"$eq": 2020}},
{"gross": {"$gte": 5000.0}}
]
}
`last_name` is less than "Brown" and `is_priority_customer` is true
{
"$and": [
{"last_name": {"$lt": "Brown"}},
{"is_priority_customer": {"$eq": true}}
]
}
`priority` is contained in ["enterprise", "pro"]
{
"priority": {"$in": ["enterprise", "pro"]}
}
`tags`, an array, contains the string "important"
{
"tags": {"$contains": "important"}
}
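The examples above are written as JSON. In Python code the same filters are ordinary dicts, with Python literals such as `True` in place of JSON's `true`. The sketch below, using an illustrative filter and only the standard library, shows that a Python dict serializes to the JSON form used in this document. In vecs such a dict would typically be passed to a collection query through its `filters` argument; treat that parameter name as an assumption and check it against your installed version.

```python
import json

# Python form of a filter: note True rather than JSON's true.
priority_2020 = {
    "$and": [
        {"year": {"$eq": 2020}},
        {"is_priority_customer": {"$eq": True}},
    ]
}

as_json = json.dumps(priority_2020)
print(as_json)
# {"$and": [{"year": {"$eq": 2020}}, {"is_priority_customer": {"$eq": true}}]}

# Round-tripping through JSON gives back an equal dict.
assert json.loads(as_json) == priority_2020
```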