RDoc-3343 Vector search: Querying multiple vector fields #2043

Open · 1 commit to merge into base: master
@@ -21,6 +21,7 @@
* [Dynamic vector search - exact search](../../ai-integration/vector-search/vector-search-using-dynamic-query#dynamic-vector-search---exact-search)
* [Quantization options](../../ai-integration/vector-search/vector-search-using-dynamic-query#quantization-options)
* [Querying vector fields and regular data in the same query](../../ai-integration/vector-search/vector-search-using-dynamic-query#querying-vector-fields-and-regular-data-in-the-same-query)
* [Combining multiple vector searches in the same query](../../ai-integration/vector-search/vector-search-using-dynamic-query#combining-multiple-vector-searches-in-the-same-query)
* [Syntax](../../ai-integration/vector-search/vector-search-using-dynamic-query#syntax)

{NOTE/}
@@ -655,6 +656,45 @@ and (vector.search(embedding.text(Name), $searchTerm, 0.75, 25))
{INFO/}
{PANEL/}

{PANEL: Combining multiple vector searches in the same query}

* You can combine multiple vector search statements in the same query using logical operators.
This is useful when you want to retrieve documents that match more than one vector-based criterion.

* This can be done using [DocumentQuery](../../client-api/session/querying/how-to-query#session.advanced.documentquery),
[RawQuery](../../client-api/session/querying/how-to-query#session.advanced.rawquery) or raw [RQL](../../client-api/session/querying/what-is-rql).

* In the example below, the results will include companies that match one of two vector search conditions:
    * Companies from European countries with a _Name_ similar to "snack"
    * Or companies with a _Name_ similar to "dairy"

{CODE-TABS}
{CODE-TAB:csharp:DocumentQuery vs_27@AiIntegration\VectorSearch\VectorSearchUsingDynamicQuery.cs /}
{CODE-TAB:csharp:DocumentQuery_async vs_27_async@AiIntegration\VectorSearch\VectorSearchUsingDynamicQuery.cs /}
{CODE-TAB:csharp:RawQuery vs_28@AiIntegration\VectorSearch\VectorSearchUsingDynamicQuery.cs /}
{CODE-TAB:csharp:RawQuery_async vs_28_async@AiIntegration\VectorSearch\VectorSearchUsingDynamicQuery.cs /}
{CODE-TAB-BLOCK:sql:RQL}
from "Companies"
where
Review comment (Member):

An important aspect here is that vector.search() returns only a limited number of results (the default, IIRC, is 16).
Each clause is evaluated independently, so you may have this:

  vector.search(embedding.text(Name), $searchTerm1, 0.78)
  and
  vector.search(embedding.text(Address.Country), $searchTerm2, 0.82)

where the first clause returns results [1, 2, 3, ... 16] and the second returns [18, ... 34]. The two sets then have no document in common, so the AND yields nothing.

However, if we went deeper into the first clause, we would reach 18, which would be a match.

The problem is that we must run the HNSW search on each vector field independently, and the size of each result set matters a lot.
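
To make the effect concrete, here is a toy C# model of it. It is only a sketch under stated assumptions and does not call RavenDB: the two rankings are made-up document IDs, and `CountMatches` is a hypothetical helper that keeps the top `candidates` results of each clause and intersects them.

```csharp
using System;
using System.Linq;

public static class TopKIntersectionDemo
{
    // Hypothetical rankings (best match first); not real query output.
    // Clause 1 ranks documents 1..40, clause 2 ranks documents 18..57.
    private static readonly int[] RankedByName = Enumerable.Range(1, 40).ToArray();
    private static readonly int[] RankedByCountry = Enumerable.Range(18, 40).ToArray();

    // Keep only the top `candidates` results of each clause, then AND them.
    public static int CountMatches(int candidates) =>
        RankedByName.Take(candidates)
            .Intersect(RankedByCountry.Take(candidates))
            .Count();

    public static void Main()
    {
        foreach (int candidates in new[] { 16, 32 })
        {
            Console.WriteLine($"candidates={candidates}: {CountMatches(candidates)} match(es)");
        }
    }
}
```

With 16 candidates per clause the intersection is empty. With 32 it contains documents 18 to 32. In RQL the per-clause candidate count looks like the optional fourth argument (the hunk context above shows `vector.search(embedding.text(Name), $searchTerm, 0.75, 25)`), so raising it on both clauses is probably worth mentioning in this section.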

(
    vector.search(embedding.text(Name), $searchTerm1, 0.78)
    and
    vector.search(embedding.text(Address.Country), $searchTerm2, 0.82)
)
or
(
    vector.search(embedding.text(Name), $searchTerm3, 0.80)
)
{"searchTerm1" : "snack", "searchTerm2" : "europe", "searchTerm3" : "dairy"}
{CODE-TAB-BLOCK/}
{CODE-TABS/}

* Running the above query example on the RavenDB sample data will generate the following auto-index:
`Auto/Companies/ByVector.search(embedding.text(Address.Country))AndVector.search(embedding.text(Name))`.
This index includes two vector fields: _Address.Country_ and _Name_.

{PANEL/}

{PANEL: Syntax}

`VectorSearch`:
@@ -895,6 +895,126 @@ where vector.search(embedding.text(Name), embedding.forDoc($documentID), 0.82)")
.ToListAsync();
#endregion
}

// Examples for "multiple vector searches in the same query"
// =========================================================

using (var session = store.OpenSession())
{
    #region vs_27
    var companies = session.Advanced
        .DocumentQuery<Company>()
        // Use OpenSubclause & CloseSubclause to differentiate between clauses:
        // ====================================================================

        .OpenSubclause()
        .VectorSearch( // Search for companies that sell snacks or similar
            field => field.WithText(x => x.Name),
            searchTerm => searchTerm.ByText("snack"),
            minimumSimilarity: 0.78f
        )
        // Use 'AndAlso' for an AND operation
        .AndAlso()
        .VectorSearch( // Search for companies located in Europe
            field => field.WithText(x => x.Address.Country),
            searchTerm => searchTerm.ByText("europe"),
            minimumSimilarity: 0.82f
        )
        .CloseSubclause()
        // Use 'OrElse' for an OR operation
        .OrElse()
        .OpenSubclause()
        .VectorSearch( // Search for companies that sell dairy products or similar
            field => field.WithText(x => x.Name),
            searchTerm => searchTerm.ByText("dairy"),
            minimumSimilarity: 0.80f
        )
        .CloseSubclause()
        .WaitForNonStaleResults()
        .ToList();
    #endregion
}

using (var asyncSession = store.OpenAsyncSession())
{
    #region vs_27_async
    var companies = await asyncSession.Advanced
        .AsyncDocumentQuery<Company>()
        .OpenSubclause()
        .VectorSearch(
            field => field.WithText(x => x.Name),
            searchTerm => searchTerm.ByText("snack"),
            minimumSimilarity: 0.78f
        )
        .AndAlso()
        .VectorSearch(
            field => field.WithText(x => x.Address.Country),
            searchTerm => searchTerm.ByText("europe"),
            minimumSimilarity: 0.82f
        )
        .CloseSubclause()
        .OrElse()
        .OpenSubclause()
        .VectorSearch(
            field => field.WithText(x => x.Name),
            searchTerm => searchTerm.ByText("dairy"),
            minimumSimilarity: 0.80f
        )
        .CloseSubclause()
        .WaitForNonStaleResults()
        .ToListAsync();
    #endregion
}

using (var session = store.OpenSession())
{
    #region vs_28
    var companies = session.Advanced
        .RawQuery<Company>(@"
            from Companies
            where
            (
                vector.search(embedding.text(Name), $searchTerm1, 0.78)
                and
                vector.search(embedding.text(Address.Country), $searchTerm2, 0.82)
            )
            or
            (
                vector.search(embedding.text(Name), $searchTerm3, 0.80)
            )
        ")
        .AddParameter("searchTerm1", "snack")
        .AddParameter("searchTerm2", "europe")
        .AddParameter("searchTerm3", "dairy")
        .WaitForNonStaleResults()
        .ToList();
    #endregion
}

using (var asyncSession = store.OpenAsyncSession())
{
    #region vs_28_async
    var companies = await asyncSession.Advanced
        .AsyncRawQuery<Company>(@"
            from Companies
            where
            (
                vector.search(embedding.text(Name), $searchTerm1, 0.78)
                and
                vector.search(embedding.text(Address.Country), $searchTerm2, 0.82)
            )
            or
            (
                vector.search(embedding.text(Name), $searchTerm3, 0.80)
            )
        ")
        .AddParameter("searchTerm1", "snack")
        .AddParameter("searchTerm2", "europe")
        .AddParameter("searchTerm3", "dairy")
        .WaitForNonStaleResults()
        .ToListAsync();
    #endregion
}
}
}
