Hello,
We work in a full GCP environment with Vertex Search (Matching Engine as the vector DB).
One of the drawbacks of this solution is the need to keep a side index of what is currently in the DB, and the complexity of managing updates and deletions without creating duplicate documents.
I feel that implementing a RecordManager for BigQuery would solve this problem and make it easy to track what's in the vector DB:
https://python.langchain.com/docs/how_to/indexing/
https://api.python.langchain.com/en/latest/indexes/langchain.indexes.base.RecordManager.html
The LangChain documentation says:
Only works with LangChain vectorstore's that support:
document addition by id (add_documents method with ids argument)
delete by id (delete method with ids argument)
This is the case for Vertex Search (in streaming mode).
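To illustrate why those two capabilities matter, here is a minimal sketch of a duplicate-free sync. It is not the LangChain indexing code: `FakeVectorStore`, `content_id` and `sync` are hypothetical names, documents are plain strings rather than `Document` objects, and the set of tracked ids stands in for the side index mentioned above.

```python
import hashlib


class FakeVectorStore:
    """Hypothetical stand-in for a store that supports add-by-id and delete-by-id."""

    def __init__(self):
        self.docs = {}

    def add_documents(self, documents, ids):
        for doc_id, text in zip(ids, documents):
            # Writing under an existing id overwrites it, so no duplicate is created.
            self.docs[doc_id] = text

    def delete(self, ids):
        for doc_id in ids:
            self.docs.pop(doc_id, None)


def content_id(text):
    """Derive a deterministic id from the document content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync(store, known_ids, texts):
    """Make the store match `texts` and return the new set of tracked ids."""
    wanted = {content_id(t): t for t in texts}
    new_ids = [i for i in wanted if i not in known_ids]
    stale_ids = [i for i in known_ids if i not in wanted]
    if new_ids:
        store.add_documents([wanted[i] for i in new_ids], ids=new_ids)
    if stale_ids:
        store.delete(ids=stale_ids)
    return set(wanted)


store = FakeVectorStore()
tracked = sync(store, set(), ["doc a", "doc b"])
# Second run: "doc b" is unchanged (skipped), "doc a" is gone (deleted), "doc c" is new (added).
tracked = sync(store, tracked, ["doc b", "doc c"])
```

Without add-by-id the second run could not skip the unchanged document, and without delete-by-id it could not remove the stale one.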
There is no RecordManager for GCP yet. A PR for Firestore is ongoing (googleapis/langchain-google-firestore-python#90), but I feel BigQuery might be more suitable for this use case.
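As a starting point for discussion, here is a rough sketch of the kind of table such a record manager could sit on. Everything in it is an assumption: `sqlite3` is only a local stand-in for BigQuery, the `records` table layout and the `SqlRecordTable` class are hypothetical, and the method names only loosely mirror the RecordManager interface (the `now` argument is a simplification to keep the example deterministic).

```python
import sqlite3
import time


class SqlRecordTable:
    """Hypothetical record table; sqlite3 stands in for BigQuery."""

    def __init__(self, conn, namespace):
        self.conn = conn
        self.namespace = namespace

    def create_schema(self):
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS records ("
            "namespace TEXT NOT NULL, key TEXT NOT NULL, "
            "group_id TEXT, updated_at REAL NOT NULL, "
            "PRIMARY KEY (namespace, key))"
        )

    def update(self, keys, group_ids=None, now=None):
        """Insert the keys or refresh their timestamp (upsert)."""
        now = time.time() if now is None else now
        group_ids = group_ids or [None] * len(keys)
        self.conn.executemany(
            "INSERT OR REPLACE INTO records "
            "(namespace, key, group_id, updated_at) VALUES (?, ?, ?, ?)",
            [(self.namespace, k, g, now) for k, g in zip(keys, group_ids)],
        )

    def exists(self, keys):
        """Return one boolean per key, in the same order as `keys`."""
        if not keys:
            return []
        marks = ",".join("?" for _ in keys)
        rows = self.conn.execute(
            f"SELECT key FROM records WHERE namespace = ? AND key IN ({marks})",
            [self.namespace, *keys],
        )
        found = {row[0] for row in rows}
        return [k in found for k in keys]

    def list_keys(self, before=None, group_ids=None):
        """List keys, optionally only those last updated before a timestamp."""
        sql = "SELECT key FROM records WHERE namespace = ?"
        params = [self.namespace]
        if before is not None:
            sql += " AND updated_at < ?"
            params.append(before)
        if group_ids:
            marks = ",".join("?" for _ in group_ids)
            sql += f" AND group_id IN ({marks})"
            params.extend(group_ids)
        return [row[0] for row in self.conn.execute(sql + " ORDER BY key", params)]

    def delete_keys(self, keys):
        self.conn.executemany(
            "DELETE FROM records WHERE namespace = ? AND key = ?",
            [(self.namespace, k) for k in keys],
        )


conn = sqlite3.connect(":memory:")
records = SqlRecordTable(conn, namespace="vertex/my-index")
records.create_schema()
records.update(["k1", "k2"], group_ids=["src-a", "src-a"], now=1.0)
records.update(["k2"], group_ids=["src-a"], now=2.0)  # only k2 seen on the second run
stale = records.list_keys(before=2.0, group_ids=["src-a"])
records.delete_keys(stale)
```

The stale keys returned by `list_keys` are the ones that would also need to be deleted from the vector DB. In BigQuery the upsert would presumably be written as a `MERGE` statement rather than `INSERT OR REPLACE`.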
Happy to discuss the implementation and suitability :)