cosmosdbnosql: Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook #24424

Open · wants to merge 60 commits into base: master

Changes from 12 commits (60 commits total)

Commits
80936bf
Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyt…
gsa9989 Jul 19, 2024
01b7bd1
Merged latest master from upstream and resolved all conflicts
gsa9989 Jul 19, 2024
e891cd9
Removed openai_api_key parameter
gsa9989 Jul 19, 2024
676d762
Removed unnecessary space changes
gsa9989 Jul 19, 2024
7f00b8a
Merge branch 'users/garagundi/cosmosdbnosql' into users/akataria/rebase
aayush3011 Aug 7, 2024
66289ed
Merge pull request #1 from gsa9989/users/akataria/rebase
aayush3011 Aug 7, 2024
bb55a4b
Merge branch 'langchain-ai:master' into users/garagundi/cosmosdbnosql
aayush3011 Aug 7, 2024
2184d8e
Rebase from master
gsa9989 Aug 7, 2024
d9f684b
Merge branch 'users/garagundi/cosmosdbnosql' of https://github.com/gs…
gsa9989 Aug 7, 2024
4701fb7
test updates
gsa9989 Aug 8, 2024
d23bcd6
format
ccurme Aug 27, 2024
043f2bf
Merge branch 'master' into users/garagundi/cosmosdbnosql
ccurme Aug 27, 2024
4f58256
lint
ccurme Aug 27, 2024
cd84f02
Merge branch 'langchain-ai:master' into users/garagundi/cosmosdbnosql
aayush3011 Aug 27, 2024
aa1e846
linting
aayush3011 Aug 27, 2024
636415b
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Aug 27, 2024
cd744f0
linting
aayush3011 Aug 27, 2024
059be6a
Merge branch 'users/garagundi/cosmosdbnosql' of github.com:gsa9989/la…
aayush3011 Aug 27, 2024
c625e7d
linting
aayush3011 Aug 27, 2024
31b7c1b
linting
aayush3011 Aug 27, 2024
98d9d7e
linting
aayush3011 Aug 27, 2024
2c4d2d3
linting
aayush3011 Aug 27, 2024
04ac7ae
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
2c4f281
Linting
aayush3011 Sep 3, 2024
0caa7cd
Linting
aayush3011 Sep 3, 2024
9766154
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
8891976
Linting
aayush3011 Sep 3, 2024
eafbcb8
Merge branch 'users/garagundi/cosmosdbnosql' of github.com:gsa9989/la…
aayush3011 Sep 3, 2024
dbe1504
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
9ffa53f
Linting
aayush3011 Sep 3, 2024
eb1f8bf
Linting
aayush3011 Sep 3, 2024
3599624
Linting
aayush3011 Sep 3, 2024
f18a5d3
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
f2f1ff5
Linting
aayush3011 Sep 3, 2024
1e818da
Linting
aayush3011 Sep 3, 2024
0441131
Linting
aayush3011 Sep 3, 2024
810dd00
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
c3d0917
Linting
aayush3011 Sep 3, 2024
a748a58
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 3, 2024
81887e5
Merge branch 'langchain-ai:master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 25, 2024
860b0c0
Adding notebook sample
aayush3011 Sep 25, 2024
3378365
linting
aayush3011 Sep 25, 2024
fe08580
linting
aayush3011 Sep 25, 2024
cb02b1a
linting
aayush3011 Sep 25, 2024
bc8ee2d
linting
aayush3011 Sep 25, 2024
a5eebd9
linting
aayush3011 Sep 25, 2024
5f2c91f
linting
aayush3011 Sep 26, 2024
cce4959
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 26, 2024
0441fa7
linting
aayush3011 Sep 26, 2024
d003423
Merge branch 'users/garagundi/cosmosdbnosql' of github.com:gsa9989/la…
aayush3011 Sep 26, 2024
05fd438
linting
aayush3011 Sep 26, 2024
9e7838c
linting
aayush3011 Sep 26, 2024
f4250ac
linting
aayush3011 Sep 26, 2024
c13660c
linting
aayush3011 Sep 26, 2024
3f23135
linting
aayush3011 Sep 26, 2024
7fc0f59
Adding support for managed identity for cosmosdb nosql VS
aayush3011 Sep 26, 2024
224393c
linting
aayush3011 Sep 26, 2024
f87f18d
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Sep 26, 2024
cbd6f2b
Adding user agent for vector store
aayush3011 Oct 10, 2024
1ce7bbf
Merge branch 'master' into users/garagundi/cosmosdbnosql
aayush3011 Oct 10, 2024
2 changes: 1 addition & 1 deletion docs/docs/integrations/llm_caching.ipynb
@@ -2806,4 +2806,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
102 changes: 101 additions & 1 deletion libs/community/langchain_community/cache.py
@@ -60,6 +60,8 @@
)
from langchain_community.vectorstores.utils import DistanceStrategy

# from libs.community.langchain_community.vectorstores.azure_cosmos_db_no_sql import AzureCosmosDBNoSqlVectorSearch

Check failure on line 63 in libs/community/langchain_community/cache.py — GitHub Actions / cd libs/community / make lint — Ruff (E501): langchain_community/cache.py:63:89: Line too long (115 > 88) (reported by both the #3.8 and #3.12 jobs)

try:
from sqlalchemy.orm import declarative_base
except ImportError:
@@ -80,7 +82,10 @@
from langchain_community.utilities.astradb import (
_AstraDBCollectionEnvironment,
)
from langchain_community.vectorstores import AzureCosmosDBVectorSearch
from langchain_community.vectorstores import (
AzureCosmosDBNoSqlVectorSearch,
AzureCosmosDBVectorSearch,
)
from langchain_community.vectorstores import (
OpenSearchVectorSearch as OpenSearchVectorStore,
)
@@ -2275,6 +2280,101 @@
raise ValueError(f"Invalid enum value: {value}. Expected {enum_type}.")


class AzureCosmosDBNoSqlSemanticCache(BaseCache):
"""Cache that uses Cosmos DB NoSQL backend"""

def __init__(
self,
embedding: Embeddings,
cosmos_client: Optional[Any] = None,
database_name: str = "CosmosNoSqlCacheDB",
container_name: str = "CosmosNoSqlCacheContainer",
*,
vector_embedding_policy: Optional[Dict[str, Any]] = None,
indexing_policy: Optional[Dict[str, Any]] = None,
cosmos_container_properties: Dict[str, Any],
cosmos_database_properties: Dict[str, Any],
):
self.cosmos_client = cosmos_client
self.database_name = database_name
self.container_name = container_name
self.embedding = embedding
self.vector_embedding_policy = vector_embedding_policy
self.indexing_policy = indexing_policy
self.cosmos_container_properties = cosmos_container_properties
self.cosmos_database_properties = cosmos_database_properties
self._cache_: Optional[AzureCosmosDBNoSqlVectorSearch] = None

def _create_llm_cache(self, llm_string: str) -> AzureCosmosDBNoSqlVectorSearch:
# create new vectorstore client to create the cache
if self.cosmos_client:
self._cache_ = AzureCosmosDBNoSqlVectorSearch(
cosmos_client=self.cosmos_client,
embedding=self.embedding,
vector_embedding_policy=self.vector_embedding_policy,
indexing_policy=self.indexing_policy,
cosmos_container_properties=self.cosmos_container_properties,
cosmos_database_properties=self.cosmos_database_properties,
database_name=self.database_name,
container_name=self.container_name,
)

return self._cache_

def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
"""Look up based on prompt."""
if not self._cache_:
self._cache_ = self._create_llm_cache(llm_string)
llm_cache = self._cache_
generations: List = []
# Retrieve the nearest cached entry from the vector store
results = llm_cache.similarity_search(
query=prompt,
k=1,
)
if results:
for document in results:
try:
generations.extend(loads(document.metadata["return_val"]))
except Exception:
logger.warning(
"Retrieving a cache value that could not be deserialized "
"properly. This is likely due to the cache being in an "
"older format. Please recreate your cache to avoid this "
"error."
)
# In a previous life we stored the raw text directly
# in the table, so assume it's in that format.
generations.extend(
_load_generations_from_json(document.metadata["return_val"])
)
return generations if generations else None

def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
"""Update cache based on prompt and llm_string."""
for gen in return_val:
if not isinstance(gen, Generation):
raise ValueError(
"AzureCosmosDBNoSqlSemanticCache only supports caching of "
f"normal LLM generations, got {type(gen)}"
)
if not self._cache_:
self._cache_ = self._create_llm_cache(llm_string)
llm_cache = self._cache_
metadata = {
"llm_string": llm_string,
"prompt": prompt,
"return_val": dumps([g for g in return_val]),
}
llm_cache.add_texts(texts=[prompt], metadatas=[metadata])

def clear(self, **kwargs: Any) -> None:
"""Clear semantic cache for a given llm_string."""
database = self.cosmos_client.get_database_client(self.database_name)
container = database.get_container_client(self.container_name)

Check failure on line 2374 in libs/community/langchain_community/cache.py — GitHub Actions / cd libs/community / make lint — Ruff (F841): langchain_community/cache.py:2374:9: Local variable `container` is assigned to but never used (reported by both the #3.8 and #3.12 jobs)
database.delete_container(self.container_name)
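
The round trip that `update()` and `lookup()` implement above — serialize the generations into document metadata, then recover them from the nearest neighbour — can be sketched with an in-memory stand-in for the vector store. `FakeVectorStore` below is hypothetical and exists only for illustration; the real class delegates to `AzureCosmosDBNoSqlVectorSearch`:

```python
import json


class FakeVectorStore:
    """Hypothetical stand-in for AzureCosmosDBNoSqlVectorSearch:
    stores (text, metadata) pairs instead of embeddings."""

    def __init__(self):
        self.docs = []

    def add_texts(self, texts, metadatas):
        for text, metadata in zip(texts, metadatas):
            self.docs.append((text, metadata))

    def similarity_search(self, query, k=1):
        # A real store ranks by embedding distance; here every doc "matches".
        return [metadata for _, metadata in self.docs[:k]]


store = FakeVectorStore()

# update(): the cached generations travel inside the document metadata.
store.add_texts(
    texts=["foo"],
    metadatas=[{
        "llm_string": "fake-llm",
        "prompt": "foo",
        "return_val": json.dumps(["fizz"]),
    }],
)

# lookup(): the nearest neighbour's metadata carries the cached value back out.
results = store.similarity_search("bar", k=1)
cached = json.loads(results[0]["return_val"])
print(cached)  # ['fizz']
```

A semantically similar prompt ("bar") therefore returns the value cached for "foo", which is exactly what the integration tests below rely on with `FakeEmbeddings`.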


class OpenSearchSemanticCache(BaseCache):
"""Cache that uses OpenSearch vector store backend"""

@@ -0,0 +1,209 @@
"""Test Azure CosmosDB NoSql cache functionality."""

import os
import uuid

import pytest
from azure.cosmos import CosmosClient, PartitionKey
from langchain.globals import get_llm_cache, set_llm_cache
from langchain_core.outputs import Generation
from libs.community.tests.integration_tests.cache.fake_embeddings import (
FakeEmbeddings,
)
from libs.community.tests.unit_tests.llms.fake_llm import FakeLLM

from langchain_community.cache import AzureCosmosDBNoSqlSemanticCache
from langchain_community.vectorstores import AzureCosmosDBNoSqlVectorSearch

URI = os.environ["COSMOS_DB_URI"]
KEY = os.environ["COSMOS_DB_KEY"]
test_client = CosmosClient(URI, credential=KEY)


# cosine, euclidean, innerproduct
def indexing_policy(index_type: str):
return {
"indexingMode": "consistent",
"includedPaths": [{"path": "/*"}],
"excludedPaths": [{"path": '/"_etag"/?'}],
"vectorIndexes": [{"path": "/embedding", "type": index_type}],
}


def vector_embedding_policy(distance_function: str):
return {
"vectorEmbeddings": [
{
"path": "/embedding",
"dataType": "float32",
"distanceFunction": distance_function,
"dimensions": 1536,
}
]
}


partition_key = PartitionKey(path="/id")
cosmos_container_properties_test = {"partition_key": partition_key}
cosmos_database_properties_test = {}


# @pytest.fixture(scope="session")
def test_azure_cosmos_db_nosql_semantic_cache_cosine_quantizedflat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("cosine"),
indexing_policy=indexing_policy("quantizedFlat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)


def test_azure_cosmos_db_nosql_semantic_cache_cosine_flat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("cosine"),
indexing_policy=indexing_policy("flat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)


def test_azure_cosmos_db_nosql_semantic_cache_dotproduct_quantizedflat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("dotProduct"),
indexing_policy=indexing_policy("quantizedFlat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update(
"foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
)

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)


def test_azure_cosmos_db_nosql_semantic_cache_dotproduct_flat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("dotProduct"),
indexing_policy=indexing_policy("flat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update(
"foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
)

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)


def test_azure_cosmos_db_nosql_semantic_cache_euclidean_quantizedflat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("euclidean"),
indexing_policy=indexing_policy("quantizedFlat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)


def test_azure_cosmos_db_nosql_semantic_cache_euclidean_flat() -> None:
set_llm_cache(
AzureCosmosDBNoSqlSemanticCache(
cosmos_client=test_client,
embedding=FakeEmbeddings(),
vector_embedding_policy=vector_embedding_policy("euclidean"),
indexing_policy=indexing_policy("flat"),
cosmos_container_properties=cosmos_container_properties_test,
cosmos_database_properties=cosmos_database_properties_test,
)
)

llm = FakeLLM()
params = llm.dict()
params["stop"] = None
llm_string = str(sorted([(k, v) for k, v in params.items()]))
get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])

# foo and bar will have the same embedding produced by FakeEmbeddings
cache_output = get_llm_cache().lookup("bar", llm_string)
assert cache_output == [Generation(text="fizz")]

# clear the cache
get_llm_cache().clear(llm_string=llm_string)
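
The six integration tests above differ only in their (distance function, index type) pair. Under pytest they could be driven from a single parametrized matrix; a minimal sketch of that matrix (the `@pytest.mark.parametrize` wiring itself is left out, and is not part of this PR):

```python
from itertools import product

# Combinations exercised above: 3 distance functions x 2 index types.
distance_functions = ["cosine", "dotProduct", "euclidean"]
index_types = ["quantizedFlat", "flat"]

combos = list(product(distance_functions, index_types))
print(len(combos))  # 6 parametrized cases instead of six hand-written tests
```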