cosmosdbnosql: Add Semantic Cache Integration #7033

Open: wants to merge 6 commits into base: main

aditishree1

This PR makes the following changes:

  • Adds an Azure Cosmos DB NoSQL semantic cache integration.
  • Changes the user agent suffix for AzureCosmosDBNoSQLVectorStore to "LangChain-CDBNoSQL-VectorStore-JavaScript".

vercel bot commented Oct 21, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview Oct 21, 2024 7:32am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchainjs-api-refs ⬜️ Ignored (Inspect) Oct 21, 2024 7:32am

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. auto:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Oct 21, 2024
@aditishree1 changed the title from "cosmosdbnosql: Add semantic cache Integration" to "cosmosdbnosql: Add Semantic Cache Integration" Oct 21, 2024
@@ -78,7 +78,7 @@ export interface AzureCosmosDBNoSQLConfig
   readonly metadataKey?: string;
 }

-const USER_AGENT_PREFIX = "langchainjs-azure-cosmosdb-nosql";
+const USER_AGENT_SUFFIX = "LangChain-CDBNoSQL-VectorStore-JavaScript";
Contributor

Could you explain why you made this change?
The existing name follows a common pattern we use for all our JS integrations, which makes it easier to filter on: <framework>-<integration_name>
If we need a distinction, it would be best to use <framework>-<integration>-<type>, e.g. langchainjs-azure-cosmosdb-nosql-vectorstore
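
To make the suggested convention concrete, here is a minimal sketch of how such names could be built and filtered. The `buildUserAgent` and `isCosmosNoSql` helpers are hypothetical and not part of this PR or of LangChain.js; only the two resulting strings come from the review comments.

```typescript
// Hypothetical helper (not in the PR): builds a user agent string following
// the <framework>-<integration>-<type> convention described above.
type IntegrationType = "vectorstore" | "semanticcache";

function buildUserAgent(
  framework: string,
  integration: string,
  type?: IntegrationType
): string {
  const parts: string[] = [framework, integration];
  // The type segment is optional, so the base name stays a shared prefix.
  if (type !== undefined) {
    parts.push(type);
  }
  return parts.join("-");
}

const vectorStoreAgent = buildUserAgent(
  "langchainjs",
  "azure-cosmosdb-nosql",
  "vectorstore"
);
const cacheAgent = buildUserAgent(
  "langchainjs",
  "azure-cosmosdb-nosql",
  "semanticcache"
);

// Because both names share the <framework>-<integration> prefix, a single
// prefix filter matches every component of the integration.
const isCosmosNoSql = (userAgent: string): boolean =>
  userAgent.startsWith("langchainjs-azure-cosmosdb-nosql");

console.log(vectorStoreAgent, cacheAgent, isCosmosNoSql(cacheAgent));
```

The point of the shared prefix is that telemetry filters written for the base integration name keep working when a type segment is appended.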

@aayush3011 aayush3011 Oct 21, 2024

@sinedied we are following a user agent pattern across all AI integrations, and we came up with "LangChain-CDBNoSQL-VectorStore-JavaScript"

AzureCosmosDBNoSQLVectorStore,
} from "./azure_cosmosdb_nosql.js";

const USER_AGENT_SUFFIX = "LangChain-CDBNoSQL-SemanticCache-JavaScript";
Contributor

If possible, it would be preferable to use langchainjs-azure-cosmosdb-nosql-semanticcache (see previous comment)

@aayush3011 aayush3011 Oct 21, 2024

@sinedied we are following a user agent pattern across all AI integrations, and we came up with "LangChain-CDBNoSQL-VectorStore-JavaScript"

@@ -68,7 +68,7 @@ export interface AzureCosmosDBNoSQLInitOptions {
  */
 export interface AzureCosmosDBNoSQLConfig
   extends AzureCosmosDBNoSQLInitOptions {
-  readonly client?: CosmosClient;
+  client?: CosmosClient;
Contributor

The client should not be overridable by the user other than through the constructor; please keep it read-only. When the semantic cache creates its own client, it can pass that client to the vector store through the constructor instead of assigning it afterwards.
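
One way this could look is sketched below. All names here (`FakeCosmosClient`, `StoreConfig`, `SketchVectorStore`, `SketchSemanticCache`) are hypothetical stand-ins so the example runs without `@azure/cosmos`; this is an illustration of the pattern, not the PR's actual implementation.

```typescript
// Stand-in for CosmosClient (assumption: keeps the sketch self-contained).
class FakeCosmosClient {
  constructor(readonly userAgent: string) {}
}

// The client stays read-only on the config, as requested in the review.
interface StoreConfig {
  readonly client?: FakeCosmosClient;
  readonly containerName?: string;
}

class SketchVectorStore {
  readonly client: FakeCosmosClient;

  constructor(config: StoreConfig) {
    // The only place the client is ever set is the constructor.
    this.client =
      config.client ??
      new FakeCosmosClient("langchainjs-azure-cosmosdb-nosql-vectorstore");
  }
}

class SketchSemanticCache {
  private readonly config: StoreConfig;

  constructor(config: StoreConfig) {
    // Build a new config object instead of mutating the caller's one, so
    // `client` never needs to be writable.
    const client =
      config.client ??
      new FakeCosmosClient("langchainjs-azure-cosmosdb-nosql-semanticcache");
    this.config = { ...config, client };
  }

  createStore(): SketchVectorStore {
    return new SketchVectorStore(this.config);
  }
}

const sketchCache = new SketchSemanticCache({ containerName: "example" });
console.log(sketchCache.createStore().client.userAgent);
```

Copying the config with a spread keeps the caller's object untouched, which is what makes the `readonly` modifier workable here.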

private getLlmCache(llmKey: string) {
const key = getCacheKey(llmKey);
if (!this.cacheDict[key]) {
this.cacheDict[key] = new AzureCosmosDBNoSQLVectorStore(
Contributor

In its current state, this implementation uses the same default container name as the VectorStore, which can be problematic:

For example, if a user relies on the default values and has both a vector store for RAG and a semantic cache, the results will get mixed up.

I suggest 2 changes:

  • Add a metadata field to indicate that the document is used for semantic caching (so if users want to use a single table, they can use filters to distinguish cache entries from documents)
  • Use a different container name for the semantic cache by default, for example vectorSearchContainer, to avoid conflicts
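
Both suggestions can be illustrated with a small in-memory sketch. Everything in it is hypothetical: the `InMemoryContainers` class, the `kind` metadata field, and the container names `documentsContainer` and `semanticCacheContainer` are placeholders chosen for the example, not names used by the PR or by Azure Cosmos DB.

```typescript
// Hypothetical metadata tag distinguishing RAG documents from cache entries.
type Kind = "document" | "semantic-cache";

interface StoredDoc {
  text: string;
  metadata: { kind: Kind };
}

// Tiny in-memory stand-in for a database with named containers.
class InMemoryContainers {
  private readonly containers = new Map<string, StoredDoc[]>();

  add(container: string, text: string, kind: Kind): void {
    const docs = this.containers.get(container) ?? [];
    docs.push({ text, metadata: { kind } });
    this.containers.set(container, docs);
  }

  // Returns texts in a container, optionally filtered by the metadata tag.
  find(container: string, kind?: Kind): string[] {
    const docs = this.containers.get(container) ?? [];
    return docs
      .filter((doc) => kind === undefined || doc.metadata.kind === kind)
      .map((doc) => doc.text);
  }
}

const db = new InMemoryContainers();

// Suggestion 1: one shared container, separated by a metadata filter.
db.add("shared", "RAG chunk", "document");
db.add("shared", "cached LLM answer", "semantic-cache");
const cacheOnly = db.find("shared", "semantic-cache");

// Suggestion 2: distinct default container names (placeholder values).
const DOCS_CONTAINER = "documentsContainer";
const CACHE_CONTAINER = "semanticCacheContainer";
db.add(DOCS_CONTAINER, "RAG chunk", "document");
db.add(CACHE_CONTAINER, "cached LLM answer", "semantic-cache");

console.log(cacheOnly, db.find(CACHE_CONTAINER));
```

Without either measure, an unfiltered query on the shared container returns both kinds of entries, which is the mix-up described above.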

I would prefer using a different name for the semantic cache container. We are doing the same in the LangChain Python semantic cache as well. Let's keep the vector search and semantic cache containers separate.

Labels
auto:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features size:L This PR changes 100-499 lines, ignoring generated files.
3 participants