community[minor]: Add Dria retriever #4302

Merged
merged 3 commits into from
Feb 10, 2024
39 changes: 39 additions & 0 deletions docs/core_docs/docs/integrations/retrievers/dria.mdx
@@ -0,0 +1,39 @@
---
hide_table_of_contents: true
---

# Dria Retriever

The [Dria](https://dria.co/profile) retriever allows an agent to perform a text-based search across a comprehensive knowledge hub.

## Setup

To use the Dria retriever, first install the Dria JS client:

```bash npm2yarn
npm install dria
```

You need to provide two things to the retriever:

- **API Key**: available on your [profile page](https://dria.co/profile) once you create an account.
- **Contract ID**: shown at the top of the page when viewing a knowledge, and also present in its URL.
  For example, the Bitcoin whitepaper is uploaded on Dria at https://dria.co/knowledge/2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0, so its contract ID is `2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0`.
  The contract ID can be omitted during instantiation and set later via `dria.contractId = "your-contract"`.

The Dria retriever also exposes the underlying [Dria client](https://npmjs.com/package/dria) through its `driaClient` property. Refer to the [Dria documentation](https://github.com/firstbatchxyz/dria-js-client?tab=readme-ov-file#usage) to learn more about the client.
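
As a rough illustration of the contract ID rule above, the ID can be read off the last path segment of a knowledge URL. The helper name `contractIdFromKnowledgeUrl` is hypothetical and is not part of the Dria client or LangChain. The sketch assumes URLs of the form `https://dria.co/knowledge/<contractId>` and assumes that contract IDs only use letters, digits, `-` and `_`, as in the examples in this PR.

```typescript
// Hypothetical helper (not part of `dria` or `@langchain/community`).
// Assumes the `https://dria.co/knowledge/<contractId>` layout and a
// URL-safe ID alphabet; both are assumptions based on the examples here.
function contractIdFromKnowledgeUrl(url: string): string {
  const match = /^https:\/\/dria\.co\/knowledge\/([A-Za-z0-9_-]+)\/?$/.exec(url);
  if (!match) {
    throw new Error(`Not a Dria knowledge URL: ${url}`);
  }
  return match[1];
}

const bitcoinWhitepaperId = contractIdFromKnowledgeUrl(
  "https://dria.co/knowledge/2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0"
);
console.log(bitcoinWhitepaperId);
```

The resulting string is what would be passed as `contractId` to the retriever.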

## Usage

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install dria @langchain/community
```

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/dria.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
14 changes: 14 additions & 0 deletions examples/src/retrievers/dria.ts
@@ -0,0 +1,14 @@
import { DriaRetriever } from "@langchain/community/retrievers/dria";

Hey there! I've reviewed the code and noticed that the new changes introduce a net-new external HTTP request using fetch or axios when calling the getRelevantDocuments method. I've flagged this for your review to ensure it aligns with the project's requirements. Let me know if you have any questions or need further clarification!


Great work on the PR! I've flagged this change for review as it appears to explicitly access an environment variable via process.env or getEnvironmentVariable for the DRIA_API_KEY. Please review and ensure it aligns with our security and best practices.


// contract of TypeScript Handbook v4.9 uploaded to Dria
// https://dria.co/knowledge/-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0
const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";

const retriever = new DriaRetriever({
  contractId, // a knowledge to connect to
  apiKey: "DRIA_API_KEY", // if not provided, will check env for `DRIA_API_KEY`
  topK: 15, // optional: default value is 10
});

const docs = await retriever.getRelevantDocuments("What is a union type?");
console.log(docs);
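
The `apiKey` comment in the example describes a fallback order: an explicit key first, then the `DRIA_API_KEY` environment variable. The following is a minimal standalone sketch of that order. `resolveApiKey` and `Env` are illustrative names, not exports of `@langchain/community`, and the environment is passed in as a plain object so the sketch runs anywhere.

```typescript
// Illustrative stand-in for an environment lookup such as `process.env`.
type Env = Record<string, string | undefined>;

// Explicit key wins; otherwise fall back to DRIA_API_KEY; otherwise fail.
function resolveApiKey(explicit: string | undefined, env: Env): string {
  const apiKey = explicit ?? env.DRIA_API_KEY;
  if (!apiKey) throw new Error("Missing DRIA_API_KEY.");
  return apiKey;
}

const fromArgument = resolveApiKey("key-from-code", { DRIA_API_KEY: "key-from-env" });
const fromEnv = resolveApiKey(undefined, { DRIA_API_KEY: "key-from-env" });
console.log(fromArgument, fromEnv);
```

Because `??` only falls back on `null` or `undefined`, an empty-string key passed explicitly does not fall through to the environment and ends in the error.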
4 changes: 4 additions & 0 deletions libs/langchain-community/.gitignore
@@ -506,6 +506,10 @@ retrievers/databerry.cjs
retrievers/databerry.js
retrievers/databerry.d.ts
retrievers/databerry.d.cts
retrievers/dria.cjs
retrievers/dria.js
retrievers/dria.d.ts
retrievers/dria.d.cts
retrievers/metal.cjs
retrievers/metal.js
retrievers/metal.d.ts
7 changes: 4 additions & 3 deletions libs/langchain-community/langchain.config.js
@@ -9,9 +9,8 @@ function abs(relativePath) {
return resolve(dirname(fileURLToPath(import.meta.url)), relativePath);
}


export const config = {
-internals:[
+internals: [
/node\:/,
/@langchain\/core\//,
"convex",
@@ -161,6 +160,7 @@ export const config = {
"retrievers/amazon_knowledge_base": "retrievers/amazon_knowledge_base",
"retrievers/chaindesk": "retrievers/chaindesk",
"retrievers/databerry": "retrievers/databerry",
"retrievers/dria": "retrievers/dria",
"retrievers/metal": "retrievers/metal",
"retrievers/remote": "retrievers/remote/index",
"retrievers/supabase": "retrievers/supabase",
@@ -298,6 +298,7 @@ export const config = {
"chat_models/iflytek_xinghuo/web",
"retrievers/amazon_kendra",
"retrievers/amazon_knowledge_base",
"retrievers/dria",
"retrievers/metal",
"retrievers/supabase",
"retrievers/vectara_summary",
@@ -340,4 +341,4 @@ export const config = {
cjsSource: "./dist-cjs",
cjsDestination: "./dist",
abs,
-}
+};
18 changes: 18 additions & 0 deletions libs/langchain-community/package.json
@@ -121,6 +121,7 @@
"discord.js": "^14.14.1",

Hey there! I noticed that the latest PR adds a new dependency "dria" to the package.json file. This change is flagged for maintainers to review the addition of this regular dependency. Great work, and looking forward to the review!

"dotenv": "^16.0.3",
"dpdm": "^3.12.0",
"dria": "^0.0.3",
"eslint": "^8.33.0",
"eslint-config-airbnb-base": "^15.0.0",
"eslint-config-prettier": "^8.6.0",
@@ -218,6 +219,7 @@
"cohere-ai": "*",
"convex": "^1.3.1",
"discord.js": "^14.14.1",
"dria": "^0.0.3",
"faiss-node": "^0.5.1",
"firebase-admin": "^11.9.0",
"google-auth-library": "^8.9.0",
@@ -405,6 +407,9 @@
"discord.js": {
"optional": true
},
"dria": {
"optional": true
},
"faiss-node": {
"optional": true
},
@@ -1640,6 +1645,15 @@
"import": "./retrievers/databerry.js",
"require": "./retrievers/databerry.cjs"
},
"./retrievers/dria": {
"types": {
"import": "./retrievers/dria.d.ts",
"require": "./retrievers/dria.d.cts",
"default": "./retrievers/dria.d.ts"
},
"import": "./retrievers/dria.js",
"require": "./retrievers/dria.cjs"
},
"./retrievers/metal": {
"types": {
"import": "./retrievers/metal.d.ts",
@@ -2539,6 +2553,10 @@
"retrievers/databerry.js",
"retrievers/databerry.d.ts",
"retrievers/databerry.d.cts",
"retrievers/dria.cjs",
"retrievers/dria.js",
"retrievers/dria.d.ts",
"retrievers/dria.d.cts",
"retrievers/metal.cjs",
"retrievers/metal.js",
"retrievers/metal.d.ts",
122 changes: 122 additions & 0 deletions libs/langchain-community/src/retrievers/dria.ts
@@ -0,0 +1,122 @@
import {

Hey team, just a heads up that I've flagged this PR for review because it introduces code that explicitly adds, accesses, reads, and requires an environment variable via getEnvironmentVariable. It's important to ensure that the handling of environment variables aligns with best practices and security considerations. Great work on the code changes!

  BaseRetriever,
  type BaseRetrieverInput,
} from "@langchain/core/retrievers";
import { Document } from "@langchain/core/documents";
import { getEnvironmentVariable } from "@langchain/core/utils/env";
import type { DriaParams, SearchOptions as DriaSearchOptions } from "dria";
import { Dria } from "dria";

/**
 * Configurations for Dria retriever.
 *
 * - `contractId`: a Dria knowledge's contract ID.
 * - `apiKey`: a Dria API key; if omitted, the retriever will check for `DRIA_API_KEY` environment variable.
 *
 * The retrieval can be configured with the following options:
 *
 * - `topK`: number of results to return, max 20. (default: 10)
 * - `rerank`: re-rank the results from most to least semantically relevant to the given search query. (default: true)
 * - `level`: level of detail for the search, must be an integer from 0 to 5 (inclusive). (default: 1)
 * - `field`: CSV field name, only relevant for the CSV files.
 */
export interface DriaRetrieverArgs
  extends DriaParams,
    BaseRetrieverInput,
    DriaSearchOptions {}

/**
 * Class for retrieving documents from knowledge uploaded to Dria.
 *
 * @example
 * ```typescript
 * // contract of TypeScript Handbook v4.9 uploaded to Dria
 * const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";
 * const retriever = new DriaRetriever({ contractId });
 *
 * const docs = await retriever.getRelevantDocuments("What is a union type?");
 * console.log(docs);
 * ```
 */
export class DriaRetriever extends BaseRetriever {
  static lc_name() {
    return "DriaRetriever";
  }

  lc_namespace = ["langchain", "retrievers", "dria"];

  get lc_secrets() {
    return { apiKey: "DRIA_API_KEY" };
  }

  get lc_aliases() {
    return { apiKey: "api_key" };
  }

  apiKey: string;

  public driaClient: Dria;

  private searchOptions: DriaSearchOptions;

  constructor(fields: DriaRetrieverArgs) {
    super(fields);

    const apiKey = fields.apiKey ?? getEnvironmentVariable("DRIA_API_KEY");
    if (!apiKey) throw new Error("Missing DRIA_API_KEY.");
    this.apiKey = apiKey;

    this.searchOptions = {
      topK: fields.topK,
      field: fields.field,
      rerank: fields.rerank,
      level: fields.level,
    };

    this.driaClient = new Dria({
      contractId: fields.contractId,
      apiKey: this.apiKey,
    });
  }

  /**
   * Currently connected knowledge on Dria.
   *
   * Retriever will use this contract ID while retrieving documents,
   * and will throw an error if `undefined`.
   *
   * In the case that this is `undefined`, the user is expected to
   * set contract ID manually, such as after creating a new knowledge & inserting
   * data there with the Dria client.
   */
  get contractId(): string | undefined {
    return this.driaClient.contractId;
  }

  set contractId(value: string) {
    this.driaClient.contractId = value;
  }

  /**
   * Retrieves documents from Dria with respect to the configured contract ID, based on
   * the given query string.
   *
   * @param query The query string
   * @returns A promise that resolves to an array of documents, with page content as text,
   * along with `id` and the relevance `score` within the metadata.
   */
  async _getRelevantDocuments(query: string): Promise<Document[]> {
    const docs = await this.driaClient.search(query, this.searchOptions);
    return docs.map(
      (d) =>
        new Document({
          // dria.search returns a string within the metadata as the content
          pageContent: d.metadata,
          metadata: {
            id: d.id,
            score: d.score,
          },
        })
    );
  }
}
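
The JSDoc on `DriaRetrieverArgs` documents limits for `topK` (max 20) and `level` (integer from 0 to 5), while the retriever in this diff forwards the options to the client unchanged. Whether the client validates them is not shown here. A caller who wants to fail early could run a check along these lines. This is a sketch only: `SearchOptionsSketch` and `checkSearchOptions` are hypothetical names, and the lower bound of 1 for `topK` is an assumption, since the source only states the maximum.

```typescript
// Local stand-in for the documented search options; the real type is
// `SearchOptions` from the `dria` package.
interface SearchOptionsSketch {
  topK?: number;
  rerank?: boolean;
  level?: number;
  field?: string;
}

// Hypothetical pre-flight check: returns a list of problems (empty if none).
function checkSearchOptions(options: SearchOptionsSketch): string[] {
  const problems: string[] = [];
  const { topK, level } = options;
  // Upper bound of 20 is documented; the lower bound of 1 is an assumption.
  if (topK !== undefined && (!Number.isInteger(topK) || topK < 1 || topK > 20)) {
    problems.push("topK must be an integer between 1 and 20");
  }
  // Documented: integer from 0 to 5, inclusive.
  if (level !== undefined && (!Number.isInteger(level) || level < 0 || level > 5)) {
    problems.push("level must be an integer from 0 to 5");
  }
  return problems;
}

console.log(checkSearchOptions({ topK: 15, level: 1 }));
console.log(checkSearchOptions({ topK: 25, level: 2.5 }));
```

Omitted options are left alone so that the documented defaults still apply.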
16 changes: 16 additions & 0 deletions libs/langchain-community/src/retrievers/tests/dria.int.test.ts
@@ -0,0 +1,16 @@
import { test, expect } from "@jest/globals";
import { DriaRetriever } from "../dria.js";

test.skip("DriaRetriever", async () => {
  // contract of TypeScript Handbook v4.9 uploaded to Dria
  // https://dria.co/knowledge/-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0
  const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";
  const topK = 10;

  const retriever = new DriaRetriever({ contractId, topK });

  const docs = await retriever.getRelevantDocuments("What is a union type?");
  expect(docs.length).toBe(topK);

  console.log(docs[0].pageContent);
});
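
The integration test above is skipped (`test.skip`) and, when enabled, needs network access and a valid API key. The result-to-document mapping can still be exercised offline with canned data, as in the sketch below. The types are local stand-ins so that it runs without `dria` or `@langchain/core`. The result shape (`id`, `score`, and `metadata` holding the text) mirrors how `_getRelevantDocuments` reads it in this PR, and the type of `id` is an assumption.

```typescript
// Local stand-ins; not the real `dria` or `@langchain/core` types.
interface SearchResultSketch {
  id: string | number; // actual type in the client is an assumption
  score: number;
  metadata: string; // the client returns the text content here
}

interface DocumentSketch {
  pageContent: string;
  metadata: { id: string | number; score: number };
}

// Same mapping as in `_getRelevantDocuments`, on plain objects.
function toDocuments(results: SearchResultSketch[]): DocumentSketch[] {
  return results.map((r) => ({
    pageContent: r.metadata,
    metadata: { id: r.id, score: r.score },
  }));
}

// Canned results standing in for a real search response.
const cannedResults: SearchResultSketch[] = [
  { id: 1, score: 0.9, metadata: "A union type is formed from two or more types." },
  { id: 2, score: 0.4, metadata: "Interfaces describe the shape of an object." },
];

const offlineDocs = toDocuments(cannedResults);
console.log(offlineDocs[0].pageContent);
```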
35 changes: 35 additions & 0 deletions yarn.lock
@@ -8932,6 +8932,7 @@ __metadata:
discord.js: ^14.14.1
dotenv: ^16.0.3
dpdm: ^3.12.0
dria: ^0.0.3
eslint: ^8.33.0
eslint-config-airbnb-base: ^15.0.0
eslint-config-prettier: ^8.6.0
@@ -9032,6 +9033,7 @@ __metadata:
cohere-ai: "*"
convex: ^1.3.1
discord.js: ^14.14.1
dria: ^0.0.3
faiss-node: ^0.5.1
firebase-admin: ^11.9.0
google-auth-library: ^8.9.0
@@ -9166,6 +9168,8 @@ __metadata:
optional: true
discord.js:
optional: true
dria:
optional: true
faiss-node:
optional: true
firebase-admin:
@@ -16246,6 +16250,17 @@ __metadata:
languageName: node
linkType: hard

"axios@npm:^1.6.5":
version: 1.6.7
resolution: "axios@npm:1.6.7"
dependencies:
follow-redirects: ^1.15.4
form-data: ^4.0.0
proxy-from-env: ^1.1.0
checksum: 87d4d429927d09942771f3b3a6c13580c183e31d7be0ee12f09be6d5655304996bb033d85e54be81606f4e89684df43be7bf52d14becb73a12727bf33298a082
languageName: node
linkType: hard

"axobject-query@npm:^3.1.1, axobject-query@npm:^3.2.1":
version: 3.2.1
resolution: "axobject-query@npm:3.2.1"
@@ -19369,6 +19384,16 @@ __metadata:
languageName: node
linkType: hard

"dria@npm:^0.0.3":
version: 0.0.3
resolution: "dria@npm:0.0.3"
dependencies:
axios: ^1.6.5
zod: ^3.22.4
checksum: 69d66479cb015e87425fba7f1741e4d895b4f43844e6b8897d3e7fe38e579097f2c4673d534141419d15a72866adf4db12acb9b59b42681ef2c4ee2d301b9267
languageName: node
linkType: hard

"duck@npm:^0.1.12":
version: 0.1.12
resolution: "duck@npm:0.1.12"
@@ -21593,6 +21618,16 @@ __metadata:
languageName: node
linkType: hard

"follow-redirects@npm:^1.15.4":
version: 1.15.5
resolution: "follow-redirects@npm:1.15.5"
peerDependenciesMeta:
debug:
optional: true
checksum: 5ca49b5ce6f44338cbfc3546823357e7a70813cecc9b7b768158a1d32c1e62e7407c944402a918ea8c38ae2e78266312d617dc68783fac502cbb55e1047b34ec
languageName: node
linkType: hard

"for-each@npm:^0.3.3":
version: 0.3.3
resolution: "for-each@npm:0.3.3"