community[minor]: Add Dria retriever #4302
@@ -0,0 +1,39 @@
---
hide_table_of_contents: true
---

# Dria Retriever

The [Dria](https://dria.co/profile) retriever allows an agent to perform a text-based search across a comprehensive knowledge hub.

## Setup

To use the Dria retriever, first install the Dria JS client:

```bash npm2yarn
npm install dria
```

You need to provide two things to the retriever:

- **API Key**: you can get yours at your [profile page](https://dria.co/profile) when you create an account.
- **Contract ID**: accessible at the top of the page when viewing a knowledge or in its URL.
  For example, the Bitcoin whitepaper is uploaded on Dria at https://dria.co/knowledge/2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0, so its contract ID is `2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0`.
  The contract ID can be omitted during instantiation and set later via `dria.contractId = "your-contract"`.

The Dria retriever also exposes the underlying [Dria client](https://npmjs.com/package/dria); refer to the [Dria documentation](https://github.com/firstbatchxyz/dria-js-client?tab=readme-ov-file#usage) to learn more about the client.

## Usage

import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

```bash npm2yarn
npm install dria @langchain/community
```

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/dria.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
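
The Setup text in this file says a contract ID can be read from the URL of a knowledge. A minimal sketch of that extraction follows. `contractIdFromUrl` is a hypothetical helper, not part of the Dria client or LangChain, and it assumes the URL has the form `https://dria.co/knowledge/<contract-id>`, as in the Bitcoin whitepaper example.

```typescript
// Hypothetical helper (not part of `dria` or `@langchain/community`):
// pulls the contract ID out of a Dria knowledge URL.
function contractIdFromUrl(url: string): string | undefined {
  // Assumption: the ID is the path segment right after /knowledge/.
  const match = /^https?:\/\/dria\.co\/knowledge\/([^\/?#]+)/.exec(url);
  return match ? match[1] : undefined;
}

const exampleId = contractIdFromUrl(
  "https://dria.co/knowledge/2KxNbEb040GKQ1DSDNDsA-Fsj_BlQIEAlzBNuiapBR0"
);
console.log(exampleId);
```

URLs that do not point at a knowledge page, such as the profile page, yield `undefined` rather than a guessed value.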
@@ -0,0 +1,14 @@
import { DriaRetriever } from "@langchain/community/retrievers/dria";

// contract of TypeScript Handbook v4.9 uploaded to Dria
// https://dria.co/knowledge/-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0
const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";

const retriever = new DriaRetriever({
  contractId, // a knowledge to connect to
  apiKey: "DRIA_API_KEY", // if not provided, will check env for `DRIA_API_KEY`
  topK: 15, // optional: default value is 10
});

const docs = await retriever.getRelevantDocuments("What is a union type?");
console.log(docs);

Review comment: Great work on the PR! I've flagged this change for review as it appears to explicitly access an environment variable via
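
The inline comment on `apiKey` in this example says the retriever falls back to the `DRIA_API_KEY` environment variable when no key is passed. The sketch below mirrors that fallback in isolation. `resolveApiKey` and its `env` parameter are hypothetical names introduced for illustration (the PR itself reads the variable with `getEnvironmentVariable` from `@langchain/core/utils/env`), and the environment is passed in as a plain object so the snippet has no runtime dependencies.

```typescript
// Hypothetical stand-in for the retriever's key lookup; not the PR's code.
type Env = Record<string, string | undefined>;

function resolveApiKey(explicit: string | undefined, env: Env): string {
  // Prefer the explicitly passed key, then fall back to the environment.
  const apiKey = explicit ?? env["DRIA_API_KEY"];
  if (!apiKey) throw new Error("Missing DRIA_API_KEY.");
  return apiKey;
}

// An explicit key wins over the environment value.
console.log(resolveApiKey("key-from-args", { DRIA_API_KEY: "key-from-env" }));
// With no explicit key, the environment value is used.
console.log(resolveApiKey(undefined, { DRIA_API_KEY: "key-from-env" }));
```

Because `??` only falls through on `null` or `undefined`, an empty-string key is not replaced by the environment value and triggers the error instead.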
@@ -121,6 +121,7 @@
    "discord.js": "^14.14.1",
    "dotenv": "^16.0.3",
    "dpdm": "^3.12.0",
    "dria": "^0.0.3",
    "eslint": "^8.33.0",
    "eslint-config-airbnb-base": "^15.0.0",
    "eslint-config-prettier": "^8.6.0",

@@ -218,6 +219,7 @@
    "cohere-ai": "*",
    "convex": "^1.3.1",
    "discord.js": "^14.14.1",
    "dria": "^0.0.3",
    "faiss-node": "^0.5.1",
    "firebase-admin": "^11.9.0",
    "google-auth-library": "^8.9.0",

@@ -405,6 +407,9 @@
    "discord.js": {
      "optional": true
    },
    "dria": {
      "optional": true
    },
    "faiss-node": {
      "optional": true
    },

@@ -1640,6 +1645,15 @@
      "import": "./retrievers/databerry.js",
      "require": "./retrievers/databerry.cjs"
    },
    "./retrievers/dria": {
      "types": {
        "import": "./retrievers/dria.d.ts",
        "require": "./retrievers/dria.d.cts",
        "default": "./retrievers/dria.d.ts"
      },
      "import": "./retrievers/dria.js",
      "require": "./retrievers/dria.cjs"
    },
    "./retrievers/metal": {
      "types": {
        "import": "./retrievers/metal.d.ts",

@@ -2539,6 +2553,10 @@
    "retrievers/databerry.js",
    "retrievers/databerry.d.ts",
    "retrievers/databerry.d.cts",
    "retrievers/dria.cjs",
    "retrievers/dria.js",
    "retrievers/dria.d.ts",
    "retrievers/dria.d.cts",
    "retrievers/metal.cjs",
    "retrievers/metal.js",
    "retrievers/metal.d.ts",

Review comment: Hey there! I noticed that the latest PR adds a new dependency "dria" to the package.json file. This change is flagged for maintainers to review the addition of this regular dependency. Great work, and looking forward to the review!
@@ -0,0 +1,122 @@
import {
  BaseRetriever,
  type BaseRetrieverInput,
} from "@langchain/core/retrievers";
import { Document } from "@langchain/core/documents";
import { getEnvironmentVariable } from "@langchain/core/utils/env";
import type { DriaParams, SearchOptions as DriaSearchOptions } from "dria";
import { Dria } from "dria";

/**
 * Configurations for Dria retriever.
 *
 * - `contractId`: a Dria knowledge's contract ID.
 * - `apiKey`: a Dria API key; if omitted, the retriever will check for `DRIA_API_KEY` environment variable.
 *
 * The retrieval can be configured with the following options:
 *
 * - `topK`: number of results to return, max 20. (default: 10)
 * - `rerank`: re-rank the results from most to least semantically relevant to the given search query. (default: true)
 * - `level`: level of detail for the search, must be an integer from 0 to 5 (inclusive). (default: 1)
 * - `field`: CSV field name, only relevant for the CSV files.
 */
export interface DriaRetrieverArgs
  extends DriaParams,
    BaseRetrieverInput,
    DriaSearchOptions {}

/**
 * Class for retrieving documents from knowledge uploaded to Dria.
 *
 * @example
 * ```typescript
 * // contract of TypeScript Handbook v4.9 uploaded to Dria
 * const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";
 * const retriever = new DriaRetriever({ contractId });
 *
 * const docs = await retriever.getRelevantDocuments("What is a union type?");
 * console.log(docs);
 * ```
 */
export class DriaRetriever extends BaseRetriever {
  static lc_name() {
    return "DriaRetriever";
  }

  lc_namespace = ["langchain", "retrievers", "dria"];

  get lc_secrets() {
    return { apiKey: "DRIA_API_KEY" };
  }

  get lc_aliases() {
    return { apiKey: "api_key" };
  }

  apiKey: string;

  public driaClient: Dria;

  private searchOptions: DriaSearchOptions;

  constructor(fields: DriaRetrieverArgs) {
    super(fields);

    const apiKey = fields.apiKey ?? getEnvironmentVariable("DRIA_API_KEY");
    if (!apiKey) throw new Error("Missing DRIA_API_KEY.");
    this.apiKey = apiKey;

    this.searchOptions = {
      topK: fields.topK,
      field: fields.field,
      rerank: fields.rerank,
      level: fields.level,
    };

    this.driaClient = new Dria({
      contractId: fields.contractId,
      apiKey: this.apiKey,
    });
  }

  /**
   * Currently connected knowledge on Dria.
   *
   * Retriever will use this contract ID while retrieving documents,
   * and will throw an error if `undefined`.
   *
   * In the case that this is `undefined`, the user is expected to
   * set contract ID manually, such as after creating a new knowledge & inserting
   * data there with the Dria client.
   */
  get contractId(): string | undefined {
    return this.driaClient.contractId;
  }

  set contractId(value: string) {
    this.driaClient.contractId = value;
  }

  /**
   * Retrieves documents from Dria with respect to the configured contract ID, based on
   * the given query string.
   *
   * @param query The query string
   * @returns A promise that resolves to an array of documents, with page content as text,
   * along with `id` and the relevance `score` within the metadata.
   */
  async _getRelevantDocuments(query: string): Promise<Document[]> {
    const docs = await this.driaClient.search(query, this.searchOptions);
    return docs.map(
      (d) =>
        new Document({
          // dria.search returns a string within the metadata as the content
          pageContent: d.metadata,
          metadata: {
            id: d.id,
            score: d.score,
          },
        })
    );
  }
}

Review comment: Hey team, just a heads up that I've flagged this PR for review because it introduces code that explicitly adds, accesses, reads, and requires an environment variable via
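
The doc comment on `DriaRetrieverArgs` lists bounds for the search options (`topK` up to 20, `level` an integer from 0 to 5), while the constructor passes the values through unchanged. Whether the Dria client checks them itself is not visible in this diff. If early validation were wanted, it might look like the sketch below. `SearchOptionsSketch` and `validateSearchOptions` are hypothetical names, and the lower bound of 1 for `topK` is an assumption, since only the maximum is documented.

```typescript
// Local stand-in for the options described in the doc comment;
// the real type is `SearchOptions` from the `dria` package.
interface SearchOptionsSketch {
  topK?: number;
  rerank?: boolean;
  level?: number;
  field?: string;
}

// Hypothetical validator: returns a list of problems, empty when valid.
function validateSearchOptions(options: SearchOptionsSketch): string[] {
  const problems: string[] = [];
  const { topK, level } = options;
  // Assumption: topK must be a positive integer; only the max of 20 is documented.
  if (topK !== undefined && (!Number.isInteger(topK) || topK < 1 || topK > 20)) {
    problems.push("topK must be an integer from 1 to 20");
  }
  // Documented: level is an integer from 0 to 5 (inclusive).
  if (level !== undefined && (!Number.isInteger(level) || level < 0 || level > 5)) {
    problems.push("level must be an integer from 0 to 5");
  }
  return problems;
}

console.log(validateSearchOptions({ topK: 15, level: 1 }));
console.log(validateSearchOptions({ topK: 25, level: 2.5 }));
```

Returning a list of problems instead of throwing on the first one lets a caller report every invalid option at once; omitted options are left to the client's defaults.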
@@ -0,0 +1,16 @@
import { test, expect } from "@jest/globals";
import { DriaRetriever } from "../dria.js";

test.skip("DriaRetriever", async () => {
  // contract of TypeScript Handbook v4.9 uploaded to Dria
  // https://dria.co/knowledge/-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0
  const contractId = "-B64DjhUtCwBdXSpsRytlRQCu-bie-vSTvTIT8Ap3g0";
  const topK = 10;

  const retriever = new DriaRetriever({ contractId, topK });

  const docs = await retriever.getRelevantDocuments("What is a union type?");
  expect(docs.length).toBe(topK);

  console.log(docs[0].pageContent);
});
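
The test in this file is skipped, presumably because it needs a live API key and network access (an inference, not something the PR states). The mapping from search results to documents can still be exercised offline. The sketch below uses local stand-in types: `SearchResultSketch`, `DocumentSketch`, and `toDocuments` are hypothetical names, and the result shape (`id`, `score`, string `metadata`) is inferred from `_getRelevantDocuments` in this PR, with the type of `id` being a guess.

```typescript
// Assumed shape of one Dria search result, inferred from the retriever code.
interface SearchResultSketch {
  id: string | number; // assumption: the diff does not show the type of `id`
  score: number;
  metadata: string; // the retriever uses this string as the page content
}

// Minimal stand-in for LangChain's Document, limited to the fields used here.
interface DocumentSketch {
  pageContent: string;
  metadata: { id: string | number; score: number };
}

// Same mapping as in `_getRelevantDocuments`, written as a pure function.
function toDocuments(results: SearchResultSketch[]): DocumentSketch[] {
  return results.map((r) => ({
    pageContent: r.metadata,
    metadata: { id: r.id, score: r.score },
  }));
}

// Fake results standing in for a network call.
const fakeResults: SearchResultSketch[] = [
  { id: 1, score: 0.91, metadata: "A union type is formed from two or more types." },
  { id: 2, score: 0.72, metadata: "Narrowing refines a union to a specific member." },
];

const mappedDocs = toDocuments(fakeResults);
console.log(mappedDocs.map((d) => d.pageContent));
```

A pure mapping function like this can be asserted on directly, leaving only the network call itself to a skipped or integration-only test.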
Review comment: Hey there! I've reviewed the code and noticed that the new changes introduce a net-new external HTTP request using fetch or axios when calling the `getRelevantDocuments` method. I've flagged this for your review to ensure it aligns with the project's requirements. Let me know if you have any questions or need further clarification!