fix: provider the latest duckduckgo_search API #5184
Conversation
From the duckduckgo_search README: "Attention: Versions before v2.9.4 no longer work as of May 12, 2023."
@vowelparrot I wonder why …
The …
@vowelparrot I got the following problem. Do you have any ideas on how to solve it?
I'll try to get back to this later - opened a PR on the weaviate client to bump the upper bound. The dependency resolution issue seems fairly common: python-poetry/poetry#697
Oh I see. It's a very common problem.
We can harness the following pattern:

```python
for i, res in enumerate(results, 1):
    snippets.append(res['body'])
    if i == self.max_results:
        break
return snippets
...
for i, res in enumerate(results, 1):
    yield res['body']
    if i == self.max_results:
        return
...
```

In the latest version of duckduckgo_search, it is recommended to use a context manager:

```python
with DDGS() as ddgs:
    ddgs.text(...)
```
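For concreteness, a minimal sketch of how a `get_snippets`-style helper could be adapted to the new API (assuming duckduckgo_search 3.x, where `DDGS.text()` yields result dicts with a `body` field; the function name and defaults here are illustrative, not the PR's exact code):

```python
from typing import List

from duckduckgo_search import DDGS


def get_snippets(query: str, max_results: int = 4, timelimit: str = "y") -> List[str]:
    """Collect up to max_results text snippets for a query."""
    snippets: List[str] = []
    with DDGS() as ddgs:
        # DDGS.text() returns a generator of result dicts.
        results = ddgs.text(query, region="wt-wt", safesearch="moderate", timelimit=timelimit)
        for i, res in enumerate(results, 1):
            snippets.append(res["body"])
            if i == max_results:
                break
    return snippets
```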
OK, I'll fix it later. Previously, due to dependency version conflicts, I was unable to pass the tests.
# Fix for docstring in faiss.py vectorstore (load_local) The docstring should reflect that load_local loads something FROM the disk.
…-ai#5449) This removes duplicate code presumably introduced by a cut-and-paste error, spotted while reviewing the code in `langchain/client/langchain.py`. The original code had back-to-back occurrences of the following code block:

```python
response = self._get(
    path,
    params=params,
)
raise_for_status_with_text(response)
```
# What does this PR do?
Brings support for `encode_kwargs` to `HuggingFaceInstructEmbeddings`, changes the docstring example, and adds a test to illustrate with `normalize_embeddings`. Fixes langchain-ai#3605 (similar to langchain-ai#3914). Use case:

```python
from langchain.embeddings import HuggingFaceInstructEmbeddings

model_name = "hkunlp/instructor-large"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
hf = HuggingFaceInstructEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
```
…#5432) # Handles the edge scenario in which the action input is a well-formed SQL query which ends with a quoted column
There may be a cleaner option here (or indeed other edge scenarios), but this seems to robustly determine whether the action input is likely to be a well-formed SQL query, in which case we don't want to arbitrarily trim off `"` characters. Fixes langchain-ai#5423
## Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
- @hwchase17 - project lead
- Agents / Tools / Toolkits - @vowelparrot
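Purely to illustrate the idea, a hypothetical heuristic (not this PR's actual code): treat the input as a well-formed SQL query when it starts with SELECT and ends with a double-quoted identifier, and skip quote-trimming in that case.

```python
import re


def ends_with_quoted_column(action_input: str) -> bool:
    # Hypothetical check: starts with SELECT and ends with a double-quoted
    # identifier, e.g. ... ORDER BY "last name"
    stripped = action_input.strip()
    return bool(re.match(r'(?is)^select\b.*"[^"]+"$', stripped))


text = 'SELECT "first name" FROM users ORDER BY "last name"'
if not ends_with_quoted_column(text):
    text = text.strip('"')  # only trim stray quotes for non-SQL-looking inputs
print(text)
```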
# docs cleaning
Changed docs to a consistent format (probably, we need an official doc integration template):
- ClearML - added product descriptions; changed title/headers
- Rebuff - added product descriptions; changed title/headers
- WhyLabs - added product descriptions; changed title/headers
- Docugami - changed title/headers/structure
- Airbyte - fixed title
- Wolfram Alpha - added descriptions, fixed title
- OpenWeatherMap - added product descriptions; changed title/headers
- Unstructured - changed description

## Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049
# Added Async _acall to FakeListLLM FakeListLLM is handy when unit testing apps built with langchain. This allows the use of FakeListLLM inside concurrent code with [asyncio](https://docs.python.org/3/library/asyncio.html). I also updated the docstring, which was out of date. ## Who can review? @hwchase17 - project lead @agola11 - async
# Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.
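A rough usage sketch of the new parameter (the in-memory location and `FakeEmbeddings` are assumptions for illustration; only `batch_size` is the option this commit introduces):

```python
from langchain.embeddings import FakeEmbeddings
from langchain.vectorstores import Qdrant

texts = [f"document number {i}" for i in range(1000)]

# Documents are uploaded in batches of `batch_size` instead of one large request.
qdrant = Qdrant.from_texts(
    texts,
    FakeEmbeddings(size=32),
    location=":memory:",      # assumed: in-memory Qdrant for local experimentation
    collection_name="demo",
    batch_size=128,
)
```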
Update [psychicapi](https://pypi.org/project/psychicapi/) python package dependency to the latest version 0.5. The newest python package version addresses breaking changes in the Psychic http api.
# Add maximal relevance search to SKLearnVectorStore This PR implements maximal relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I also submitted the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
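A hedged usage sketch, assuming this maps onto the standard `max_marginal_relevance_search` method of the `VectorStore` interface (the texts and `FakeEmbeddings` below are placeholders):

```python
from langchain.embeddings import FakeEmbeddings
from langchain.vectorstores import SKLearnVectorStore

store = SKLearnVectorStore.from_texts(
    ["apple pie recipe", "apple tart recipe", "banana bread recipe"],
    FakeEmbeddings(size=32),
)

# MMR trades off similarity to the query against diversity among results.
docs = store.max_marginal_relevance_search("apple dessert", k=2, fetch_k=3)
```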
…loader (langchain-ai#5466) # Adds ability to specify credentials when using Google BigQuery as a data loader Fixes langchain-ai#5465. Adds the ability to set credentials, which must be of the `google.auth.credentials.Credentials` type. This argument is optional and will default to `None`. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
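A sketch of how the new argument might be used (the service-account file path and query are placeholders; `credentials` must be a `google.auth.credentials.Credentials` instance as described above):

```python
from google.oauth2 import service_account
from langchain.document_loaders import BigQueryLoader

# Load explicit credentials instead of relying on application-default credentials.
credentials = service_account.Credentials.from_service_account_file("/path/to/key.json")

loader = BigQueryLoader(
    query="SELECT title, body FROM `my_project.my_dataset.articles`",
    project="my_project",
    credentials=credentials,
)
docs = loader.load()
```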
…the class BooleanOutputParser (langchain-ai#5397) When the LLM outputs 'yes|no', BooleanOutputParser can now parse it to 'True|False', fixing the ValueError in parse(). (Previously, when using BooleanOutputParser in chain_filter.py, an LLM output of 'yes|no' caused parse() to throw a ValueError.) Fixes langchain-ai#5396 --------- Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>
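For context, a minimal usage sketch of the parser this fixes (import path assumed from the module where `BooleanOutputParser` is defined):

```python
from langchain.output_parsers.boolean import BooleanOutputParser

parser = BooleanOutputParser()
# YES/NO style completions are mapped to Python booleans instead of raising ValueError.
print(parser.parse("YES"))  # True
print(parser.parse("NO"))   # False
```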
# Added support for modifying the number of threads in the GPT4All model
I have added the capability to modify the number of threads used by the GPT4All model. This allows users to adjust the model's parallel processing capabilities based on their specific requirements.
## Changes Made
- Updated the `validate_environment` method to set the number of threads for the GPT4All model using the `values["n_threads"]` parameter from the `GPT4All` class constructor.
## Context
Useful in scenarios where users want to optimize the model's performance by leveraging multi-threading capabilities. Please note that the `n_threads` parameter was included in the `GPT4All` class constructor but was previously unused. This change ensures that the specified number of threads is utilized by the model.
## Dependencies
There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package.
## Testing
Since this is a minor change, testing is not required.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
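A brief, hedged usage sketch (the model path is a placeholder; `n_threads` is the previously unused constructor parameter now wired through):

```python
from langchain.llms import GPT4All

# Ask the underlying GPT4All model to use 8 CPU threads.
llm = GPT4All(model="/path/to/ggml-model.bin", n_threads=8)
print(llm("Explain multithreading in one sentence."))
```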
# Allow for async use of SelfAskWithSearchChain Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
…bject (langchain-ai#5321) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class, allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials`:

```python
# Create Elasticsearch connection
es_connection = Elasticsearch(
    hosts=['https://es_cluster_url:port'],
    basic_auth=('user', 'password')
)

# Instantiate ElasticsearchEmbeddings using es_connection
embeddings = ElasticsearchEmbeddings.from_es_connection(
    model_id,
    es_connection,
)
```

I also added examples to the Elasticsearch Jupyter notebook. Fixes langchain-ai#5239 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# SQLite-backed Entity Memory Following the initiative of langchain-ai#2397, I think it would be helpful to be able to persist Entity Memory on disk by default. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
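A sketch of how this might look, assuming a SQLite-backed entity store class (the `SQLiteEntityStore` name and its `db_file` argument are assumptions based on the description, not confirmed by this commit message):

```python
from langchain.llms import OpenAI
from langchain.memory import ConversationEntityMemory
from langchain.memory.entity import SQLiteEntityStore

# Entities are persisted to a local SQLite file rather than kept only in memory.
entity_store = SQLiteEntityStore(db_file="entities.db")  # assumed constructor argument
memory = ConversationEntityMemory(llm=OpenAI(temperature=0), entity_store=entity_store)
```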
Co-authored-by: David Revillas <26328973+r3v1@users.noreply.github.com>
# Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
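A usage sketch of the new parameter (the in-memory location, metadata, and filter payload are illustrative assumptions):

```python
from langchain.embeddings import FakeEmbeddings
from langchain.vectorstores import Qdrant

qdrant = Qdrant.from_texts(
    ["Qdrant supports payload filtering", "LangChain integrates many vector stores"],
    FakeEmbeddings(size=32),
    metadatas=[{"source": "docs"}, {"source": "blog"}],
    location=":memory:",  # assumed: in-memory instance for illustration
)

# The additional `filter` param restricts results to points whose payload matches.
docs = qdrant.similarity_search("filtering", k=1, filter={"source": "docs"})
```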
Example (this would log several times if not for the helper fn; previously it would emit no logs due to multithreading): ![image](https://github.com/hwchase17/langchain/assets/130414180/070d25ae-1f06-4487-9617-0a6f66f3f01e)
# Introduces embaas document extraction api endpoints
In this PR, we add support for embaas document extraction endpoints to Text Embedding Models (with LLMs coming in different PRs). We currently offer the MTEB leaderboard top performers, and will continue to add top embedding models and soon add support for customers to deploy their own models. Additional documentation and information can be found [here](https://embaas.io). While developing this integration, I closely followed the patterns established by other langchain integrations. Nonetheless, if there are any aspects that require adjustments or if there's a better way to present a new integration, let me know! :) Additionally, I fixed some docs in the embeddings integration. Related PR: langchain-ai#5976
#### Who can review?
DataLoaders - @eyurtsev
Update the Run object in the tracer to extend the one in the SDK, so it includes the parameters necessary for tracking/tracing.
…gchain-ai#5833 (langchain-ai#6077) This PR fixes the error `ModuleNotFoundError: No module named 'langchain.cli'` Fixes langchain-ai#5833 (issue)
…in-ai#6069) Add test and update notebook for `MarkdownHeaderTextSplitter`.
langchain-ai#6056) This adds an implementation of MMR search in Pinecone, and I have two semi-related observations about this vector store class:
- Maybe we should also have a `similarity_search_by_vector_returning_embeddings` like in supabase, but it's not in the base `VectorStore` class so I didn't implement it
- Talking about the base class, there's `similarity_search_with_relevance_scores`, but in pinecone it is called `similarity_search_with_score`; maybe we should consider renaming it to align with other `VectorStore` base and sub classes (or add that as an alias for backward compatibility)

#### Who can review?
Tag maintainers/contributors who might be interested:
- VectorStores / Retrievers / Memory - @dev2049
Makes it easier to run evals without having to think about specifying a session.
Add a callback handler that can collect nested run objects. Useful for evaluation.
Minor fix in documentation. Change URL in wget call to proper one.
Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>
## Add Solidity programming language support for code splitter. Twitter: @0xjord4n_
#### Who can review?
Tag maintainers/contributors who might be interested: @hwchase17
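A short usage sketch, assuming Solidity is exposed through the existing `Language` enum of the text splitter (the enum member name `SOL` is an assumption):

```python
from langchain.text_splitter import Language, RecursiveCharacterTextSplitter

solidity_code = """
pragma solidity ^0.8.0;

contract HelloWorld {
    function greet() public pure returns (string memory) {
        return "Hello";
    }
}
"""

# Split the source using Solidity-aware separators.
splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.SOL,  # assumed enum member for Solidity
    chunk_size=128,
    chunk_overlap=0,
)
docs = splitter.create_documents([solidity_code])
```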
…ai#5922) The Confluence API supports different formats of page content. The storage format is the raw XML representation for storage. The view format is the HTML representation for viewing, with macros rendered as they would appear to users. Add the `content_format` parameter to `ConfluenceLoader.load()` to specify the content format; this is set to `ContentFormat.STORAGE` by default.
#### Who can review?
Tag maintainers/contributors who might be interested: @eyurtsev
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
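A usage sketch of the new parameter (URL and credentials are placeholders; `ContentFormat.VIEW` requests the rendered HTML rather than the default raw storage XML):

```python
from langchain.document_loaders import ConfluenceLoader
from langchain.document_loaders.confluence import ContentFormat

loader = ConfluenceLoader(
    url="https://example.atlassian.net/wiki",
    username="me@example.com",
    api_key="...",  # placeholder credentials
)

# content_format defaults to ContentFormat.STORAGE (raw XML); ask for the rendered view.
docs = loader.load(space_key="DOCS", content_format=ContentFormat.VIEW, limit=10)
```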
@Undertone0809 is attempting to deploy a commit to the LangChain Team on Vercel. A member of the Team first needs to authorize it.
Oh, I don't know why there are so many commits. Perhaps I should create a new PR: #6409
Provide the latest duckduckgo_search API
This commit touches two files related to DuckDuckGo query operations and upgrades the DuckDuckGo module to version 3.2.0. A suitable commit message could be "Upgrade DuckDuckGo module to version 3.2.0, including query operations". Specifically, in the duckduckgo_search.py file, a DDGS() class instance is newly added to replace the previous ddg() function, and the time parameter name in the get_snippets() and results() methods is changed from "time" to "timelimit" to accommodate recent changes. In the pyproject.toml file, the duckduckgo-search module is upgraded to version 3.2.0.
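To make the parameter rename concrete, here is a minimal sketch of a results-style helper against duckduckgo_search 3.x (not the PR's exact code; valid `timelimit` values are "d", "w", "m", and "y"):

```python
from typing import List

from duckduckgo_search import DDGS


def ddg_results(query: str, max_results: int = 4, timelimit: str = "y") -> List[dict]:
    """Return up to max_results raw DuckDuckGo results for the query."""
    results: List[dict] = []
    with DDGS() as ddgs:
        # `timelimit` replaces the old `time` argument and limits result recency.
        hits = ddgs.text(query, region="wt-wt", safesearch="moderate", timelimit=timelimit)
        for i, hit in enumerate(hits, 1):
            results.append(hit)
            if i == max_results:
                break
    return results
```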
Who can review?
@vowelparrot