forked from langchain-ai/langchain
Merged — Rebase from main. #1
aayush3011 merged 355 commits into users/garagundi/cosmosdbnosql from users/akataria/rebase on Aug 7, 2024
Conversation
add dynamic field feature to langchain_milvus; more unit tests, more robust. Plan to deprecate `metadata_field` in the future, because its function is the same as `enable_dynamic_field`, but the latter is a more advanced concept in Milvus. Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev>
- **cli: remove snapshot flag from pytest defaults** - **x** - **x**
…ngchain-ai#24499) - [ ] **PR title**: "experimental: Adding compatibility for OllamaFunctions with ImagePromptTemplate" - [ ] **PR message**: - **Description:** Removes the outdated `_convert_messages_to_ollama_messages` method override in the `OllamaFunctions` class to ensure that ollama multimodal models can be invoked with an image. - **Issue:** langchain-ai#24174 --------- Co-authored-by: Joel Akeret <joel.akeret@ti&m.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
**Issue:** currently the [ChatGooglePalm](https://python.langchain.com/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo) class is not parsed and is not presented in the "API Reference:" line. **PR:** [Fixed it](https://langchain-7n5k5wkfs-langchain.vercel.app/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo) by properly importing.
…angchain-ai#22779) #### Update (2): A single `UnstructuredLoader` is added to handle both local and api partitioning. This loader also handles single or multiple documents. #### Changes in `community`: Changes here do not affect users. In the initial process of using the SDK for the API Loaders, the Loaders in community were refactored. Other changes include: The `UnstructuredBaseLoader` has a new check to see if both `mode="paged"` and `chunking_strategy="by_page"`. It also now has `Element.element_id` added to the `Document.metadata`. `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such, now both directly inherit from `UnstructuredBaseLoader` and initialize their `file_path`/`file` attributes respectively and implement their own `_post_process_elements` methods. -------- #### Update: New SDK Loaders in a [partner package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo) are introduced to prevent breaking changes for users (see discussion below). ##### TODO: - [x] Test docstring examples -------- - **Description:** UnstructuredAPIFileIOLoader and UnstructuredAPIFileLoader calls to the unstructured api are now made using the unstructured-client sdk. - **New Dependencies:** unstructured-client - [x] **Add tests and docs**: If you're adding a new integration, please include - [x] a test for the integration, preferably unit tests that do not rely on network access, - [x] update the description in `docs/docs/integrations/providers/unstructured.mdx` - [x] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. 
- Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. TODO: - [x] Update https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api - `langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb` - The description here needs to indicate that users should install `unstructured-client` instead of `unstructured`. Read over closely to look for any other changes that need to be made. - [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to handle json responses from the API instead of just lists of elements. - This method may need to be overwritten by the API loaders instead of changing it in the `UnstructuredBaseLoader`. - [x] Update the documentation links in the class docstrings (the Unstructured documents have moved) - [x] Update Document.metadata to include `element_id` (see thread [here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419)) --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com>
…hain-ai#24514) Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration. ScrapingAnt is a web scraping API that allows extracting web page data into accessible and well-formatted markdown. Description: Added ScrapingAnt web loader for retrieving web page data as markdown Dependencies: scrapingant-client Twitter: @WeRunTheWorld3 --------- Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>
This PR introduces the following Runnables:

1. `BaseRateLimiter`: an abstraction for specifying a time-based rate limiter as a Runnable
2. `InMemoryRateLimiter`: provides an in-memory implementation of a rate limiter

## Example

```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda
from datetime import datetime

foo = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    print(datetime.now().strftime("%H:%M:%S.%f"))
    return x

chain = foo | meow

for _ in range(10):
    print(chain.invoke('hello'))
```

Produces:

```
17:12:07.530151
hello
17:12:09.537932
hello
17:12:11.548375
hello
17:12:13.558383
hello
17:12:15.568348
hello
17:12:17.578171
hello
17:12:19.587508
hello
17:12:21.597877
hello
17:12:23.607707
hello
17:12:25.617978
hello
```

![image](https://github.com/user-attachments/assets/283af59f-e1e1-408b-8e75-d3910c3c44cc)

## Interface

The rate limiter uses the following interface for acquiring a token:

```python
class BaseRateLimiter(Runnable[Input, Output], abc.ABC):
    @abc.abstractmethod
    def acquire(self, *, blocking: bool = True) -> bool:
        """Attempt to acquire the necessary tokens for the rate limiter."""
```

The flag `blocking` has been added to the abstraction to allow supporting streaming (which is easier if blocking=False).

## Limitations

- The rate limiter is not designed to work across different processes. It is an in-memory rate limiter, but it is thread safe.
- The rate limiter only supports time-based rate limiting. It does not take into account the size of the request or any other factors.
- The current implementation does not handle streaming inputs well and will consume all inputs even if the rate limit has been reached. Better support for streaming inputs will be added in the future.
- When the rate limiter is combined with another runnable via a RunnableSequence, usage of .batch() or .abatch() will only respect the average rate limit. There will be bursty behavior as .batch() and .abatch() wait for each step to complete before starting the next step. One way to mitigate this is to use batch_as_completed() or abatch_as_completed().

## Bursty behavior in `batch` and `abatch`

When the rate limiter is combined with another runnable via a RunnableSequence, usage of .batch() or .abatch() will only respect the average rate limit. There will be bursty behavior as .batch() and .abatch() wait for each step to complete before starting the next step. This becomes a problem if users are using `batch` and `abatch` with many inputs (e.g., 100). In this case, there will be a burst of 100 inputs into the batch of the rate limited runnable.

1. Using a RunnableBinding. The API would look like:

```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda

rate_limiter = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    return x

rate_limited_meow = RunnableLambda(meow).with_rate_limiter(rate_limiter)
```

2. Another option is to add some init option to RunnableSequence that changes `.batch()` to be depth first (e.g., by delegating to `batch_as_completed`):

```python
RunnableSequence(first=rate_limiter, last=model, how='batch-depth-first')
```

Pros: Does not require RunnableBinding. Cons: Feels over-complicated.
… beta (langchain-ai#24667) Thank you for contributing to LangChain! - [ ] **PR title**: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] **PR message**: ***Delete this entire checklist*** and replace with - **Description:** a description of the change - **Issue:** the issue # it fixes, if applicable - **Dependencies:** any dependencies required for this change - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Description: According to this page: https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/ ChatOllama does support Tool Calling. Issue: The documentation is incorrect Dependencies: None Twitter handle: NA
Updated notebook for tool calling support in chat models
…i#24472)

### Description
- support asynchronous operations in InMemoryVectorStore
- since embeddings might be called asynchronously, ensure that both asynchronous and synchronous functions operate correctly
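The sync/async parity idea can be illustrated with a toy in-memory store (this is a self-contained sketch, not the real `InMemoryVectorStore`; in the actual class the async methods would await an async embeddings call rather than reuse precomputed vectors):

```python
import asyncio
from math import sqrt

class InMemoryStoreSketch:
    """Toy store whose async methods delegate to the sync logic,
    so both code paths return identical results."""

    def __init__(self):
        self._docs = []  # list of (text, vector) pairs

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sqrt(sum(x * x for x in a)) or 1.0
        nb = sqrt(sum(x * x for x in b)) or 1.0
        return dot / (na * nb)

    def add(self, text, vector):
        self._docs.append((text, vector))

    async def aadd(self, text, vector):
        # Delegate to the sync implementation; a real store would
        # await an asynchronous embedding call here instead.
        self.add(text, vector)

    def search(self, vector, k=1):
        ranked = sorted(self._docs,
                        key=lambda d: self._cosine(d[1], vector),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

    async def asearch(self, vector, k=1):
        # Same ranking logic as the sync path.
        return self.search(vector, k)
```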
…Ms (langchain-ai#24068) Thank you for contributing to LangChain! **Description:** This PR allows users of `langchain_community.llms.ollama.Ollama` to specify the `auth` parameter, which is then forwarded to all internal calls of `requests.request`. This works in the same way as the existing `headers` parameters. The auth parameter enables the usage of the given class with Ollama instances, which are secured by more complex authentication mechanisms, that do not only rely on static headers. An example are AWS API Gateways secured by the IAM authorizer, which expects signatures dynamically calculated on the specific HTTP request. **Issue:** Integrating a remote LLM running through Ollama using `langchain_community.llms.ollama.Ollama` only allows setting static HTTP headers with the parameter `headers`. This does not work, if the given instance of Ollama is secured with an authentication mechanism that makes use of dynamically created HTTP headers which for example may depend on the content of a given request. **Dependencies:** None **Twitter handle:** None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
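The forwarding pattern the PR describes can be sketched as follows (class and method names are illustrative, not the real `langchain_community` code; `auth` stands in for any requests-compatible auth object, e.g. a `(user, pass)` tuple or an `AuthBase` subclass that signs each outgoing request):

```python
from typing import Any, Optional

class OllamaClientSketch:
    """Store an optional ``auth`` object at construction and forward it,
    along with static headers, into the kwargs of every HTTP request."""

    def __init__(self, headers: Optional[dict] = None, auth: Any = None):
        self.headers = headers or {}
        self.auth = auth

    def _request_kwargs(self, **extra: Any) -> dict:
        kwargs = {"headers": self.headers, **extra}
        if self.auth is not None:
            # Passing auth= (rather than baking in static headers) lets the
            # auth object compute per-request signatures, e.g. AWS SigV4.
            kwargs["auth"] = self.auth
        return kwargs
```

The resulting dict would be splatted into `requests.request(method, url, **kwargs)` in a real client.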
- **Description:** Standardize BaichuanTextEmbeddings docstrings. - **Issue:** the issue langchain-ai#21983
…angchain-ai#24668) Mistral appears to have added validation for the format of its tool call IDs: `{"object":"error","message":"Tool call id was abc123 but must be a-z, A-Z, 0-9, with a length of 9.","type":"invalid_request_error","param":null,"code":null}` This breaks compatibility of messages from other providers. Here we add a function that converts any string to a Mistral-valid tool call ID, and apply it to incoming messages.
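One way to implement such a conversion is to hash the incoming ID and re-encode the digest into the allowed alphabet, so the mapping is deterministic. This is a hypothetical sketch of the technique, not the actual function added in the PR:

```python
import hashlib
import string

# Mistral requires tool call IDs matching [a-zA-Z0-9]{9}.
_ALLOWED = string.ascii_letters + string.digits

def to_mistral_tool_call_id(tool_call_id: str) -> str:
    """Deterministically map an arbitrary string to a 9-character
    alphanumeric ID acceptable to the Mistral API."""
    digest = hashlib.sha256(tool_call_id.encode("utf-8")).digest()
    # Map each of the first 9 digest bytes into the allowed alphabet.
    return "".join(_ALLOWED[b % len(_ALLOWED)] for b in digest[:9])
```

Determinism matters here: the same incoming tool call ID must always translate to the same Mistral ID so that tool results can be matched back to their calls.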
In some cases tool calls are mutated when passed through a tool.
**Issue:** Several packages are not referenced in the `providers` pages. **Fix:** Added the missed references. Fixed the notebook formatting.
Thank you for contributing to LangChain! - [ ] **PR title**: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] **PR message**: ***Delete this entire checklist*** and replace with - **Description:** a description of the change - **Issue:** the issue # it fixes, if applicable - **Dependencies:** any dependencies required for this change - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Fixes for Eden AI Custom tools and ChatEdenAI:
- add missing import in `__init__` of chat_models
- add `args_schema` to custom tools, otherwise '__arg1' would sometimes be passed to the `run` method
- fix IndexError when no human message is added in ChatEdenAI
Part of langchain-ai#22296 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Part of langchain-ai#22296 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
…chain-ai#25096) - community: Allow authorization to Confluence with bearer token - **Description:** Allow authorization to Confluence with [Personal Access Token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html) by checking for the keys `['client_id', token: ['access_token', 'token_type']]` - **Issue:** Currently the following error occurs when using a personal access token for authorization.

```python
loader = ConfluenceLoader(
    url=os.getenv('CONFLUENCE_URL'),
    oauth2={
        'token': {"access_token": os.getenv("CONFLUENCE_ACCESS_TOKEN"), "token_type": "bearer"},
        'client_id': 'client_id',
    },
    page_ids=['12345678'],
)
```

```
ValueError: Error(s) while validating input: ["You have either omitted require keys or added extra keys to the oauth2 dictionary. key values should be `['access_token', 'access_token_secret', 'consumer_key', 'key_cert']`"]
```

With this PR the loader runs as expected. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>
`python -m langchain_core.sys_info`

```bash
System Information
------------------
> OS: Linux
> OS Version: langchain-ai#44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Jun 18 14:36:16 UTC 2
> Python Version: 3.11.4 (main, Sep 25 2023, 10:06:23) [GCC 11.4.0]

Package Information
-------------------
> langchain_core: 0.2.28
> langchain: 0.2.8
> langsmith: 0.1.85
> langchain_anthropic: 0.1.20
> langchain_openai: 0.1.20
> langchain_standard_tests: 0.1.1
> langchain_text_splitters: 0.2.2
> langgraph: 0.1.19

Optional packages not installed
-------------------------------
> langserve

Other Dependencies
------------------
> aiohttp: 3.9.5
> anthropic: 0.31.1
> async-timeout: Installed. No version info available.
> defusedxml: 0.7.1
> httpx: 0.27.0
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.39.0
> orjson: 3.10.6
> packaging: 24.1
> pydantic: 2.8.2
> pytest: 7.4.4
> PyYAML: 6.0.1
> requests: 2.32.3
> SQLAlchemy: 2.0.31
> tenacity: 8.5.0
> tiktoken: 0.7.0
> typing-extensions: 4.12.2
```
langchain-ai#24915) - description: I removed the requirement that `QIANFAN_AK` exist and the default model name which langchain uses, because there is already a default model name in the underlying `qianfan` SDK powering the langchain component. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** Standardize Tongyi LLM, including:
  - docs, the issue langchain-ai#24803
  - model init arg names, the issue langchain-ai#20085
Co-authored-by: Erick Friis <erick@langchain.dev>
…allbackHandler (langchain-ai#25104) Thank you for contributing to LangChain! - [ ] **PR title**: "package: description" - Example: "community: Added bedrock 3-5 sonnet cost details for BedrockAnthropicTokenUsageCallbackHandler" - [ ] **PR message**: ***Delete this entire checklist*** and replace with - **Description:** a description of the change - **Issue:** the issue # it fixes, if applicable - **Dependencies:** any dependencies required for this change - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Naval Chand <navalchand@192.168.1.36>
…n-ai#25025) **Description:** In this PR, I am adding three stock market tools from financialdatasets.ai (my API!): - get balance sheets - get cash flow statements - get income statements Twitter handle: [@virattt](https://twitter.com/virattt) --------- Co-authored-by: Erick Friis <erick@langchain.dev>
…n-ai#25100) Support document index in the index api.
Among integration packages in libs/partners, Groq is an exception in that it errors on warnings. Following langchain-ai#25084, Groq fails with > pydantic.warnings.PydanticDeprecatedSince20: The `__fields__` attribute is deprecated, use `model_fields` instead. Deprecated in Pydantic V2.0 to be removed in V3.0. Here we update the behavior to no longer fail on warning, which is consistent with the rest of the packages in libs/partners.
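Concretely, this kind of change is usually a one-line edit to the pytest configuration. An illustrative `pyproject.toml` fragment (the actual Groq package layout and filter list may differ):

```toml
# Before: any warning escalates to an error and fails the suite.
# [tool.pytest.ini_options]
# filterwarnings = ["error"]

# After: warnings are reported but no longer fail the run,
# matching the other packages in libs/partners.
[tool.pytest.ini_options]
filterwarnings = []
```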
This PR does an aesthetic sort of the config object attributes. This will make it a bit easier to go back and forth between pydantic v1 and pydantic v2 on the 0.3.x branch
Fix word spelling error
Updated to use langchain_google_community instead, per the latest revision. Thank you for contributing to LangChain! - [ ] **PR title**: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] **PR message**: ***Delete this entire checklist*** and replace with - **Description:** a description of the change - **Issue:** the issue # it fixes, if applicable - **Dependencies:** any dependencies required for this change - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
…gpt4all_kwargs (langchain-ai#25124) - **Description:** Instantiating `GPT4AllEmbeddings` with no `gpt4all_kwargs` argument raised a `ValidationError`. Root cause: langchain-ai#21238 added the capability to pass `gpt4all_kwargs` through to the `GPT4All` instance via `Embed4All`, but broke code that did not specify a `gpt4all_kwargs` argument. - **Issue:** langchain-ai#25119 - **Dependencies:** None - **Twitter handle:** [`@metadaddy`](https://twitter.com/metadaddy)
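The shape of the fix can be sketched with a minimal config object that normalizes an omitted kwargs argument to an empty dict, so construction without it no longer fails validation (names here are illustrative, not the real pydantic model in langchain_community):

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class EmbeddingsConfigSketch:
    """Pass-through kwargs are optional: ``None`` becomes ``{}`` so
    callers who omit the argument get the pre-regression behavior."""
    model_name: str
    gpt4all_kwargs: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        if self.gpt4all_kwargs is None:
            self.gpt4all_kwargs = {}
```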
Thank you for contributing to LangChain! - [ ] **PR title**: "Documentation Update : Semantic Caching Update for Upstash" - Docs, llm caching integrations update - **Description:** Upstash supports semantic caching, and we would like to inform you about this - **Twitter handle:** You can mention eray_eroglu_ if you want to post a tweet about the PR --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>
…n-ai#25140) Relax rate limit unit tests
- **Description:** Standardize QianfanLLMEndpoint LLM,include: - docs, the issue langchain-ai#24803 - model init arg names, the issue langchain-ai#20085
Just changing gpt-3.5 to gpt-4o-mini. That's what's used in the code examples now; it just didn't get updated in the main text.