langchain_unstructured.UnstructuredLoader cannot be created when using uvloop #26294

Closed as not planned
@sfitts

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code raises a ValueError when run in "uvicorn" configured to use uvloop:

from langchain_unstructured import UnstructuredLoader

UnstructuredLoader("any/legal/file.txt")

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/vantiqservicesdk.py", line 189, in __process_message
    result = await self.__invoke(procedure_name, params, is_system_request)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/vantiqservicesdk.py", line 277, in __invoke
    return await func(**params)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/aimanager/src/main/python/ai_assistant.py", line 139, in load_index_entry
    documents = await self._load_from_content(content, content_type, metadata)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/aimanager/src/main/python/ai_assistant.py", line 166, in _load_from_content
    documents = await load_from_content(content, content_type, metadata)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/aimanager/src/main/python/content_loader.py", line 80, in load_from_content
    loader = UnstructuredLoader(file=tmp_file, content_type=content_type, mode='paged',
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/langchain_unstructured/document_loaders.py", line 118, in __init__
    self.client = client or UnstructuredClient(
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/unstructured_client/sdk.py", line 54, in __init__
    self.sdk_configuration = SDKConfiguration(
  File "<string>", line 13, in __init__
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/unstructured_client/sdkconfiguration.py", line 38, in __post_init__
    self._hooks = SDKHooks()
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/unstructured_client/_hooks/sdkhooks.py", line 15, in __init__
    init_hooks(self)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/unstructured_client/_hooks/registration.py", line 28, in init_hooks
    split_pdf_hook = SplitPdfHook()
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 74, in __init__
    nest_asyncio.apply()
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/nest_asyncio.py", line 19, in apply
    _patch_loop(loop)
  File "/var/lib/jenkins/workspace/ag2rs-branch-tasklist/.gradle/python/lib/python3.10/site-packages/nest_asyncio.py", line 193, in _patch_loop
    raise ValueError('Can\'t patch loop of type %s' % type(loop))
ValueError: Can't patch loop of type <class 'uvloop.Loop'>
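The final frame above suggests that nest_asyncio only accepts loops of certain types. The snippet below is a minimal sketch of that condition using only the standard library. It assumes the check in `_patch_loop` is an `isinstance` test against `asyncio.BaseEventLoop`; that is inferred from the error message, not quoted from the nest_asyncio source. `can_nest_asyncio_patch` and `ForeignLoop` are hypothetical names used for illustration; `ForeignLoop` stands in for `uvloop.Loop`, which implements the abstract loop interface without subclassing the stock asyncio base class.

```python
import asyncio


def can_nest_asyncio_patch(loop: asyncio.AbstractEventLoop) -> bool:
    """Return True if the loop derives from the stock asyncio base loop.

    Assumption: nest_asyncio refuses any loop that is not an
    asyncio.BaseEventLoop instance, which is what the ValueError implies.
    """
    return isinstance(loop, asyncio.BaseEventLoop)


class ForeignLoop(asyncio.AbstractEventLoop):
    """Hypothetical stand-in for a third-party loop such as uvloop.Loop."""


stock_loop = asyncio.new_event_loop()
try:
    print(can_nest_asyncio_patch(stock_loop))     # True: stock asyncio loop
    print(can_nest_asyncio_patch(ForeignLoop()))  # False: foreign loop type
finally:
    stock_loop.close()
```

A check like this could be run at startup to detect the incompatible loop before any loader is constructed.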

Description

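I'm trying to use the LangChain UnstructuredLoader in code hosted by uvicorn on Linux, where uvicorn uses uvloop as its event loop by default when uvloop is installed. I expect to be able to create the loader, but I can't: the constructor builds an UnstructuredClient, whose SplitPdfHook calls nest_asyncio.apply(), and nest_asyncio cannot patch a uvloop loop.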

Note that the UnstructuredClient instance is not actually needed in this case, since we are not using the Unstructured API to partition.
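One possible workaround, sketched here and not verified against this setup, is to start uvicorn with the stock asyncio loop so that nest_asyncio sees a loop type it can patch. The `loop` option of `uvicorn.run` accepts `"asyncio"`. The application import string below is a hypothetical placeholder, and this only applies when you control how uvicorn is launched. The trade-off is giving up uvloop for the whole process.

```python
# Hypothetical import string; replace it with your own ASGI application.
APP_IMPORT_STRING = "myservice.main:app"


def uvicorn_options() -> dict:
    """Options that make uvicorn use the stock asyncio loop, not uvloop."""
    return {"loop": "asyncio"}


def main() -> None:
    # Imported lazily so this module can be loaded without uvicorn installed.
    import uvicorn

    uvicorn.run(APP_IMPORT_STRING, **uvicorn_options())
```

Call `main()` from the service entry point. The command-line equivalent is the `--loop asyncio` flag.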

System Info

System Information
------------------
> OS:  Windows
> OS Version:  10.0.19045
> Python Version:  3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]

Package Information
-------------------
> langchain_core: 0.2.38
> langchain: 0.2.16
> langchain_community: 0.2.16
> langsmith: 0.1.117
> langchain_aws: 0.1.17
> langchain_elasticsearch: 0.2.2
> langchain_google_genai: 1.0.10
> langchain_huggingface: 0.0.3
> langchain_nvidia_ai_endpoints: 0.2.2
> langchain_openai: 0.1.23
> langchain_qdrant: 0.1.1
> langchain_text_splitters: 0.2.4
> langchain_unstructured: 0.1.2

Optional packages not installed
-------------------------------
> langgraph
> langserve

Other Dependencies
------------------
> aiohttp: 3.10.5
> async-timeout: Installed. No version info available.
> boto3: 1.34.162
> dataclasses-json: 0.6.7
> elasticsearch[vectorstore-mmr]: Installed. No version info available.
> google-generativeai: 0.7.2
> httpx: 0.27.2
> huggingface-hub: 0.24.6
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.44.0
> orjson: 3.10.7
> packaging: 24.1
> pillow: 10.4.0
> pydantic: 1.10.18
> PyYAML: 6.0.2
> qdrant-client: 1.11.1
> requests: 2.32.3
> sentence-transformers: 3.0.1
> SQLAlchemy: 2.0.34
> tenacity: 8.5.0
> tiktoken: 0.7.0
> tokenizers: 0.19.1
> transformers: 4.44.2
> typing-extensions: 4.12.2
> unstructured-client: 0.24.1
> unstructured[all-docs]: Installed. No version info available.
