VectorStore.add_texts fails with iterator #26818

Open
5 tasks done
speleo3 opened this issue Sep 24, 2024 · 3 comments
Labels: 01 bug (Confirmed bug); Ɑ: core (Related to langchain-core); stale (Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed); Ɑ: vector store (Related to vector store module)

Comments


speleo3 commented Sep 24, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.embeddings import FakeEmbeddings

vectorstore = InMemoryVectorStore(FakeEmbeddings(size=10))
ids = vectorstore.add_texts(iter(["foo", "bar"]))  # <-- use iter() here
assert len(ids) == 2

Error Message and Stack Trace (if applicable)

AssertionError

Description

The type annotation of the texts parameter of VectorStore.add_texts is Iterable[str], but passing an iterator rather than a sequence behaves like passing an empty list.

def add_texts(
    self,
    texts: Iterable[str],

The fix is to replace texts with texts_ on line 104:

for text, metadata_ in zip(texts, metadatas_)
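The failure mode described above can be reproduced without LangChain. The following is a minimal sketch, assuming the method first materializes the input into texts_ and then zips over the original texts. The function names build_pairs_buggy and build_pairs_fixed are hypothetical and are not part of langchain-core; only the texts / texts_ / metadatas_ names come from the report.

```python
from itertools import cycle


def build_pairs_buggy(texts, metadatas=None):
    # Materializing the input consumes it if it is a one-shot iterator.
    texts_ = texts if isinstance(texts, (list, tuple)) else list(texts)
    metadatas_ = iter(metadatas) if metadatas else cycle([{}])
    # Zipping over the original `texts` sees an exhausted iterator.
    return [(text, metadata_) for text, metadata_ in zip(texts, metadatas_)]


def build_pairs_fixed(texts, metadatas=None):
    texts_ = texts if isinstance(texts, (list, tuple)) else list(texts)
    metadatas_ = iter(metadatas) if metadatas else cycle([{}])
    # Zipping over the materialized copy works for lists and iterators alike.
    return [(text, metadata_) for text, metadata_ in zip(texts_, metadatas_)]


buggy_from_iter = build_pairs_buggy(iter(["foo", "bar"]))
buggy_from_list = build_pairs_buggy(["foo", "bar"])
fixed_from_iter = build_pairs_fixed(iter(["foo", "bar"]))
```

With a list, both variants produce two pairs; with an iterator, only the variant that zips over texts_ does.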

System Info

System Information

OS: Linux
OS Version: #1 SMP Fri Mar 29 23:14:13 UTC 2024
Python Version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]

Package Information

langchain_core: 0.3.1
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.121
langchain_chroma: 0.1.4
langchain_huggingface: 0.1.0
langchain_openai: 0.2.0
langchain_text_splitters: 0.3.0
langchain_unstructured: 0.1.4

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.5
async-timeout: 4.0.3
chromadb: 0.5.3
dataclasses-json: 0.6.7
fastapi: 0.112.4
httpx: 0.27.2
huggingface-hub: 0.24.7
jsonpatch: 1.33
numpy: 1.26.4
openai: 1.46.0
orjson: 3.10.7
packaging: 23.2
pydantic: 2.9.2
pydantic-settings: 2.5.2
PyYAML: 6.0.2
requests: 2.32.3
sentence-transformers: 3.1.0
SQLAlchemy: 2.0.35
tenacity: 8.5.0
tiktoken: 0.7.0
tokenizers: 0.19.1
transformers: 4.44.2
typing-extensions: 4.12.2
unstructured-client: 0.25.9
unstructured[all-docs]: Installed. No version info available.

@dosubot dosubot bot added the Ɑ: vector store Related to vector store module label Sep 24, 2024
@eyurtsev
Collaborator

Please use a list. We haven't updated the type hints properly yet, but it doesn't make sense to push the pagination logic into the implementation in this case.

@eyurtsev
Collaborator

The correct fix here is to update the type signature of add_texts throughout the entire codebase so that users must pass a Sequence for texts, metadata, and ids, and to handle pagination properly so that it is not done on the implementation side.
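One possible reading of "pagination not done on the implementation side" is that the caller splits a lazy stream into list batches and hands each batch to add_texts. The sketch below illustrates that idea under stated assumptions: StubStore is a hypothetical stand-in for a vector store whose add_texts accepts a Sequence, and batched and add_in_batches are illustrative helpers, not LangChain APIs.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


class StubStore:
    """Hypothetical store: add_texts takes a Sequence and returns new ids."""

    def __init__(self) -> None:
        self.texts: List[str] = []

    def add_texts(self, texts: Sequence[str]) -> List[str]:
        start = len(self.texts)
        self.texts.extend(texts)
        return [str(i) for i in range(start, len(self.texts))]


def add_in_batches(
    store: StubStore, texts: Iterable[str], batch_size: int = 2
) -> List[str]:
    """Caller-side pagination: every call to add_texts receives a list."""
    ids: List[str] = []
    for batch in batched(texts, batch_size):
        ids.extend(store.add_texts(batch))
    return ids


store = StubStore()
ids = add_in_batches(store, iter(["foo", "bar", "baz"]), batch_size=2)
```

Because each batch is a concrete list, the store never sees a one-shot iterator, and the caller controls the batch size.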

@eyurtsev eyurtsev self-assigned this Sep 24, 2024
@eyurtsev eyurtsev added Ɑ: core Related to langchain-core 01 bug Confirmed bug labels Sep 24, 2024

dosubot bot commented Dec 24, 2024

Hi, @speleo3. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The VectorStore.add_texts method fails with an iterator, despite type annotations suggesting it should accept an Iterable[str].
  • @eyurtsev suggested using a list and noted that type hints need updating.
  • The proposed solution is to update the type signature to require a Sequence for texts, metadata, and IDs.
  • You reacted positively to this proposed solution.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of LangChain. If so, you can keep the discussion open by commenting here.
  • If there are no further updates, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 24, 2024