
LangChain v0.3 is not supported: TestsetGenerator raises ExceptionInRunner with LangChain v0.3 #1328

Closed
@os1ma

Description

  • I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

LangChain recently released v0.3. When using LangChain v0.3, TestsetGenerator raises an ExceptionInRunner.

As of v0.3, LangChain uses pydantic v2 internally, whereas ragas still imports from langchain_core.pydantic_v1. This mismatch might be the cause of the error.

The LangChain v0.3 migration guide is here:
https://python.langchain.com/docs/versions/v0_3/

The same mismatch can potentially cause many other errors, so ragas needs to be updated to be compatible with LangChain v0.3.

Versions

I encountered this error on Google Colab.

!python --version
Python 3.10.12
!pip install langchain-core==0.3.0 langchain-openai==0.2.0 \
    langchain-community==0.3.0 ragas==0.1.14 nest-asyncio==1.6.0

ragas 0.1.18 (the latest release) also raises the same error.
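To check whether an environment is affected, the installed versions can be inspected directly. The following is a minimal sketch using only the standard library; the helper names (`parse_major_minor`, `installed_version`, `langchain_core_uses_pydantic_v2`) are hypothetical and not part of ragas or LangChain. The 0.3 threshold is taken from the deprecation warning quoted in the error trace.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '0.3.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def langchain_core_uses_pydantic_v2(version: str) -> bool:
    """Assumption from the warning text: langchain-core >= 0.3.0 uses pydantic v2."""
    return parse_major_minor(version) >= (0, 3)


# Print the versions relevant to this report (None means not installed).
for name in ("langchain-core", "pydantic", "ragas"):
    print(name, installed_version(name))
```

If `langchain_core_uses_pydantic_v2` returns True for the installed langchain-core while ragas is at 0.1.x, the environment matches the combination reported here.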

Code to Reproduce

import os
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
import nest_asyncio
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator

documents = [Document(page_content="sample", metadata={"source": "sample"})]

for document in documents:
    document.metadata["filename"] = document.metadata["source"]

nest_asyncio.apply()

generator = TestsetGenerator.from_langchain(
    generator_llm=ChatOpenAI(model="gpt-4o-mini"),
    critic_llm=ChatOpenAI(model="gpt-4o-mini"),
    embeddings=OpenAIEmbeddings(),
)

testset = generator.generate_with_langchain_docs(
    documents,
    test_size=4,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)

Error trace

/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAIBase has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAICommon has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:1: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain_core.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.

For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. 	from pydantic.v1 import BaseModel

  from ragas.metrics._answer_correctness import AnswerCorrectness, answer_correctness
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:4: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.

For example, replace imports like: `from langchain.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. 	from pydantic.v1 import BaseModel

  from ragas.metrics._context_entities_recall import (
---------------------------------------------------------------------------
ExceptionInRunner                         Traceback (most recent call last)
<ipython-input-3-e09f8b7c4199> in <cell line: 14>()
     12 )
     13 
---> 14 testset = generator.generate_with_langchain_docs(
     15     documents,
     16     test_size=4,

2 frames
/usr/local/lib/python3.10/dist-packages/ragas/testset/docstore.py in add_nodes(self, nodes, show_progress)
    251         results = executor.results()
    252         if not results:
--> 253             raise ExceptionInRunner()
    254 
    255         for i, n in enumerate(nodes):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.

Expected behavior

The testset is generated successfully without any error.

Additional context

Based on my investigation, I found that at least the implementation of default values for the Document class and Node class in ragas/testset/docstore.py is causing incorrect behavior.

class Document(LCDocument):
    doc_id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    embedding: t.Optional[t.List[float]] = Field(default=None, repr=False)

class Node(Document):
    keyphrases: t.List[str] = Field(default_factory=list, repr=False)
    relationships: t.Dict[Direction, t.Any] = Field(default_factory=dict, repr=False)
    doc_similarity: t.Optional[float] = Field(default=None, repr=False)
    wins: int = 0

For example, the embedding field of Document is supposed to default to None. However, for some reason, the Field object itself ends up as the attribute's value instead. As a result, in the following line, the condition "n.embedding is None" evaluates to False, leading to incorrect behavior.

if n.embedding is None:

It seems that using langchain_core.pydantic_v1 is also causing issues with the values of Fields in classes that inherit from BaseModel.
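The suspected mechanism can be reproduced without ragas or LangChain. The sketch below assumes pydantic v2 is installed (it ships the v1 API under `pydantic.v1`, which is what `langchain_core.pydantic_v1` re-exported). `Base`, `MixedDocument`, and `FixedDocument` are hypothetical stand-ins for LCDocument and the ragas Document class, not the real classes.

```python
import typing as t

from pydantic import BaseModel, Field  # pydantic v2 API
from pydantic.v1 import Field as V1Field  # v1 compatibility namespace
from pydantic.v1.fields import FieldInfo as V1FieldInfo


class Base(BaseModel):
    """Stand-in for a pydantic v2 based parent class (hypothetical)."""

    page_content: str = ""


class MixedDocument(Base):
    # A v1 Field() on a v2 model: pydantic v2 does not recognize the v1
    # FieldInfo object, so it is stored as the plain default value.
    embedding: t.Optional[t.List[float]] = V1Field(default=None, repr=False)


class FixedDocument(Base):
    # Same declaration, but with the v2 Field(): the default is None.
    embedding: t.Optional[t.List[float]] = Field(default=None, repr=False)


mixed = MixedDocument()
fixed = FixedDocument()

# Inspect what each model actually holds as its default.
print("mixed:", type(mixed.embedding).__name__, "| is None:", mixed.embedding is None)
print("fixed:", type(fixed.embedding).__name__, "| is None:", fixed.embedding is None)
```

If this matches what happens inside ragas, importing Field from pydantic directly (as the deprecation warning suggests) should restore the None default, though other v1-specific code paths may need separate changes.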

Same error issue

The same error is described in the following issue.

#1319


Labels: bug (Something isn't working), module-testsetgen (Module testset generation)
