Description
- I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
LangChain recently released v0.3. When using LangChain v0.3, TestsetGenerator raises an ExceptionInRunner.
Starting with v0.3, LangChain internally uses pydantic v2. ragas, on the other hand, internally uses langchain_core.pydantic_v1. This mismatch might be the cause of the error.
The LangChain v0.3 migration guide is here:
https://python.langchain.com/docs/versions/v0_3/
Using LangChain v0.3 with ragas is likely to surface other errors as well, so ragas needs to be updated to be compatible with LangChain v0.3.
Versions
I encountered this error on Google Colab.
!python --version
Python 3.10.12
!pip install langchain-core==0.3.0 langchain-openai==0.2.0 \
langchain-community==0.3.0 ragas==0.1.14 nest-asyncio==1.6.0
ragas 0.1.18 (latest) also raises the same error.
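To confirm which versions are actually installed in a given environment, something like the following could be used. This is only a minimal sketch using the standard library; the package list and the helper name installed_versions are my own choices for illustration and not part of ragas or LangChain.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages relevant to this report (the list itself is an assumption).
PACKAGES = [
    "langchain-core",
    "langchain-openai",
    "langchain-community",
    "ragas",
    "pydantic",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

If langchain-core reports 0.3.x and pydantic reports 2.x, the environment matches the combination described in this issue.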
Code to Reproduce
import os
from google.colab import userdata
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
import nest_asyncio
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator
documents = [Document(page_content="sample", metadata={"source": "sample"})]
for document in documents:
    document.metadata["filename"] = document.metadata["source"]
nest_asyncio.apply()
generator = TestsetGenerator.from_langchain(
    generator_llm=ChatOpenAI(model="gpt-4o-mini"),
    critic_llm=ChatOpenAI(model="gpt-4o-mini"),
    embeddings=OpenAIEmbeddings(),
)
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=4,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
Error trace
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAIBase has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
warnings.warn(
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAICommon has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
warnings.warn(
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:1: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain_core.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.
For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. from pydantic.v1 import BaseModel
from ragas.metrics._answer_correctness import AnswerCorrectness, answer_correctness
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:4: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.
For example, replace imports like: `from langchain.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet. from pydantic.v1 import BaseModel
from ragas.metrics._context_entities_recall import (
---------------------------------------------------------------------------
ExceptionInRunner Traceback (most recent call last)
<ipython-input-3-e09f8b7c4199> in <cell line: 14>()
12 )
13
---> 14 testset = generator.generate_with_langchain_docs(
15 documents,
16 test_size=4,
2 frames
/usr/local/lib/python3.10/dist-packages/ragas/testset/docstore.py in add_nodes(self, nodes, show_progress)
251 results = executor.results()
252 if not results:
--> 253 raise ExceptionInRunner()
254
255 for i, n in enumerate(nodes):
ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
Expected behavior
The testset is generated successfully without any error.
Additional context
Based on my investigation, I found that at least the implementation of default values for the Document class and Node class in ragas/testset/docstore.py is causing incorrect behavior.
ragas/src/ragas/testset/docstore.py, lines 31 to 33 in c40891b
ragas/src/ragas/testset/docstore.py, lines 82 to 86 in c40891b
For example, the embedding field of Document is supposed to have None as its default value. However, for some reason, an instance of a Field object is set as the value instead. As a result, in the following section, the condition "n.embedding is None" evaluates to False, leading to incorrect behavior.
ragas/src/ragas/testset/docstore.py, line 233 in c40891b
It seems that using langchain_core.pydantic_v1 is also causing issues with the values of Fields in classes that inherit from BaseModel.
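The behavior described above can be reproduced outside ragas with a small sketch. It assumes pydantic v2 is installed (which ships the pydantic.v1 compatibility namespace) and that the Field imported from langchain_core.pydantic_v1 is the pydantic v1 one. The classes MixedDoc and FixedDoc are hypothetical stand-ins I made up for illustration; they are not the actual ragas Document or Node classes.

```python
from typing import List, Optional

from pydantic import BaseModel            # pydantic v2 model base
from pydantic import Field as V2Field     # pydantic v2 Field
from pydantic.v1 import Field as V1Field  # pydantic v1 Field (compat namespace)
from pydantic.v1.fields import FieldInfo as V1FieldInfo


class MixedDoc(BaseModel):
    # A v1 Field on a v2 model: v2 does not recognize the v1 FieldInfo,
    # so it treats the object itself as the plain default value.
    embedding: Optional[List[float]] = V1Field(default=None)


class FixedDoc(BaseModel):
    # Same declaration, but with the v2 Field: the default is really None.
    embedding: Optional[List[float]] = V2Field(default=None)


mixed = MixedDoc()
fixed = FixedDoc()

print("mixed.embedding is None:", mixed.embedding is None)
print("mixed.embedding is a v1 FieldInfo:", isinstance(mixed.embedding, V1FieldInfo))
print("fixed.embedding is None:", fixed.embedding is None)
```

If this assumption about the cause is right, switching the affected imports in ragas from langchain_core.pydantic_v1 to pydantic, as the deprecation warning in the error trace suggests, would restore the expected None default.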
Same error issue
The same error is described in the following issue.