- I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
I am experiencing an issue while generating a test set in Spanish using ragas version 0.1.21. The error occurs when calling generate_with_langchain_docs(), and it appears to be related to JSON validation: the error message says the output is not in valid JSON format.
Ragas version
0.1.21
Python version
(Provide your Python version, e.g., Python 3.10.6)
Code to Reproduce
from langchain_core.documents import Document
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

def generateTestset(
    chunks: list[Document],
    embedding_function=None,
    llm_type: str = "ollama",
    llm_model: str = "mistral",
):
    generator_llm = llm_model
    critic_llm = llm_model
    embeddings = embedding_function

    if llm_type == "ollama":
        generator = TestsetGenerator.from_langchain(generator_llm, critic_llm, embeddings)
    elif llm_type == "openai":
        generator = TestsetGenerator.with_openai()

    # adapt the prompts to Spanish and save the adapted versions
    language = "spanish"
    generator.adapt(language, evolutions=[simple, reasoning, multi_context])
    generator.save(evolutions=[simple, reasoning, multi_context])

    # this is the call that raises the ValidationError below
    testset = generator.generate_with_langchain_docs(
        chunks,
        test_size=10,
        distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
    )
    final_testset = testset.to_pandas()
    return final_testset
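
For completeness, a minimal call site looks roughly like this; the OllamaEmbeddings wrapper from langchain_community and the toy documents are illustrative assumptions, my real chunks come from a document loader and splitter:

from langchain_core.documents import Document
from langchain_community.embeddings import OllamaEmbeddings

# illustrative chunks; the real ones come from a loader/splitter
docs = [
    Document(page_content="El sistema procesa los pedidos en tres etapas."),
    Document(page_content="La segunda etapa valida los datos del cliente."),
]

# assumes a local Ollama server with the mistral model pulled
testset_df = generateTestset(
    chunks=docs,
    embedding_function=OllamaEmbeddings(model="mistral"),
    llm_type="ollama",
    llm_model="mistral",
)
print(testset_df.head())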
Error trace
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Prompt
root
output in example 1 is not in valid json format: Expecting value: line 1 column 1 (char 0) (type=value_error)
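
For reference, the message in the trace is exactly what Python's json parser raises for an empty or otherwise non-JSON string, which is why I suspect the model returned plain text instead of JSON for that example:

import json

try:
    json.loads("")  # empty / non-JSON string, as in "example 1" above
except json.JSONDecodeError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)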
Expected behavior
I expected generate_with_langchain_docs() to return a valid test set without JSON-related errors.
Additional context
- I am using ollama with mistral as the model.
- The error suggests that the generated output might not be properly formatted JSON (see the sketch after this list).
- I followed the official documentation for ragas version 0.1.21.
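
To help narrow it down, running only the adaptation step in isolation should show whether the Spanish translation of the few-shot examples is where the non-JSON output appears (an illustrative sketch using the same generator object as in the reproduction above):

from pydantic.v1 import ValidationError
from ragas.testset.evolutions import simple, reasoning, multi_context

try:
    generator.adapt("spanish", evolutions=[simple, reasoning, multi_context])
except ValidationError as exc:
    # if this raises the same "output in example 1 is not in valid json format"
    # error, the problem is in the prompt adaptation step, not in generation
    print(exc)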
Any guidance on resolving this issue would be greatly appreciated!