Description
[x] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
The code given in the Single Hop Query Testset documentation does not work.
Ragas version: 0.2.5
Python version: 3.12.9
Ollama LLM used: "smollm2:1.7b-instruct-fp16"
Ollama embeddings used: same model
Code to Reproduce
Take the code in the Single Hop Query Testset documentation and make the following changes.
Since I am using an Ollama LLM, the initialization of generator_llm and generator_embeddings is changed to:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_ollama.embeddings import OllamaEmbeddings
from langchain_ollama.llms import OllamaLLM

model_name = "smollm2:1.7b-instruct-fp16"
embedding_fn = OllamaEmbeddings(model=model_name)
llm = OllamaLLM(model=model_name, temperature=0)

generator_llm = LangchainLLMWrapper(llm)
generator_embeddings = LangchainEmbeddingsWrapper(embedding_fn)
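As a quick sanity check that the local Ollama setup is reachable (this assumes the Ollama server is running and the model has already been pulled), the underlying Ollama objects can be exercised directly before wiring them into ragas:

# Optional sanity check (assumes a running local Ollama server with the model pulled).
print(llm.invoke("Say hello"))                       # plain LangChain LLM call
print(len(embedding_fn.embed_query("hello world")))  # embedding dimensionality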
Then comes the main change: rename the personas as follows:
from ragas.testset.persona import Persona

persona_first_time_flier = Persona(
    name="First time flight taker",  # <---- This line is changed
    role_description="Is flying for the first time and may feel anxious. Needs clear guidance on flight procedures, safety protocols, and what to expect throughout the journey.",
)

persona_frequent_flier = Persona(
    name="Frequently takes the flights",  # <---- This line is changed
    role_description="Travels regularly and values efficiency and comfort. Interested in loyalty programs, express services, and a seamless travel experience.",
)

persona_angry_business_flier = Persona(
    name="Exclusively tavels in Business Class",  # <---- This line is changed (note the typo in "tavels"; see Additional context)
    role_description="Demands top-tier service and is easily irritated by any delays or issues. Expects immediate resolutions and is quick to express frustration if standards are not met.",
)
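For completeness, the rest of the reproduction follows the documentation unchanged. The sketch below shows roughly how the pieces are wired together; the knowledge-graph construction is elided and the import paths and keyword arguments are reconstructed from the 0.2.x docs, so treat it as an approximation rather than a verbatim copy:

from ragas.testset import TestsetGenerator
from ragas.testset.synthesizers.single_hop.specific import SingleHopSpecificQuerySynthesizer

# kg = ...  (knowledge graph built from the sample documents exactly as in the documentation)

personas = [persona_first_time_flier, persona_frequent_flier, persona_angry_business_flier]

generator = TestsetGenerator(
    llm=generator_llm,
    embedding_model=generator_embeddings,
    knowledge_graph=kg,
    persona_list=personas,
)

# Variable name kept as it appears in the traceback below.
query_distibution = [
    (SingleHopSpecificQuerySynthesizer(llm=generator_llm), 1.0),
]

testset = generator.generate(testset_size=10, query_distribution=query_distibution)
testset.to_pandas()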
Error trace
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[16], line 1
----> 1 testset = generator.generate(testset_size=10, query_distribution=query_distibution)
2 testset.to_pandas()
File c:\Users\codechanger0\Documents\Projects\swipesmart\.venv\Lib\site-packages\ragas\testset\synthesizers\generate.py:413, in TestsetGenerator.generate(self, testset_size, query_distribution, num_personas, run_config, batch_size, callbacks, token_usage_parser, with_debugging_logs, raise_exceptions)
411 except Exception as e:
412 scenario_generation_rm.on_chain_error(e)
--> 413 raise e
414 else:
415 scenario_generation_rm.on_chain_end(
416 outputs={"scenario_sample_list": scenario_sample_list}
417 )
File c:\Users\codechanger0\Documents\Projects\swipesmart\.venv\Lib\site-packages\ragas\testset\synthesizers\generate.py:410, in TestsetGenerator.generate(self, testset_size, query_distribution, num_personas, run_config, batch_size, callbacks, token_usage_parser, with_debugging_logs, raise_exceptions)
401 exec.submit(
402 scenario.generate_scenarios,
403 n=splits[i],
(...) 406 callbacks=scenario_generation_grp,
407 )
409 try:
--> 410 scenario_sample_list: t.List[t.List[BaseScenario]] = exec.results()
411 except Exception as e:
412 scenario_generation_rm.on_chain_error(e)
...
57 if persona.name == key:
58 return persona
---> 59 raise KeyError(f"No persona found with name '{key}'")
KeyError: "No persona found with name 'Exclusively travels in Business Class'"
Expected behavior
The code should work. Essentially, we should be able to give the personas any name; the LLM should not "correct" the names or look up personas that were never defined.
Additional context
On investigating the issue, I found that the root cause is in pydantic_prompt.py -> generate_multiple().
On this line,
resp = await llm.generate(
    prompt_value,
    n=n,
    temperature=temperature,
    stop=stop,
    callbacks=prompt_cb,
)
the LLM returns persona names that differ from what was given in the prompt_value variable. The persona name provided was "Exclusively tavels in Business Class", but this LLM call returns "Exclusively travels in Business Class". Notice the misspelled word "tavels" in the code that creates the Persona object. Because of this difference, the error
KeyError: "No persona found with name 'Exclusively travels in Business Class'"
is thrown.
Note: This issue is not limited to misspellings. For example, if a persona name of "First Credit Card Applier" is given, the LLM generates a persona name of "First Time Credit Card Applicant".
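To make the failure mode concrete: the lookup shown in the traceback matches persona names by exact string equality, so any "correction" by the LLM breaks it. The snippet below is an illustration only, not ragas code; the names are those from the reproduction and the difflib cutoff is an arbitrary choice. It shows the exact match failing and a tolerant lookup (one possible mitigation) still recovering the intended persona:

import difflib

# Persona names as defined in the reproduction (note the typo in "tavels").
defined_names = [
    "First time flight taker",
    "Frequently takes the flights",
    "Exclusively tavels in Business Class",
]

# Name the LLM actually returned, with the spelling silently corrected.
returned_name = "Exclusively travels in Business Class"

# Exact-match lookup, equivalent to what the traceback shows: no hit, hence the KeyError.
assert returned_name not in defined_names

# A fuzzy lookup would still find the intended persona despite the LLM's rewording.
closest = difflib.get_close_matches(returned_name, defined_names, n=1, cutoff=0.8)
print(closest)  # ['Exclusively tavels in Business Class']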