
fix: add reference to simple scoring #1758


Merged: 2 commits, Dec 19, 2024
2 changes: 1 addition & 1 deletion docs/howtos/applications/_metrics_llm_calls.md
```diff
@@ -1,4 +1,4 @@
-## Debug LLM based metrics using tracing
+# Explain or debug LLM based metrics using tracing
 
 While evaluating using LLM based metrics, each metric may make one or more calls to the LLM. These traces are important to understand the results of the metrics and to debug any issues.
 This notebook demonstrates how to export the LLM traces and analyze them.
```
20 changes: 8 additions & 12 deletions src/ragas/metrics/_simple_criteria.py
```diff
@@ -49,9 +49,7 @@ class SingleTurnSimpleCriteriaInput(BaseModel):
 
 
 class MultiTurnSimpleCriteriaInput(BaseModel):
-    user_input: t.Optional[str] = Field(
-        description="The input to the model", default=None
-    )
+    user_input: str = Field(description="The input to the model")
     reference: t.Optional[str] = Field(
         description="The reference response", default=None
     )
```
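The class change above makes `user_input` a required field instead of an optional one defaulting to `None`. A minimal stdlib sketch of that distinction (using `dataclasses` as a stand-in for pydantic's `BaseModel`/`Field`; the class name here is hypothetical, not ragas code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MultiTurnInput:
    user_input: str                   # required: no default, must be supplied
    reference: Optional[str] = None   # still optional, defaults to None

# Supplying the required field works; reference stays None.
ok = MultiTurnInput(user_input="hello")

# Omitting the required field now fails at construction time,
# instead of silently producing user_input=None.
try:
    MultiTurnInput()
except TypeError as e:
    print("construction rejected:", e)
```

With pydantic, omitting a required field raises a `ValidationError` instead of a `TypeError`, but the effect is the same: the model can no longer be built without an input.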
```diff
@@ -172,20 +170,18 @@ async def _single_turn_ascore(
     async def _ascore(self, row: t.Dict, callbacks: Callbacks) -> float:
         assert self.llm is not None, "set LLM before use"
 
-        user_input, context, response = (
-            row["user_input"],
+        user_input, response, retrieved_contexts, reference = (
+            row.get("user_input"),
+            row.get("response"),
             row.get("retrieved_contexts"),
-            row["response"],
+            row.get("reference"),
         )
 
-        if context is not None:
-            if isinstance(context, list):
-                context = "\n".join(context)
-            user_input = f"Question: {user_input} Answer using context: {context}"
-
         prompt_input = SingleTurnSimpleCriteriaInput(
             user_input=user_input,
             response=response,
+            retrieved_contexts=retrieved_contexts,
+            reference=reference,
         )
 
         response = await self.single_turn_prompt.generate(
```
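The switch from `row["..."]` indexing to `row.get(...)` means missing fields become `None` rather than raising `KeyError`, which is what lets `retrieved_contexts` and `reference` be passed straight through as optional inputs. A minimal sketch of the new unpacking (the `unpack_row` helper is illustrative, not ragas code):

```python
def unpack_row(row: dict):
    # Mirrors the new tuple unpacking: every field is optional,
    # so absent keys yield None instead of raising KeyError.
    return (
        row.get("user_input"),
        row.get("response"),
        row.get("retrieved_contexts"),
        row.get("reference"),
    )

# A row with only user_input and response: the optional fields come back None.
row = {"user_input": "What is the capital of France?", "response": "Paris"}
user_input, response, retrieved_contexts, reference = unpack_row(row)
print(retrieved_contexts, reference)  # None None
```

Under the old code, the same row would have been fine, but the metric could never see a `reference` at all, and a missing `"response"` key would have raised `KeyError` rather than reaching the prompt with `None`.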
```diff
@@ -200,11 +196,13 @@ async def _multi_turn_ascore(
         self, sample: MultiTurnSample, callbacks: Callbacks
     ) -> float:
         assert self.llm is not None, "LLM is not set"
+        assert sample.reference is not None, "Reference is not set"
 
         interaction = sample.pretty_repr()
         prompt_input = MultiTurnSimpleCriteriaInput(
             user_input=interaction,
+            reference=sample.reference,
         )
         response = await self.multi_turn_prompt.generate(
             data=prompt_input,
```
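The multi-turn path now fails fast when no reference is set, instead of silently scoring without one. A minimal sketch of that guard in isolation (the `check_reference` helper is illustrative, not ragas code):

```python
def check_reference(reference):
    # Mirrors the added line:
    #   assert sample.reference is not None, "Reference is not set"
    assert reference is not None, "Reference is not set"
    return reference

# A sample with a reference passes through unchanged.
check_reference("The Eiffel Tower is in Paris.")

# A sample without one is rejected before any prompt is built
# or any LLM call is made.
try:
    check_reference(None)
except AssertionError as e:
    print(e)  # Reference is not set
```

Failing before the prompt is constructed saves an LLM call and surfaces the configuration error at the point where it is easiest to diagnose.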