Willingness to contribute
No. I can't contribute a fix for this bug at this time.
What component(s) are affected?
Python SDK
Opik UI
Opik Server
Documentation
Opik version
Opik version: 1.0.2
Describe the problem
When calling evaluate with the built-in metrics Hallucination or AnswerRelevance, I get a JSON-format-related error and the evaluation fails.
Reproduction steps
Snippet:
# Imports (omitted from the original snippet; added here for completeness)
from opik.evaluation import evaluate
from opik.evaluation.metrics import AnswerRelevance, Hallucination

# Define the metrics
hallucination_metric = Hallucination(name="Hallucination")
answerrelevance_metric = AnswerRelevance(name="AnswerRelevance")

SWEEP_ID = "03"

# system_prompts, dataset and evaluation_task are defined earlier in the notebook
for i, prompt in enumerate(system_prompts):
    SYSTEM_PROMPT = prompt
    experiment_config = {"system_prompt": SYSTEM_PROMPT, "model": "gpt-3.5-turbo"}
    experiment_name = f"comet-chatbot-{SWEEP_ID}-{i}"
    res = evaluate(
        experiment_name=experiment_name,
        dataset=dataset,
        experiment_config=experiment_config,
        task=evaluation_task,
        scoring_metrics=[hallucination_metric, answerrelevance_metric],
    )
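If it helps to isolate the issue outside of evaluate, the metric can also be scored directly. A minimal sketch (the input/output/context values are made-up placeholders, and the keyword names are assumed from the Opik docs rather than taken from my notebook):

from opik.evaluation.metrics import Hallucination

# Minimal direct call to the metric, with placeholder values
metric = Hallucination(name="Hallucination")
result = metric.score(
    input="What is the capital of France?",
    output="The capital of France is Paris.",
    context=["France is a country in Europe. Its capital is Paris."],
)
print(result.value, result.reason)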
Error:
Evaluation: 0%| | 0/5 [00:00<?, ?it/s]OPIK: Failed to compute metric Hallucination. Score result will be marked as failed.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/opik/evaluation/metrics/llm_judges/hallucination/metric.py", line 121, in _parse_model_output
dict_content = json.loads(content)
File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/opik/evaluation/tasks_scorer.py", line 29, in _score_test_case
result = metric.score(**score_kwargs)
File "/usr/local/lib/python3.10/dist-packages/opik/evaluation/metrics/llm_judges/hallucination/metric.py", line 87, in score
return self._parse_model_output(model_output)
File "/usr/local/lib/python3.10/dist-packages/opik/evaluation/metrics/llm_judges/hallucination/metric.py", line 130, in _parse_model_output
raise exceptions.MetricComputationError(
opik.evaluation.metrics.exceptions.MetricComputationError: Failed hallucination detection
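From the traceback, _parse_model_output passes the raw model reply straight to json.loads, so this error appears whenever the reply is not a bare JSON value, for example when it is wrapped in a markdown fence. A small sketch of that mechanism, using an invented reply string rather than output captured from the failing run:

import json

# Invented example: valid JSON wrapped in a markdown fence is rejected
reply = '```json\n{"score": 0.0, "reason": "No hallucination detected."}\n```'
try:
    json.loads(reply)
except json.JSONDecodeError as err:
    print(err)  # Expecting value: line 1 column 1 (char 0)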