[LCORE-648] Fix processing of float('NaN') values when OutputParserException #48
Conversation
Walkthrough
Adds a math import and unified post-processing in ragas.evaluate: stores the metric result in a local variable, refines OSError handling with a composed message (special-casing errno 32 for broken-pipe/timeouts), and checks for NaN in the metric score, returning a structured "malformed output" error before the final result is returned.
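The walkthrough translates roughly into the flow below. This is a minimal sketch only: the helper name `evaluate_metric`, its parameters, and the exact error strings are illustrative assumptions, not the actual `ragas.py` code.

```python
import math


def evaluate_metric(metric_name, metric_fn, sample):
    """Sketch of the unified post-processing: OSError handling plus a NaN guard."""
    try:
        result = metric_fn(sample)  # expected shape: (score, reason)
    except OSError as e:
        err_msg = f"Ragas {metric_name} evaluation failed: {e}"
        if e.errno == 32:  # broken pipe, typically a network/LLM timeout
            err_msg = (
                f"Ragas {metric_name} evaluation failed due to broken pipe "
                f"(network/LLM timeout): {e}"
            )
        return None, err_msg

    score = result[0]
    if isinstance(score, float) and math.isnan(score):
        # RAGAS yields NaN when it swallows an OutputParserException internally
        return None, f"Ragas {metric_name} returned malformed output from the LLM"

    return result
```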
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Caller
    participant ragas_evaluate as ragas.evaluate
    participant MetricFn as MetricFn
    Caller->>ragas_evaluate: evaluate(input)
    ragas_evaluate->>MetricFn: lookup_and_compute(...)
    alt MetricFn raises OSError
        MetricFn-->>ragas_evaluate: OSError
        ragas_evaluate->>Caller: err_msg (includes timeout note if errno 32)
    else MetricFn returns result
        MetricFn-->>ragas_evaluate: result
        alt result[0] is NaN
            Note over ragas_evaluate: Guard for malformed LLM output
            ragas_evaluate->>Caller: (None, "malformed output from the LLM")
        else
            ragas_evaluate->>Caller: result
        end
    end
```
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks (5 passed)
✅ Passed checks (5 passed)
Force-pushed from bf635e2 to 8382f64
Force-pushed from 162690e to 4a138ce
Proof of Testing
With the fix: lightspeed-evaluation reaches the end and generates the final report.
Without the fix: lightspeed-evaluation does not generate the final report when RAGAS encounters the malformed LLM output.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/metrics/ragas.py (1)
105-112: Broaden network OSError handling; avoid a single errno special-case.
Consider catching other transient network errno/winerror codes (e.g., 104/110 on POSIX; 10054/10060 on Windows) and emitting a unified "network/LLM connectivity" message.
```diff
 except OSError as e:
-    err_msg = f"Ragas {metric_name} evaluation failed: {str(e)}"
-    if e.errno == 32:  # Broken pipe
-        err_msg = (
-            f"Ragas {metric_name} evaluation failed due to broken pipe "
-            f"(network/LLM timeout): {str(e)}"
-        )
-    return None, err_msg
+    err = getattr(e, "errno", None)
+    winerr = getattr(e, "winerror", None)
+    # EPIPE(32), ECONNRESET(104), ETIMEDOUT(110), EHOSTUNREACH(113), ENETUNREACH(101)
+    network_errnos = {32, 104, 110, 113, 101}
+    # Windows: WSAECONNRESET(10054), WSAETIMEDOUT(10060)
+    network_winerr = {10054, 10060}
+    if err in network_errnos or winerr in network_winerr:
+        return None, (
+            f"Ragas {metric_name} evaluation failed due to network/LLM "
+            f"connectivity issue: {str(e)}"
+        )
+    return None, f"Ragas {metric_name} evaluation failed: {str(e)}"
```
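If this suggestion is adopted, the magic numbers could also come from the standard `errno` module, which keeps the intent readable. Below is a sketch of such a helper, under the assumption that it would be called from the same `except OSError` block; the helper name is made up for illustration.

```python
import errno

# POSIX errnos commonly seen when an LLM/network connection drops
_NETWORK_ERRNOS = {
    errno.EPIPE,         # 32: broken pipe
    errno.ECONNRESET,    # 104: connection reset by peer
    errno.ETIMEDOUT,     # 110: connection timed out
    errno.ENETUNREACH,   # 101: network unreachable
    errno.EHOSTUNREACH,  # 113: no route to host
}

# Windows sockets report winerror instead of errno
_NETWORK_WINERRORS = {10054, 10060}  # WSAECONNRESET, WSAETIMEDOUT


def is_network_oserror(exc: OSError) -> bool:
    """Return True if the OSError looks like a transient network/LLM connectivity failure."""
    if getattr(exc, "errno", None) in _NETWORK_ERRNOS:
        return True
    return getattr(exc, "winerror", None) in _NETWORK_WINERRORS
```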
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/lightspeed_evaluation/core/metrics/ragas.py (3 hunks)
🔇 Additional comments (4)
src/lightspeed_evaluation/core/metrics/ragas.py (4)
3-3: Import for NaN checks — good.
This enables the downstream NaN guard. No issues.
95-98: Unifying return path via local `result` — good.
This makes room for centralized post-processing and error normalization.
124-125: Final return after checks — good.
Flow is clear and safe post-validation.
115-123: Ragas non-finite score guard in place – aggregates only append non-None scores, so no NaN/Inf can flow into statistics; no further changes required.
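The aggregation code itself is not part of this diff, so the following is only an assumed sketch of the pattern the comment describes: None results are filtered out before statistics are computed, so a metric that fails (or returns NaN and is converted to None) never reaches the standard-deviation calculation.

```python
from statistics import mean, stdev


def summarize_scores(scores):
    """Aggregate only non-None scores; a failed metric never reaches the statistics."""
    valid = [s for s in scores if s is not None]
    if len(valid) < 2:
        return None  # stdev requires at least two data points
    return {"count": len(valid), "mean": mean(valid), "stdev": stdev(valid)}
```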
The RAGAS framework returns NaN when it encounters malformed output from the LLM. The malformed output is accompanied by an OutputParserException in the logs, but this exception is caught internally. The NaN causes a later failure during the generation of statistics like standard deviation at the end of the evaluation and ultimately causes no results to be obtained from the evaluation when the malformed output is encountered by RAGAS. This commit fixes this issue by checking whether NaN was returned from RAGAS and, if so, ensuring that the `evaluate()` function returns None, as in other cases of failure. This ensures that NaN does not reach the computation of the final statistics. Resolves: lightspeed-core#44
Force-pushed from 4a138ce to 05d0880
The RAGAS framework returns `float('NaN')` when it encounters malformed output from the LLM. The malformed output is accompanied by an OutputParserException in the logs, but this exception is caught internally.

The NaN causes a later failure during the generation of statistics like standard deviation at the end of the evaluation and ultimately causes no results to be obtained from the evaluation when the malformed output is encountered by RAGAS.

This commit fixes this issue by checking whether `float('NaN')` was returned from RAGAS and, if so, ensuring that the `evaluate()` function returns None, as in other cases of failure. This ensures that NaN does not reach the computation of the final statistics.

Resolves: #44
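One background note on the check itself: NaN never compares equal to anything, including itself, so an equality test cannot detect it. The walkthrough mentions a new `math` import, presumably for exactly this kind of explicit check.

```python
import math

nan = float('NaN')
print(nan == float('NaN'))  # False: NaN is not equal to anything, even itself
print(math.isnan(nan))      # True: the reliable way to detect NaN
```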