
LEADS-2: Fix Path Object Serialization in Amended YAML Files #100

Merged
tisnik merged 1 commit into lightspeed-core:main from bsatapat-jpg:dev on Nov 12, 2025

Conversation


@bsatapat-jpg (Collaborator) commented Nov 11, 2025

Issue Link: https://issues.redhat.com/browse/LEADS-2

Summary by CodeRabbit

  • Improvements
    • Evaluation export now emits JSON-serialized fields within YAML while preserving Unicode and indentation for readability.
  • Tests
    • Added unit tests for evaluation data persistence covering path serialization, missing scripts, and invalid output handling.
    • Added extensive tests for Gemini embedding provider integration, config validation, and optional caching.


coderabbitai bot commented Nov 11, 2025

Warning

Rate limit exceeded

@bsatapat-jpg has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 12 minutes and 15 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7925af7 and 35ed83e.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/output/data_persistence.py (1 hunks)
  • tests/unit/core/output/test_data_persistence.py (1 hunks)

Walkthrough

Replaces YAML-friendly serialization of evaluation elements with Pydantic JSON-mode serialization (model_dump(mode="json")) before YAML dumping; preserves YAML options (allow_unicode=True, indent=2). Adds unit tests covering Path-string serialization, missing scripts, and invalid output_dir error handling.
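For orientation, a minimal sketch of this flow, with a made-up Pydantic model standing in for the repository's EvaluationData/TurnData schema:

```python
# Sketch only: a stand-in model, not the real evaluation data classes.
from pathlib import Path

import yaml
from pydantic import BaseModel


class Element(BaseModel):
    script: Path  # Path-typed field, the kind that previously leaked into YAML
    query: str


elements = [Element(script=Path("scripts/setup.sh"), query="example query")]

# mode="json" converts Path (and other non-JSON types) to plain strings up front
json_ready = [e.model_dump(mode="json") for e in elements]

with open("amended_evaluation_data.yaml", "w", encoding="utf-8") as f:
    yaml.dump(json_ready, f, allow_unicode=True, indent=2)
```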

Changes

Cohort / File(s) and summary:

• Data persistence code (src/lightspeed_evaluation/core/output/data_persistence.py)
  Convert each evaluation element via model_dump(mode="json") prior to yaml.dump; retain allow_unicode=True and indent=2 YAML settings; error handling and output-file writing flow unchanged.
• Data persistence tests (tests/unit/core/output/test_data_persistence.py)
  New tests for save_evaluation_data: ensure Path objects are serialized to strings in YAML, string paths remain strings, handle absent scripts, and return None when output_dir is not a directory.
• Gemini embedding tests (tests/unit/core/test_embedding_gemini.py)
  New tests validating Gemini provider integration in embedding config/manager flows, including provider validation, provider_kwargs forwarding, and optional disk caching (mocked external dependencies).

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Caller as save_evaluation_data
  participant EvalData as evaluation_data (list)
  participant Serializer as model_dump(mode="json")
  participant YAML as yaml.dump
  participant FS as FileSystem

  Note over Caller,EvalData: for each evaluation element
  Caller->>EvalData: iterate elements
  EvalData->>Serializer: element.model_dump(mode="json")
  Serializer-->>Caller: json-friendly dict
  Caller->>YAML: yaml.dump(list_of_json_dicts, allow_unicode=True, indent=2)
  YAML-->>FS: write timestamped amended file
  FS-->>Caller: file path / or raise OSError
  alt yaml error
    YAML-->>Caller: raise yaml.YAMLError
  end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay special attention to:
    • Correctness of model_dump(mode="json") usage across all evaluation element types.
    • Tests' expectations for serialized path formats and handling of absent scripts.
    • Any edge cases where downstream YAML consumers rely on the previous YAML-friendly representation.

Suggested reviewers

  • asamal4
  • VladimirKadlec
  • tisnik

Poem

🐰 I hop through dumps with tidy flair,
model_dump makes data fair,
Paths turn string and YAML gleams,
Indents and unicode fulfill the dreams,
A little rabbit cheers the streams 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
• Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
• Title Check: ✅ Passed (the title directly addresses the main change: fixing Path object serialization in YAML files by using JSON mode serialization)
• Docstring Coverage: ✅ Passed (docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%)

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c91d444 and 1976fe6.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/output/data_persistence.py (1 hunks)
  • tests/unit/core/output/test_data_persistence.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/lightspeed_evaluation/**

📄 CodeRabbit inference engine (AGENTS.md)

Add all new evaluation features under src/lightspeed_evaluation/ (do not add new features elsewhere)

Files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Require type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels

Files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
tests/unit/**

📄 CodeRabbit inference engine (AGENTS.md)

Place unit tests under tests/unit/ mirroring the source structure

Files:

  • tests/unit/core/output/test_data_persistence.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

Name test files as test_*.py

Files:

  • tests/unit/core/output/test_data_persistence.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for mocking (pytest-mock’s mocker), not unittest.mock
Name test functions as test_*
Name test classes as Test*
Add comprehensive tests for new features with mocked LLM calls using pytest

Files:

  • tests/unit/core/output/test_data_persistence.py
🧠 Learnings (2)
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: Applies to config/evaluation_data.yaml : Keep evaluation data in config/evaluation_data.yaml (conversation groups, turns, overrides, scripts)

Applied to files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
  • tests/unit/core/output/test_data_persistence.py
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: When adding a new feature, also add a new sample evaluation data YAML file

Applied to files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
🧬 Code graph analysis (1)
tests/unit/core/output/test_data_persistence.py (2)
src/lightspeed_evaluation/core/models/data.py (2)
  • EvaluationData (264-311)
  • TurnData (35-261)
src/lightspeed_evaluation/core/output/data_persistence.py (1)
  • save_evaluation_data (14-52)
🪛 GitHub Actions: Ruff
tests/unit/core/output/test_data_persistence.py

[error] 3-3: F401: 'tempfile' imported but unused. Remove unused import or use it in tests.


[error] 6-6: F401: 'pytest' imported but unused. Remove unused import or use it in tests.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: mypy
  • GitHub Check: tests (3.11)
  • GitHub Check: tests (3.13)
  • GitHub Check: tests (3.12)
🔇 Additional comments (5)
src/lightspeed_evaluation/core/output/data_persistence.py (1)

39-39: LGTM! Path serialization fix correctly implemented.

The use of model_dump(mode="json") ensures that Pydantic Path fields are serialized as strings rather than Python path objects, directly addressing the issue described in LEADS-2. The additional YAML formatting options (allow_unicode=True and indent=2) improve output readability.

Also applies to: 43-44
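
For context, a small standalone demonstration of the difference (hypothetical Item model, not the project's schema); the exact tag PyYAML emits for an unconverted Path depends on the dumper and Python version:

```python
from pathlib import Path

import yaml
from pydantic import BaseModel


class Item(BaseModel):
    script: Path


item = Item(script=Path("scripts/verify.sh"))

# Default mode keeps the Path object; PyYAML's default Dumper then emits a
# Python-object tag along the lines of "!!python/object/apply:pathlib.PosixPath".
print(yaml.dump(item.model_dump()))

# JSON mode converts the Path to a plain string first:
print(yaml.dump(item.model_dump(mode="json")))
# script: scripts/verify.sh
```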

tests/unit/core/output/test_data_persistence.py (4)

13-78: Excellent test coverage for Path object serialization.

This test comprehensively verifies that Path objects are correctly serialized as strings in the YAML output, including verification that no Python object artifacts (like PosixPath) remain in the file. This directly validates the fix for LEADS-2.
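
Sketched roughly, the idea looks like this; the save_evaluation_data signature and the data-building fixture are assumptions, not copies of the real test:

```python
from pathlib import Path

import yaml

from lightspeed_evaluation.core.output.data_persistence import save_evaluation_data


def test_path_objects_serialized_as_strings(tmp_path, make_evaluation_data):
    # make_evaluation_data is a hypothetical fixture that builds EvaluationData
    # with a Path-typed script field; the real test constructs the models directly.
    data = [make_evaluation_data(script=Path("scripts/setup.sh"))]

    saved_file = save_evaluation_data(data, output_dir=tmp_path)  # signature assumed

    text = Path(saved_file).read_text(encoding="utf-8")
    assert "PosixPath" not in text          # no Python object artifacts in the YAML
    assert "scripts/setup.sh" in text       # the path appears as a plain string
    loaded = yaml.safe_load(text)
    assert isinstance(loaded, list)         # parses back as plain data
```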


80-125: LGTM! Good regression test for string paths.

This test ensures that string path inputs continue to work correctly and aren't inadvertently converted or corrupted by the serialization changes.


127-157: LGTM! Good coverage for optional script fields.

This test ensures that evaluation data without script references (which are optional fields) saves correctly, providing good edge case coverage.


159-184: LGTM! Error handling test is correctly structured.

This test verifies that the function gracefully handles errors and returns None when the output directory path is invalid (pointing to a file instead of a directory). The mkdir call on line 27 of data_persistence.py will raise an OSError, which is properly caught and handled.
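
A matching sketch of that error-handling case, again with an assumed signature; pointing output_dir at an existing file makes the internal mkdir raise an OSError, which the function is expected to catch before returning None:

```python
from lightspeed_evaluation.core.output.data_persistence import save_evaluation_data


def test_returns_none_when_output_dir_is_a_file(tmp_path):
    not_a_dir = tmp_path / "occupied.txt"
    not_a_dir.write_text("a file where a directory was expected")

    result = save_evaluation_data([], output_dir=not_a_dir)  # signature assumed

    assert result is None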

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1976fe6 and 7925af7.

📒 Files selected for processing (3)
  • src/lightspeed_evaluation/core/output/data_persistence.py (1 hunks)
  • tests/unit/core/output/test_data_persistence.py (1 hunks)
  • tests/unit/core/test_embedding_gemini.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/core/output/test_data_persistence.py
🧰 Additional context used
📓 Path-based instructions (5)
tests/unit/**

📄 CodeRabbit inference engine (AGENTS.md)

Place unit tests under tests/unit/ mirroring the source structure

Files:

  • tests/unit/core/test_embedding_gemini.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

Name test files as test_*.py

Files:

  • tests/unit/core/test_embedding_gemini.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for mocking (pytest-mock’s mocker), not unittest.mock
Name test functions as test_*
Name test classes as Test*
Add comprehensive tests for new features with mocked LLM calls using pytest

Files:

  • tests/unit/core/test_embedding_gemini.py
src/lightspeed_evaluation/**

📄 CodeRabbit inference engine (AGENTS.md)

Add all new evaluation features under src/lightspeed_evaluation/ (do not add new features elsewhere)

Files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Require type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels

Files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
🧠 Learnings (2)
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: Applies to tests/**/*.py : Add comprehensive tests for new features with mocked LLM calls using pytest

Applied to files:

  • tests/unit/core/test_embedding_gemini.py
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: Applies to config/evaluation_data.yaml : Keep evaluation data in config/evaluation_data.yaml (conversation groups, turns, overrides, scripts)

Applied to files:

  • src/lightspeed_evaluation/core/output/data_persistence.py
🧬 Code graph analysis (1)
tests/unit/core/test_embedding_gemini.py (3)
src/lightspeed_evaluation/core/embedding/manager.py (2)
  • EmbeddingError (7-8)
  • EmbeddingManager (11-39)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
  • RagasEmbeddingManager (12-37)
src/lightspeed_evaluation/core/models/system.py (1)
  • EmbeddingConfig (78-114)
🪛 GitHub Actions: Ruff
tests/unit/core/test_embedding_gemini.py

[error] 6-6: F401: 'lightspeed_evaluation.core.embedding.manager.EmbeddingError' imported but unused.


[error] 81-81: F841: Local variable 'ragas_manager' is assigned to but never used.


[error] 103-103: F841: Local variable 'ragas_manager' is assigned to but never used.


[error] 130-130: F841: Local variable 'ragas_manager' is assigned to but never used.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: mypy
  • GitHub Check: tests (3.13)
  • GitHub Check: tests (3.12)
🔇 Additional comments (4)
src/lightspeed_evaluation/core/output/data_persistence.py (1)

39-39: LGTM! Proper fix for Path serialization.

The addition of mode="json" ensures that Path objects and other non-JSON-serializable types are converted to strings before YAML serialization, preventing PosixPath or WindowsPath artifacts in the output.

tests/unit/core/test_embedding_gemini.py (3)

13-37: LGTM! Well-structured configuration tests.

The test class comprehensively validates Gemini provider configuration, including valid provider acceptance, provider_kwargs handling, and invalid provider rejection.
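
Roughly what such a test could look like; the provider and provider_kwargs field names are inferred from this summary, and the rejection is assumed to surface as Pydantic's ValidationError rather than a project-specific exception:

```python
import pytest
from pydantic import ValidationError

from lightspeed_evaluation.core.models.system import EmbeddingConfig


def test_gemini_provider_accepted():
    # Field names assumed; the real EmbeddingConfig may require additional fields.
    config = EmbeddingConfig(provider="gemini", provider_kwargs={"api_key": "fake"})
    assert config.provider == "gemini"
    assert config.provider_kwargs == {"api_key": "fake"}


def test_unknown_provider_rejected():
    with pytest.raises(ValidationError):
        EmbeddingConfig(provider="definitely-not-a-provider")
```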


40-64: LGTM! Proper manager validation tests.

The tests correctly verify environment validation and configuration retention for the Gemini provider.


139-158: LGTM! Comprehensive integration test.

The test effectively verifies that all three embedding providers (huggingface, openai, gemini) work correctly, ensuring Gemini integrates well with existing providers.

asamal4 (Collaborator) left a comment

LGTM

tisnik (Contributor) left a comment

LGTM

tisnik merged commit 24f68c4 into lightspeed-core:main on Nov 12, 2025
15 checks passed
bsatapat-jpg deleted the dev branch on November 13, 2025 at 05:09