keep original input file intact #59
Walkthrough

Updates shift evaluation data persistence from overwriting the original file to writing a timestamped amended file in the output directory. Pipeline methods and logs are renamed from “updated” to “amended.” README text is aligned. No public API changes except a private method rename in EvaluationPipeline.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
actor User
participant Pipeline as EvaluationPipeline
participant Amender as Amender
participant Persist as DataPersistence
participant FS as FileSystem
User->>Pipeline: run_evaluation()
Pipeline->>Amender: amend_conversation_data(...)
Amender-->>Pipeline: amended EvaluationData
Pipeline->>Persist: save_amended_data(evaluation_data)
Persist->>FS: write YAML to output_dir/timestamped_amended.yaml
FS-->>Persist: path to amended file
Persist-->>Pipeline: amended_file_path
Pipeline-->>User: Log: "Amended data saved: <path>"
note over Persist,FS: No backup or overwrite of original file
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
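For orientation, here is a minimal sketch of the persistence step the diagram describes, pieced together from the hunks quoted in the review comments below. The helper name save_amended_copy and the exact filename pattern are assumptions for illustration, not the actual data_persistence.py implementation.

```python
# Illustrative sketch only - reconstructed from the review hunks quoted
# below, not the actual save_evaluation_data() implementation.
from datetime import datetime
from pathlib import Path

import yaml


def save_amended_copy(evaluation_data: list, output_dir: str) -> str:
    """Write amended evaluation data to a new timestamped YAML file.

    The original input file is never touched; each run produces a fresh
    file under output_dir (the filename pattern here is an assumption).
    """
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    amended_path = Path(output_dir) / f"{timestamp}_amended.yaml"
    with open(amended_path, "w", encoding="utf-8") as f:
        yaml.dump(
            [conv.model_dump() for conv in evaluation_data],  # Pydantic models -> dicts
            f,
            default_flow_style=False,
            sort_keys=False,
            allow_unicode=True,
            indent=2,
        )
    return str(amended_path)
```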
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/lightspeed_evaluation/core/output/data_persistence.py (3)
29-34: Reduce filename collision risk by including microseconds in the timestamp.

Consecutive saves within the same second can collide; include microseconds.
Apply this diff:
```diff
-    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
```
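To make the collision concrete (a standalone snippet, not project code): two saves that land in the same second format to identical filenames unless microseconds are included.

```python
from datetime import datetime

# Two "saves" in quick succession: second-resolution names almost certainly
# collide, while microsecond-resolution names almost certainly stay unique.
a = datetime.now()
b = datetime.now()
print(a.strftime("%Y%m%d_%H%M%S") == b.strftime("%Y%m%d_%H%M%S"))        # True in practice
print(a.strftime("%Y%m%d_%H%M%S_%f") == b.strftime("%Y%m%d_%H%M%S_%f"))  # False in practice
```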
36-45: Prefer safe_dump over dump for YAML.

You already serialize the Pydantic models to dicts; using safe_dump tightens the serialization guarantees.
Apply this diff:
```diff
-    yaml.dump(
+    yaml.safe_dump(
         [conv_data.model_dump() for conv_data in evaluation_data],
         f,
         default_flow_style=False,
         sort_keys=False,
         allow_unicode=True,
         indent=2,
     )
```
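The reason safe_dump tightens guarantees: SafeDumper only emits plain YAML types, so anything that slipped past model_dump() fails loudly instead of being written out with a Python-specific tag. A small standalone illustration; the NotConverted class is hypothetical:

```python
import yaml


class NotConverted:
    """Stand-in for a Pydantic model that was never passed through model_dump()."""


data = [{"conversation_id": "c1", "turns": []}, NotConverted()]

# The default Dumper serializes the stray object with a !!python/object tag.
print(yaml.dump(data))

# SafeDumper refuses it, surfacing the serialization bug early.
try:
    yaml.safe_dump(data)
except yaml.YAMLError as exc:
    print(f"safe_dump rejected it: {exc}")
```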
47-52: Use logger instead of print for consistency with pipeline logs.

Standardizing on logging avoids duplicate console prints and integrates with the configured handlers.
Apply this diff:
```diff
-    print(f"💾 Amended evaluation data saved to: {amended_data_path}")
+    logger.info("Amended evaluation data saved to: %s", amended_data_path)
     return str(amended_data_path)
@@
-    print(f"❌ Failed to save amended evaluation data: {e}")
+    logger.warning("Failed to save amended evaluation data: %s", e)
```

Add these at the top of the file (outside the changed hunk):
```python
import logging

logger = logging.getLogger(__name__)
```

src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
151-171: Handle the unsuccessful save path explicitly.

If save_evaluation_data returns None (errors are handled internally), log a warning so users aren’t left guessing.
Apply this diff:
```diff
 if amended_file:
     logger.info("Amended data saved: %s", amended_file)
-    logger.info(
+    logger.info(
         "To reuse without new API calls, disable API in the system config "
         "and pass the amended file via --eval-data. The original file remains unchanged."
     )
+else:
+    logger.warning("Amended data was not saved (see prior errors).")
```

Optional improvement: allow saving even when original_data_path is None by falling back to a base filename from config. If you want, I can propose a small change to data_persistence to support that.
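If that optional improvement is pursued, a hedged sketch of the fallback could look like the following; resolve_amended_path and default_stem are hypothetical names, not existing code in data_persistence.py:

```python
# Hypothetical sketch of the optional fallback mentioned above; the real
# save_evaluation_data() currently expects an original data path.
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_amended_path(
    output_dir: str,
    original_data_path: Optional[str],
    default_stem: str = "evaluation_data",  # assumed fallback base name
) -> Path:
    """Build the amended-file path, falling back to a default stem when
    no original path is available instead of silently returning None."""
    stem = Path(original_data_path).stem if original_data_path else default_stem
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    return Path(output_dir) / f"{stem}_amended_{timestamp}.yaml"
```

The point of the sketch is simply that a path is always produced, so the pipeline can log success or failure uniformly.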
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- src/lightspeed_evaluation/core/output/data_persistence.py (2 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/amender.py (1 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-09T14:58:10.630Z
Learnt from: asamal4
PR: lightspeed-core/lightspeed-evaluation#47
File: src/lightspeed_evaluation/pipeline/evaluation/amender.py:32-41
Timestamp: 2025-09-09T14:58:10.630Z
Learning: In the lightspeed-evaluation framework, when API is enabled, every turn should make a fresh API call regardless of whether the turn already has response or tool_calls data. This ensures consistency and fresh responses for each evaluation run.
Applied to files:
README.md
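Purely to illustrate that learning (this is not amender.py code): with the API enabled there is no short-circuit on pre-existing turn data.

```python
# Conceptual sketch of the recorded learning, not the real amender code:
# when the API is enabled, every turn triggers a fresh API call, even if
# the turn already carries response or tool_calls data from a prior run.
def amend_turn(turn: dict, api_enabled: bool, call_api) -> dict:
    if api_enabled:
        fresh = call_api(turn["query"])              # always refresh
        turn["response"] = fresh["response"]
        turn["tool_calls"] = fresh.get("tool_calls", [])
    # with the API disabled, existing turn data is left untouched
    return turn
```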
🧬 Code graph analysis (1)
src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
src/lightspeed_evaluation/core/output/data_persistence.py (1)
save_evaluation_data(14-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pydocstyle
- GitHub Check: mypy
🔇 Additional comments (2)
src/lightspeed_evaluation/pipeline/evaluation/amender.py (1)
43-43: Terminology alignment looks good.

The updated comment aligns with the “amended” terminology, and the retrieved learning (fresh API call per turn) remains satisfied.
src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
28-29: LGTM: responsibility updated to “Save amended data.”

Matches the PR objective and the persistence changes.
VladimirKadlec left a comment
Thanks for this!
nit: I'd call it "output file"/"output data" instead of "amended file"/"amended data"
LGTM
Summary by CodeRabbit
New Features
Documentation
Refactor