keep original input file intact #59
Walkthrough

Updates shift evaluation data persistence from overwriting the original file to writing a timestamped amended file in the output directory. Pipeline methods and logs are renamed from “updated” to “amended.” README text is aligned. No public API changes except a private method rename in EvaluationPipeline.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
actor User
participant Pipeline as EvaluationPipeline
participant Amender as Amender
participant Persist as DataPersistence
participant FS as FileSystem
User->>Pipeline: run_evaluation()
Pipeline->>Amender: amend_conversation_data(...)
Amender-->>Pipeline: amended EvaluationData
Pipeline->>Persist: save_amended_data(evaluation_data)
Persist->>FS: write YAML to output_dir/timestamped_amended.yaml
FS-->>Persist: path to amended file
Persist-->>Pipeline: amended_file_path
Pipeline-->>User: Log: "Amended data saved: <path>"
note over Persist,FS: No backup or overwrite of original file
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
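For orientation, here is a minimal sketch of the persistence step the diagram describes, pieced together from the hunks quoted in the review comments below. The helper name save_amended_copy and the exact filename pattern are assumptions for illustration, not the actual data_persistence.py implementation.

```python
# Illustrative sketch only - reconstructed from the review hunks quoted
# below, not the actual save_evaluation_data() implementation.
from datetime import datetime
from pathlib import Path

import yaml


def save_amended_copy(evaluation_data: list, output_dir: str) -> str:
    """Write amended evaluation data to a new timestamped YAML file.

    The original input file is never touched; each run produces a fresh
    file under output_dir (the filename pattern here is an assumption).
    """
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    amended_path = Path(output_dir) / f"{timestamp}_amended.yaml"
    with open(amended_path, "w", encoding="utf-8") as f:
        yaml.dump(
            [conv.model_dump() for conv in evaluation_data],  # Pydantic models -> dicts
            f,
            default_flow_style=False,
            sort_keys=False,
            allow_unicode=True,
            indent=2,
        )
    return str(amended_path)
```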
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/lightspeed_evaluation/core/output/data_persistence.py (3)
29-34: Reduce filename collision risk by including microseconds in the timestamp.

Consecutive saves within the same second can collide; include microseconds.
Apply this diff:
```diff
-    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
```
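To make the collision concrete (a standalone snippet, not project code): two saves that land in the same second format to identical filenames unless microseconds are included.

```python
from datetime import datetime

# Two "saves" in quick succession: second-resolution names almost certainly
# collide, while microsecond-resolution names almost certainly stay unique.
a = datetime.now()
b = datetime.now()
print(a.strftime("%Y%m%d_%H%M%S") == b.strftime("%Y%m%d_%H%M%S"))        # True in practice
print(a.strftime("%Y%m%d_%H%M%S_%f") == b.strftime("%Y%m%d_%H%M%S_%f"))  # False in practice
```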
36-45: Prefer safe_dump over dump for YAML.

You already serialize the Pydantic models to dicts; using safe_dump tightens the serialization guarantees.
Apply this diff:
```diff
-    yaml.dump(
+    yaml.safe_dump(
         [conv_data.model_dump() for conv_data in evaluation_data],
         f,
         default_flow_style=False,
         sort_keys=False,
         allow_unicode=True,
         indent=2,
     )
```
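The reason safe_dump tightens guarantees: SafeDumper only emits plain YAML types, so anything that slipped past model_dump() fails loudly instead of being written out with a Python-specific tag. A small standalone illustration; the NotConverted class is hypothetical:

```python
import yaml


class NotConverted:
    """Stand-in for a Pydantic model that was never passed through model_dump()."""


data = [{"conversation_id": "c1", "turns": []}, NotConverted()]

# The default Dumper serializes the stray object with a !!python/object tag.
print(yaml.dump(data))

# SafeDumper refuses it, surfacing the serialization bug early.
try:
    yaml.safe_dump(data)
except yaml.YAMLError as exc:
    print(f"safe_dump rejected it: {exc}")
```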
47-52: Use logger instead of print for consistency with pipeline logs.

Standardizing on logging avoids duplicate console prints and integrates with the configured handlers.
Apply this diff:
```diff
-    print(f"💾 Amended evaluation data saved to: {amended_data_path}")
+    logger.info("Amended evaluation data saved to: %s", amended_data_path)
     return str(amended_data_path)
@@
-    print(f"❌ Failed to save amended evaluation data: {e}")
+    logger.warning("Failed to save amended evaluation data: %s", e)
```

Add these at the top of the file (outside the changed hunk):
```python
import logging

logger = logging.getLogger(__name__)
```

src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
151-171: Handle the unsuccessful save path explicitly.

If save_evaluation_data returns None (errors are handled internally), log a warning so users aren’t left guessing.
Apply this diff:
```diff
 if amended_file:
     logger.info("Amended data saved: %s", amended_file)
-    logger.info(
+    logger.info(
         "To reuse without new API calls, disable API in the system config "
         "and pass the amended file via --eval-data. The original file remains unchanged."
     )
+else:
+    logger.warning("Amended data was not saved (see prior errors).")
```

Optional improvement: allow saving even when original_data_path is None by falling back to a base filename from config. If you want, I can propose a small change to data_persistence to support that.
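If that optional improvement is pursued, a hedged sketch of the fallback could look like the following; resolve_amended_path and default_stem are hypothetical names, not existing code in data_persistence.py:

```python
# Hypothetical sketch of the optional fallback mentioned above; the real
# save_evaluation_data() currently expects an original data path.
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_amended_path(
    output_dir: str,
    original_data_path: Optional[str],
    default_stem: str = "evaluation_data",  # assumed fallback base name
) -> Path:
    """Build the amended-file path, falling back to a default stem when
    no original path is available instead of silently returning None."""
    stem = Path(original_data_path).stem if original_data_path else default_stem
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    return Path(output_dir) / f"{stem}_amended_{timestamp}.yaml"
```

The point of the sketch is simply that a path is always produced, so the pipeline can log success or failure uniformly.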
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- src/lightspeed_evaluation/core/output/data_persistence.py (2 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/amender.py (1 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-09T14:58:10.630Z
Learnt from: asamal4
PR: lightspeed-core/lightspeed-evaluation#47
File: src/lightspeed_evaluation/pipeline/evaluation/amender.py:32-41
Timestamp: 2025-09-09T14:58:10.630Z
Learning: In the lightspeed-evaluation framework, when API is enabled, every turn should make a fresh API call regardless of whether the turn already has response or tool_calls data. This ensures consistency and fresh responses for each evaluation run.
Applied to files:
README.md
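Purely to illustrate that learning (this is not amender.py code): with the API enabled there is no short-circuit on pre-existing turn data.

```python
# Conceptual sketch of the recorded learning, not the real amender code:
# when the API is enabled, every turn triggers a fresh API call, even if
# the turn already carries response or tool_calls data from a prior run.
def amend_turn(turn: dict, api_enabled: bool, call_api) -> dict:
    if api_enabled:
        fresh = call_api(turn["query"])              # always refresh
        turn["response"] = fresh["response"]
        turn["tool_calls"] = fresh.get("tool_calls", [])
    # with the API disabled, existing turn data is left untouched
    return turn
```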
🧬 Code graph analysis (1)
src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
src/lightspeed_evaluation/core/output/data_persistence.py (1)
save_evaluation_data(14-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: pydocstyle
- GitHub Check: mypy
🔇 Additional comments (2)
src/lightspeed_evaluation/pipeline/evaluation/amender.py (1)
43-43: Terminology alignment looks good.

The updated comment aligns with the “amended” terminology, and the retrieved learning (fresh API call per turn) remains satisfied.
src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
28-29: LGTM: responsibility updated to “Save amended data.”

Matches the PR objective and the persistence changes.
VladimirKadlec left a comment
Thanks for this!
nit: I'd call it "output file"/"output data" instead of "amended file"/"amended data"
LGTM
Summary by CodeRabbit
New Features
Documentation
Refactor