
chore: consolidate test_data models #131

Merged
asamal4 merged 1 commit into lightspeed-core:main from asamal4:consolidate-testdatamodel
Jan 8, 2026

Conversation

asamal4 (Collaborator) commented Jan 7, 2026

Description

Consolidate test cases for data models

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added new evaluation data models for improved evaluation workflows.
    • Enhanced tool-call expectations to support flexible multi-set formats with fallback handling.
  • Tests

    • Expanded and consolidated test coverage with comprehensive validation scenarios for evaluation and turn data, plus updated validation messages and error handling.


coderabbitai bot (Contributor) commented Jan 7, 2026

Walkthrough

Consolidates and expands unit tests for core data models: merges the removed tests into a single module, adds validation tests for the new public models EvaluationData and EvaluationResult, and updates TurnData.expected_tool_calls validation to a flexible multi-set format with broader edge-case coverage.
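For readers unfamiliar with the format, here is a minimal sketch of the two expected_tool_calls shapes this PR exercises. The nesting depth and the tool-call dictionary keys (tool_name, arguments) are illustrative assumptions, not taken from the diff:

tool_call = {"tool_name": "get_pods", "arguments": {"namespace": "demo"}}  # illustrative keys

# Single-set format: one set of expected tool-call sequences.
single_set = [[tool_call]]

# Multi-set format: a list of alternatives. An empty set is allowed only as a
# fallback alternative (optional tool calls), never as the sole expectation.
multi_set = [[[tool_call]], [[]]]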

Changes

  • Tests (modified & removed): tests/unit/core/models/test_data.py, tests/unit/core/models/test_data_additional.py
    Consolidated tests into test_data.py (removed test_data_additional.py); expanded TurnData tests for minimal/empty fields and flexible multi-set expected_tool_calls shapes (single set, multiple sets, empty alternatives, complex sequences, None, regex-preserving args); updated validation message assertions and renamed/restructured several tests.
  • Public data models (new tests / exposure): .../lightspeed_evaluation/core/models/data.py
    Introduces/exports EvaluationData and EvaluationResult; tests validate EvaluationData (non-empty conversation_group_id, non-empty turns, rejection of empty turns) and EvaluationResult (default values, constrained status, non-negative execution_time, turn_id=None allowed for conversation-level metrics).

Sequence Diagram(s)

(omitted — changes are test and model validations without multi-component runtime control flow)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • VladimirKadlec
  • tisnik
🚥 Pre-merge checks: ✅ 3 passed
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The title 'chore: consolidate test_data models' accurately describes the main change: consolidating test files (test_data_additional.py removed, tests merged into test_data.py) and expanding coverage for data models.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/unit/core/models/test_data.py (1)

273-286: Consider adding a positive test case for EvaluationData.

The class only tests rejection scenarios. Adding a test for successful creation with valid data would improve coverage and serve as documentation for expected usage.

💡 Suggested addition
def test_valid_evaluation_data(self):
    """Test EvaluationData with valid fields."""
    turn = TurnData(turn_id="turn1", query="Query")
    data = EvaluationData(conversation_group_id="conv1", turns=[turn])

    assert data.conversation_group_id == "conv1"
    assert len(data.turns) == 1
    assert data.turns[0].turn_id == "turn1"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a0778f5 and d6fefe1.

📒 Files selected for processing (2)
  • tests/unit/core/models/test_data.py
  • tests/unit/core/models/test_data_additional.py
💤 Files with no reviewable changes (1)
  • tests/unit/core/models/test_data_additional.py
🧰 Additional context used
📓 Path-based instructions (2)
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest mocking (mocker fixture) instead of unittest.mock

Files:

  • tests/unit/core/models/test_data.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Mirror test structure to source code organization, with test files named test_*.py and test functions/classes named test_* and Test*

Files:

  • tests/unit/core/models/test_data.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/core/output/generator.py:140-145
Timestamp: 2025-09-11T12:47:06.747Z
Learning: User asamal4 prefers that non-critical comments are sent when actual code changes are pushed, not on unrelated commits.
📚 Learning: 2025-10-31T11:54:59.126Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 90
File: src/lightspeed_evaluation/core/models/data.py:198-208
Timestamp: 2025-10-31T11:54:59.126Z
Learning: In the lightspeed_evaluation framework, the expected_tool_calls validator intentionally rejects a single empty set `[[]]` as the only alternative. This is by design: if no tool calls are expected, the tool_eval metric should not be configured for that turn. Empty sets are only valid as fallback alternatives (e.g., `[[[tool_call]], [[]]]`), representing optional tool call scenarios, not as primary or sole expectations.

Applied to files:

  • tests/unit/core/models/test_data.py
📚 Learning: 2025-09-18T23:59:37.026Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 55
File: src/lightspeed_evaluation/core/system/validator.py:146-155
Timestamp: 2025-09-18T23:59:37.026Z
Learning: In the lightspeed-evaluation project, the DataValidator in `src/lightspeed_evaluation/core/system/validator.py` is intentionally designed to validate only explicitly provided user evaluation data, not resolved metrics that include system defaults. When turn_metrics is None, the system falls back to system config defaults, and this validation separation is by design.

Applied to files:

  • tests/unit/core/models/test_data.py
📚 Learning: 2026-01-06T10:26:30.274Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 127
File: src/lightspeed_evaluation/core/api/streaming_parser.py:122-149
Timestamp: 2026-01-06T10:26:30.274Z
Learning: DataValidationError in src/lightspeed_evaluation/core/system/exceptions.py is used exclusively for data file validation scenarios (YAML parsing, Pydantic model validation, file existence checks) in validator.py, not for runtime API response validation errors.

Applied to files:

  • tests/unit/core/models/test_data.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: tests (3.13)
🔇 Additional comments (5)
tests/unit/core/models/test_data.py (5)

1-33: LGTM!

The imports are minimal and appropriate. TestTurnData covers the essential required field validations (minimal fields, empty turn_id, empty query) with correct use of pytest.raises.


36-186: LGTM!

Comprehensive coverage of expected_tool_calls validation scenarios. The tests correctly verify:

  • Format conversion (single set → multiple sets)
  • Empty alternatives as valid fallbacks (but not as sole/first alternative)
  • Rejection of invalid structures and redundant empty alternatives

This aligns with the retrieved learning that empty sets are only valid as fallback alternatives, not as primary expectations.
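A minimal sketch of the rejection case described above, assuming the import path from this PR's data.py and that pydantic's ValidationError surfaces from the field validator (the exact error message is not asserted here):

import pytest
from pydantic import ValidationError

from lightspeed_evaluation.core.models.data import TurnData  # path per the PR's data.py


def test_sole_empty_alternative_rejected():
    """A lone empty set [[]] must be rejected when it is the only alternative."""
    with pytest.raises(ValidationError):
        TurnData(turn_id="turn1", query="Query", expected_tool_calls=[[]])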


189-214: LGTM!

Format detection tests correctly verify empty list rejection and single set format detection logic.


216-271: LGTM!

Thorough validation coverage for expected_keywords including valid cases, None handling, and rejection of invalid inputs (non-list, empty inner lists, empty strings).


289-340: LGTM!

Good coverage for EvaluationResult including default values, status validation, execution time constraints, and conversation-level metrics with None turn_id. The tests appropriately verify the model's constraints.

asamal4 force-pushed the consolidate-testdatamodel branch from d6fefe1 to 8d7add5 on January 7, 2026, 14:53
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/unit/core/models/test_data.py (2)

238-238: Consider adding a None check for consistency.

Other tests in this file (e.g., lines 52, 70-71, 87-88) assert that the value is not None before accessing properties. Adding the same check here would improve test clarity and provide a better error message if the assertion fails.

♻️ Suggested improvement
+        assert turn_data.expected_keywords is not None
         assert len(turn_data.expected_keywords) == 2

308-359: Good coverage of EvaluationResult validation.

Tests correctly verify:

  • Default values for optional fields
  • Constraint on result status values
  • Non-negative execution_time enforcement
  • None is valid for turn_id (conversation-level metrics)

Consider adding a parameterized test for all valid result status values (e.g., PASS, FAIL, ERROR, SKIPPED) to ensure they're all accepted. This would complement the invalid status rejection test.
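A minimal sketch of such a parameterized test, assuming the four status strings above and illustrative EvaluationResult field names (conversation_group_id, turn_id, metric_identifier, result); adjust these to the actual model:

import pytest

from lightspeed_evaluation.core.models.data import EvaluationResult  # path per the PR's data.py


@pytest.mark.parametrize("status", ["PASS", "FAIL", "ERROR", "SKIPPED"])
def test_all_valid_result_statuses_accepted(status):
    """Every allowed status value should construct without a validation error."""
    result = EvaluationResult(
        conversation_group_id="conv1",  # field names here are illustrative assumptions
        turn_id="turn1",
        metric_identifier="ragas:response_relevancy",
        result=status,
    )
    assert result.result == status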

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6fefe1 and 8d7add5.

📒 Files selected for processing (2)
  • tests/unit/core/models/test_data.py
  • tests/unit/core/models/test_data_additional.py
💤 Files with no reviewable changes (1)
  • tests/unit/core/models/test_data_additional.py
🧰 Additional context used
📓 Path-based instructions (2)
**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest mocking (mocker fixture) instead of unittest.mock

Files:

  • tests/unit/core/models/test_data.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Mirror test structure to source code organization, with test files named test_*.py and test functions/classes named test_* and Test*

Files:

  • tests/unit/core/models/test_data.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/core/output/generator.py:140-145
Timestamp: 2025-09-11T12:47:06.747Z
Learning: User asamal4 prefers that non-critical comments are sent when actual code changes are pushed, not on unrelated commits.
📚 Learning: 2025-10-31T11:54:59.126Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 90
File: src/lightspeed_evaluation/core/models/data.py:198-208
Timestamp: 2025-10-31T11:54:59.126Z
Learning: In the lightspeed_evaluation framework, the expected_tool_calls validator intentionally rejects a single empty set `[[]]` as the only alternative. This is by design: if no tool calls are expected, the tool_eval metric should not be configured for that turn. Empty sets are only valid as fallback alternatives (e.g., `[[[tool_call]], [[]]]`), representing optional tool call scenarios, not as primary or sole expectations.

Applied to files:

  • tests/unit/core/models/test_data.py
📚 Learning: 2025-09-18T23:59:37.026Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 55
File: src/lightspeed_evaluation/core/system/validator.py:146-155
Timestamp: 2025-09-18T23:59:37.026Z
Learning: In the lightspeed-evaluation project, the DataValidator in `src/lightspeed_evaluation/core/system/validator.py` is intentionally designed to validate only explicitly provided user evaluation data, not resolved metrics that include system defaults. When turn_metrics is None, the system falls back to system config defaults, and this validation separation is by design.

Applied to files:

  • tests/unit/core/models/test_data.py
📚 Learning: 2026-01-06T10:26:30.274Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 127
File: src/lightspeed_evaluation/core/api/streaming_parser.py:122-149
Timestamp: 2026-01-06T10:26:30.274Z
Learning: DataValidationError in src/lightspeed_evaluation/core/system/exceptions.py is used exclusively for data file validation scenarios (YAML parsing, Pydantic model validation, file existence checks) in validator.py, not for runtime API response validation errors.

Applied to files:

  • tests/unit/core/models/test_data.py
🧬 Code graph analysis (1)
tests/unit/core/models/test_data.py (1)
src/lightspeed_evaluation/core/models/data.py (1)
  • TurnData (36-315)
🔇 Additional comments (5)
tests/unit/core/models/test_data.py (5)

1-11: LGTM!

Imports are clean and correctly reference the three models under test. The consolidation of EvaluationData and EvaluationResult alongside TurnData aligns with the PR objective.


13-33: LGTM!

Good coverage of TurnData required field validation. The tests correctly verify min_length=1 constraints on turn_id and query, and confirm optional fields default to None.


36-186: Comprehensive validation coverage.

Tests thoroughly exercise the expected_tool_calls validation logic including:

  • Format conversion (single → multiple sets)
  • Empty alternatives as fallback (correctly aligned with the design that empty sets are valid only as fallback alternatives, not as primary or sole expectations)
  • Rejection of invalid structures, empty sequences, and redundant empty alternatives

The error message assertions (match=...) correctly verify the specific validation paths are being triggered.


189-213: LGTM!

Format detection tests correctly validate that:

  1. Empty list [] is rejected as "the only alternative"
  2. Single-set format is converted to one alternative containing multiple sequences

This aligns with the framework's intentional design where empty sets are only valid as fallback alternatives. Based on learnings, the validation correctly rejects [[]] as the sole expectation.
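A hedged sketch of the conversion behavior in point 2, assuming the single-set input is normalized into exactly one alternative after validation (the nesting is inferred from the review text, not from the diff):

from lightspeed_evaluation.core.models.data import TurnData  # path per the PR's data.py

call = {"tool_name": "get_pods", "arguments": {}}  # illustrative keys
turn = TurnData(turn_id="turn1", query="Query", expected_tool_calls=[[call]])
# The single-set input should now be wrapped as the sole alternative.
assert turn.expected_tool_calls is not None
assert len(turn.expected_tool_calls) == 1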


273-305: Good foundational coverage for EvaluationData.

Tests correctly validate required field constraints (non-empty conversation_group_id, non-empty turns list). The test structure is clear and follows the established patterns.

VladimirKadlec (Contributor) left a comment

LGTM

asamal4 merged commit bb643b0 into lightspeed-core:main on Jan 8, 2026 (15 checks passed).
coderabbitai bot mentioned this pull request on Jan 14, 2026 (15 tasks).