feat(testing): ensure consistent test behavior coverage across all implementations #14

@eddmann

Summary

Ensure all santa-lang implementations have the same level of behavioral test coverage as Comet (the primary Rust implementation).

Description

Comet serves as the reference implementation and has comprehensive test coverage for language behavior. Other implementations (Blitzen, Dasher, Donner, Vixen, Prancer) should have equivalent test suites to guarantee consistent behavior across all execution environments.

Goals

  • Audit Comet's test suite to establish the baseline coverage
  • Identify gaps in other implementations' test coverage
  • Create shared test specifications that all implementations must pass
  • Ensure edge cases are tested consistently across implementations

Proposed Approach

1. Shared Test Specifications

Create a language-agnostic test specification format (e.g., YAML/JSON) that defines:

  • Input santa-lang code
  • Expected output/result
  • Expected errors (for error handling tests)

For example:

- name: "fibonacci recursive"
  code: |
    let fib = |n| match n {
      0 => 0,
      1 => 1,
      n => fib(n - 1) + fib(n - 2)
    };
    fib(10)
  expected: 55

- name: "division by zero"
  code: "1 / 0"
  error: "division by zero"
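
To make the format concrete, here is a minimal sketch of how a runner might load and validate such a spec file. It assumes the JSON variant of the format (so only the Python standard library is needed) and the rule that every entry carries exactly one of `expected` or `error`. The names `SpecCase` and `parse_specs` are hypothetical and not part of any existing santa-lang tooling.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

# The two example specs from this issue, written as JSON.
SAMPLE = """
[
  {"name": "fibonacci recursive",
   "code": "let fib = |n| match n { 0 => 0, 1 => 1, n => fib(n - 1) + fib(n - 2) }; fib(10)",
   "expected": 55},
  {"name": "division by zero", "code": "1 / 0", "error": "division by zero"}
]
"""


@dataclass(frozen=True)
class SpecCase:
    """One shared test case (hypothetical structure)."""
    name: str
    code: str
    expected: Any = None
    error: Optional[str] = None


def parse_specs(text: str) -> list[SpecCase]:
    """Parse a JSON spec document, rejecting ambiguous entries."""
    specs = []
    for entry in json.loads(text):
        # Assumed rule: a case describes either a result or an error, never both.
        if ("expected" in entry) == ("error" in entry):
            raise ValueError(
                f"spec {entry.get('name')!r} needs exactly one of 'expected' or 'error'"
            )
        specs.append(
            SpecCase(
                name=entry["name"],
                code=entry["code"],
                expected=entry.get("expected"),
                error=entry.get("error"),
            )
        )
    return specs
```

Validating at load time means a malformed spec fails once, in one place, rather than producing confusing failures in every implementation's run.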

2. Test Categories

  • Parser tests - syntax acceptance/rejection
  • Evaluator tests - expression evaluation
  • Builtin tests - all builtin functions
  • Pattern matching tests - destructuring and match expressions
  • Sequence tests - lazy evaluation, infinite ranges
  • Runner tests - AoC DSL behavior
  • Error handling tests - error messages and recovery
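
One way to attach these categories to specs is a `category` field on each entry. The sketch below assumes such a field exists (it is not shown in the example format above) and uses hypothetical identifiers for the seven categories. Looking the value up through an enum means a misspelled category is rejected instead of silently creating a new group.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    """The seven proposed categories (string values are assumptions)."""
    PARSER = "parser"
    EVALUATOR = "evaluator"
    BUILTIN = "builtin"
    PATTERN_MATCHING = "pattern-matching"
    SEQUENCE = "sequence"
    RUNNER = "runner"
    ERROR_HANDLING = "error-handling"


def group_by_category(specs: list[dict]) -> dict[Category, list[str]]:
    """Group spec names by category; an unknown category raises ValueError."""
    groups: dict[Category, list[str]] = defaultdict(list)
    for spec in specs:
        groups[Category(spec["category"])].append(spec["name"])
    return dict(groups)
```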

3. Implementation Matrix

Track which tests pass on which implementation:

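| Test Suite | Comet | Blitzen | Dasher | Donner | Vixen | Prancer |
| --- | --- | --- | --- | --- | --- | --- |
| Parser |  | ? | ? | ? | ? | ? |
| Builtins |  | ? | ? | ? | ? | ? |
| Sequences |  | ? | ? | ? | ? | ? |
| ... | ... | ... | ... | ... | ... | ... |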
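
The matrix could be generated from runner output rather than maintained by hand. The following is a sketch only: it assumes results arrive as `results[suite][implementation] = (passed, total)` and renders any missing entry as `?`. The function name and the result shape are hypothetical.

```python
# Column order taken from the matrix in this issue.
IMPLEMENTATIONS = ["Comet", "Blitzen", "Dasher", "Donner", "Vixen", "Prancer"]


def render_matrix(results: dict[str, dict[str, tuple[int, int]]]) -> str:
    """Render pass counts per suite and implementation as a pipe table."""
    header = "| Test Suite | " + " | ".join(IMPLEMENTATIONS) + " |"
    divider = "|" + " --- |" * (len(IMPLEMENTATIONS) + 1)
    rows = [header, divider]
    for suite, by_impl in results.items():
        cells = []
        for impl in IMPLEMENTATIONS:
            if impl in by_impl:
                passed, total = by_impl[impl]
                cells.append(f"{passed}/{total}")
            else:
                cells.append("?")  # no run recorded for this implementation yet
        rows.append(f"| {suite} | " + " | ".join(cells) + " |")
    return "\n".join(rows)
```

Showing `passed/total` instead of a single tick keeps partial coverage visible, which matters while gaps are still being closed.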

Tasks

  • Export/document Comet's existing test cases
  • Define shared test specification format
  • Create test runner that can execute specs against any implementation
  • Add CI job to run shared tests against all implementations
  • Document any intentional behavioral differences between implementations
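
A rough sketch of the runner task, under stated assumptions: each implementation is wrapped in an adapter that takes santa-lang source and returns the printed result as a string, raising an exception on a runtime error. How each adapter actually invokes Comet, Blitzen, and the others (CLI, library call, and so on) is deliberately left out, since that is not specified here. `Evaluator`, `run_spec`, and `run_all` are hypothetical names.

```python
from typing import Callable

# Hypothetical adapter type: source code in, printed result out.
Evaluator = Callable[[str], str]


def run_spec(spec: dict, evaluate: Evaluator) -> bool:
    """Return True if one implementation satisfies one spec."""
    try:
        actual = evaluate(spec["code"])
    except Exception as exc:
        # An error only passes when the spec expects one with a matching message.
        return "error" in spec and spec["error"] in str(exc)
    # Assumption: results are compared by their printed representation.
    return "expected" in spec and actual == str(spec["expected"])


def run_all(
    specs: list[dict], implementations: dict[str, Evaluator]
) -> dict[str, dict[str, bool]]:
    """Run every spec against every implementation adapter."""
    return {
        impl: {spec["name"]: run_spec(spec, evaluate) for spec in specs}
        for impl, evaluate in implementations.items()
    }
```

Keeping the runner ignorant of how an implementation is invoked means adding an implementation is a matter of writing one adapter, and the same function can serve both local runs and the CI job.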

Notes

  • Vixen implements a subset of santa-lang, so some tests may be marked as "not applicable"
  • Performance-related tests may have different thresholds per implementation
  • Focus on behavioral correctness, not implementation details
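
The "not applicable" marking could be modelled as a third outcome next to pass and fail. The sketch below assumes a hypothetical `not_applicable` list on each spec naming the implementations it does not apply to. The test itself is passed in as a callable so that an excluded spec is never executed.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "n/a"


def evaluate_outcome(
    spec: dict, implementation: str, run: Callable[[], bool]
) -> Outcome:
    """Classify one spec for one implementation, skipping excluded pairs."""
    # `not_applicable` is an assumed spec field, not part of the format above.
    if implementation in spec.get("not_applicable", ()):
        return Outcome.NOT_APPLICABLE  # skipped: `run` is never called
    return Outcome.PASS if run() else Outcome.FAIL
```

Recording exclusions in the spec keeps intentional differences documented alongside the tests and distinguishable from real failures in the matrix.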
