Releases: lightspeed-core/lightspeed-evaluation

LightSpeed Evaluation v0.4.0

03 Feb 10:16
89b8ea7

What's Changed

Key Changes

  • Flexible Tool Evaluation: Configurable ordered/unordered and full/partial match modes for tool call validation
  • Classical Evaluation Metrics: Support for traditional evaluation metrics (BLEU, ROUGE, distance metrics)
  • Alternate Expected Response: Ability to set alternate ground-truth responses for static evaluation metrics
  • Eval Configuration Tracking: Evaluation configuration details now included in generated reports for better reproducibility
  • API Latency Metrics: Latency tracking and reporting for API performance analysis (for the API streaming endpoint)
  • Data Grouping: Tag-based grouping of evaluation conversations for better organization
  • Data Filtering: Filter evaluation datasets by tags and conversation IDs (CLI arguments) for targeted testing
  • Cache Warmup: New optional CLI argument to pre-warm (clear) caches before evaluation runs
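
The ordered/unordered and full/partial modes for tool call validation can be illustrated with a minimal sketch. This is not the project's implementation: the function name `tool_calls_match`, its parameters, and the `tool_name`/`arguments` keys are hypothetical, and the sketch assumes that "full" means the actual calls must be exactly the expected ones while "partial" means the expected calls only need to appear among the actual ones.

```python
from typing import Any

# Hypothetical shape of a tool call: {"tool_name": str, "arguments": dict}
ToolCall = dict[str, Any]


def tool_calls_match(
    expected: list[ToolCall],
    actual: list[ToolCall],
    *,
    ordered: bool = True,
    full_match: bool = True,
) -> bool:
    """Compare expected and actual tool calls under the chosen modes."""
    # Full match: no extra or missing calls are tolerated.
    if full_match and len(expected) != len(actual):
        return False

    if ordered:
        # Expected calls must appear in the same relative order.
        # The shared iterator makes this a subsequence check.
        remaining = iter(actual)
        return all(any(exp == act for act in remaining) for exp in expected)

    # Unordered: each expected call must consume one distinct actual call.
    unused = list(actual)
    for exp in expected:
        if exp not in unused:
            return False
        unused.remove(exp)
    return True
```

With two expected calls `a` and `b`, an actual sequence `[b, a]` fails in ordered mode and passes in unordered mode, and `[a, c, b]` fails under full match but passes under partial match.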

Full Changelog: v0.3.0...v0.4.0

LightSpeed Evaluation v0.3.0

30 Dec 18:28
0f8df44

What's Changed

Key Changes

  • Token Usage Statistics: Track and report token consumption during evaluations (both API and JudgeLLM usage)
  • Certificate Support for JudgeLLM: Configure custom certificates when connecting to Judge LLM endpoints
  • Skip on Failure: Optional config to skip remaining evaluations in a conversation group when any evaluation criterion fails
  • Optional Packages: torch and nvidia-* packages are now optional, significantly reducing install size for use cases that don't require them
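
The skip-on-failure behaviour can be sketched as follows. This is an illustration under assumptions rather than the project's code: `run_conversation`, the `skip_on_failure` parameter name, and the `PASS`/`FAIL`/`SKIPPED` labels are all hypothetical.

```python
from typing import Callable

# Hypothetical: each evaluation is a (name, zero-argument check) pair.
Check = tuple[str, Callable[[], bool]]


def run_conversation(checks: list[Check], skip_on_failure: bool = False) -> dict[str, str]:
    """Run evaluations in order, optionally skipping everything after the first failure."""
    results: dict[str, str] = {}
    failed = False
    for name, check in checks:
        if failed and skip_on_failure:
            # An earlier evaluation failed, so do not run this one.
            results[name] = "SKIPPED"
            continue
        passed = check()
        results[name] = "PASS" if passed else "FAIL"
        failed = failed or not passed
    return results
```

With the option disabled, every evaluation still runs after a failure; with it enabled, the evaluations after the first failure are recorded as skipped instead of being executed.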

Full Changelog: v0.2.0...v0.3.0

LightSpeed Evaluation v0.2.0

02 Dec 14:00
7665def

Full Changelog: v0.1.0...v0.2.0

LightSpeed Evaluation v0.1.0

10 Oct 15:12
f92850a

Full Changelog: https://github.com/lightspeed-core/lightspeed-evaluation/commits/v0.1.0