Empty responses should not be tested but should fail  #92

@zimmski

See https://github.com/symflower/eval-dev-quality/blob/3e7dc8c5beab65f5958a458a593823ba5c25698e/docs/reports/v0.4.0/openrouter_databricks_dbrx-instruct/java/java/plain.log for an example. You can see that the Java tests are executed even though there is not a single character to compile or execute.

Tasks:

  • Return an error if the response is empty. Trim whitespace from the content before checking; missing trimming might have been the problem in the first place.
  • A model that returns an empty response should not receive any additional metrics. There should be a test for that, so we make sure that such a response never leads to more points. An empty response is almost a complete failure.

Labels: enhancement (New feature or request)
