We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
There was an error while loading. Please reload this page.
Having the ability for benchmark builders to create a dataset and a spec that will be read with lighteval.
A dataset, with the benchmark data A python file that defines the prompt metric etc OR a yaml file if you want to use regular metrics.