Study how LM Evaluation Harness works and try to implement it #231

Open
@ggerganov

Update 10 Apr 2024: #231 (comment)


It would be great to start doing this kind of quantitative analysis of ggml-based inference:

https://bellard.org/ts_server/

It looks like Fabrice evaluates the models using something called LM Evaluation Harness:

https://github.com/EleutherAI/lm-evaluation-harness

I have no idea what this is yet, but it would be nice to study it and try to integrate it here and in other ggml-based projects.
This will be a very important step toward estimating the quality of the generated output and checking whether we are on the right track.
