How to evaluate on a smaller split of the data? #8

@armsp

Thank you so very much for this project!
I am currently using your framework to evaluate some models I have trained. However, I do not need all 26529 samples for my evaluation, and I have not yet figured out how to make it run faster; a full run takes quite a long time.

I understand that there is a --split parameter, which seems to take the value SuperGPQA-all. I filtered the dataset down to the fields I am interested in, saved the result as a new data file named supergpqa-mmp.jsonl, and passed supergpqa-mmp to --split, but that did not work.
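
For reference, a minimal sketch of the kind of filtering step described above. It assumes each line of the source JSONL is a JSON object with `field` and `discipline` keys; the function name `filter_jsonl` and its parameters are hypothetical and not part of the framework. It only produces the smaller file and says nothing about how `--split` resolves a file name.

```python
import json
from pathlib import Path


def filter_jsonl(src, dst, keep_fields=(), keep_disciplines=()):
    """Copy records whose `field` or `discipline` matches into a new JSONL file.

    Assumes one JSON object per line with `field` / `discipline` keys
    (an assumption about the data layout). Returns the number of records written.
    """
    keep_fields = set(keep_fields)
    keep_disciplines = set(keep_disciplines)
    written = 0
    with Path(src).open(encoding="utf-8") as fin, \
            Path(dst).open("w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Keep the record if either key matches one of the requested values.
            if (record.get("field") in keep_fields
                    or record.get("discipline") in keep_disciplines):
                fout.write(json.dumps(record, ensure_ascii=False) + "\n")
                written += 1
    return written
```

A call such as `filter_jsonl("SuperGPQA-all.jsonl", "supergpqa-mmp.jsonl", keep_fields=["Mathematics"])` would write only the matching records; the source file name and the field value here are placeholders.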

Could you please tell me how to run the evaluation/inference on just the subset of fields or disciplines that I am interested in?
