[EVAL] Add JGLUE Test #455

Open
@ryan-minato

Description

JGLUE is a widely used benchmark in the Japanese LLM research community. It originally consisted of five sub-tests, of which MARC-ja has since been removed at Amazon's request:

  • MARC-ja (removed)
  • JSTS
  • JNLI
  • JSQuAD
  • JCommonsenseQA

Most of these sub-tests have English counterparts, and all of them are available under the CC-BY-SA-4.0 license.

JGLUE does not have an official dataset mirror on Hugging Face, and some of the sub-tests lack community mirrors as well. I am currently processing and uploading the four sub-tests that remain accessible.

Due to performance issues with llm-jp-eval on my device, I am working on integrating the test sets used by llm-jp-eval into lighteval. If successful, this integration could greatly improve the evaluation of Japanese LLMs.

Evaluation metadata

Provide all available
