Description
JGLUE is a widely used benchmark in the Japanese LLM research community. It originally consisted of five sub-tests, of which MARC-ja has since been removed at Amazon's request:
- MARC-ja
- JSTS
- JNLI
- JSQuAD
- JCommonsenseQA
Most of these sub-tests have English counterparts, and all of these are available for use under the CC-BY-SA-4.0 license.
JGLUE does not have an official dataset mirror on Hugging Face, and some of the tests lack community mirrors as well. I am currently processing and uploading the four sub-tests that remain accessible.
Because llm-jp-eval has performance issues on my device, I am working on integrating the test sets used by llm-jp-eval into lighteval. If successful, this integration could greatly improve the evaluation of Japanese LLMs.
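As a rough illustration of what the integration involves, here is a minimal, framework-agnostic sketch that turns one JCommonsenseQA record into a multiple-choice prompt. It deliberately does not use the lighteval API. The field names (`question`, `choice0` to `choice4`, `label`) are assumed from the JSON files in the JGLUE repository, and `MultipleChoicePrompt`, `jcommonsenseqa_to_prompt`, and the prompt wording are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultipleChoicePrompt:
    """Hypothetical container for one multiple-choice evaluation item."""
    query: str          # full prompt text shown to the model
    choices: List[str]  # answer letters the model may pick from
    gold_index: int     # index of the correct choice


def jcommonsenseqa_to_prompt(record: dict) -> MultipleChoicePrompt:
    """Format one JCommonsenseQA record as a lettered multiple-choice prompt.

    Assumes the record has `question`, `choice0`..`choice4`, and an
    integer `label` giving the index of the correct choice.
    """
    letters = ["A", "B", "C", "D", "E"]
    options = [record[f"choice{i}"] for i in range(len(letters))]

    lines = [f"質問: {record['question']}"]
    lines += [f"{letter}. {option}" for letter, option in zip(letters, options)]
    lines.append("答え:")

    return MultipleChoicePrompt(
        query="\n".join(lines),
        choices=letters,
        gold_index=int(record["label"]),
    )
```

The returned object would still need to be mapped onto whatever document type the target evaluation framework expects; that step is left out because it depends on the framework version.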
Evaluation metadata
Provide all available
- Paper url:
- Github url: https://github.com/yahoojapan/JGLUE
- Dataset url: