example(dataset): add MKQA Dataset (#2090)
anda-ren authored Apr 14, 2023
1 parent 8e4876f commit 1a8eec6
Showing 4 changed files with 43 additions and 0 deletions.
27 changes: 27 additions & 0 deletions example/datasets/LLM/MKQA/README.md
@@ -0,0 +1,27 @@
---
title: The `Multilingual Knowledge Questions & Answers` Dataset
---

## The Multilingual Knowledge Questions & Answers Dataset Description

- [Homepage](https://github.com/apple/ml-mkqa)

## The `mkqa` dataset Structure

### Data Fields

Please refer to the [Hugging Face dataset card](https://huggingface.co/datasets/mkqa).
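
To check the fields locally instead of on the dataset card, one option is to print each field of a record alongside its Python type. The sketch below is only an illustration: `summarize_fields` is a hypothetical helper, and the sample record (including its field names) is a made-up stand-in, not actual MKQA data.

```python
from typing import Any, Dict


def summarize_fields(record: Dict[str, Any]) -> Dict[str, str]:
    """Map each field name in a record to the name of its Python type."""
    return {name: type(value).__name__ for name, value in record.items()}


# Hypothetical stand-in record; the field names are assumptions, and the
# real layout is documented on the dataset card.
sample_record = {
    "example_id": 1,
    "query": "a placeholder question",
    "answers": {"en": [{"text": "a placeholder answer"}]},
}

field_summary = summarize_fields(sample_record)
for name, type_name in field_summary.items():
    print(f"{name}: {type_name}")
```

With the `datasets` package installed, the same helper could presumably be passed `load_dataset("mkqa", split="train")[0]` to summarize a real record.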

## Build the `mkqa` sample dataset locally

```shell
python3 dataset.py
```

## Example

Print the features of the record at index `10` of the `mkqa` dataset.

```shell
python3 example.py
```
10 changes: 10 additions & 0 deletions example/datasets/LLM/MKQA/dataset.py
@@ -0,0 +1,10 @@
from datasets import load_dataset

from starwhale import dataset

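# Load the MKQA train split from the Hugging Face Hub.
hg_ds = load_dataset("mkqa", split="train")
# Open the Starwhale dataset named "mkqa".
sw_ds = dataset("mkqa")
# Copy every record into the Starwhale dataset.
for record in hg_ds:
    sw_ds.append(record)
# Persist the appended records, then release the handle.
sw_ds.commit()
sw_ds.close()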
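
The script copies the Hugging Face split into the Starwhale dataset by calling `append` once per record. That copy pattern can be sketched without either library installed. In the block below, `ListSink` is a hypothetical stand-in for the Starwhale dataset handle, `copy_records` is an illustrative helper, and `source` holds made-up records, so nothing here reflects the real SDK beyond the `append` call that the script itself uses.

```python
from typing import Any, Dict, Iterable, List


class ListSink:
    """Hypothetical stand-in for a dataset handle that exposes append()."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def append(self, record: Dict[str, Any]) -> None:
        self.rows.append(record)


def copy_records(source: Iterable[Dict[str, Any]], sink: ListSink) -> int:
    """Append every record from source to sink and return how many were copied."""
    count = 0
    for record in source:
        sink.append(record)
        count += 1
    return count


# Made-up records used only to exercise the loop.
source = [{"query": "q0"}, {"query": "q1"}, {"query": "q2"}]
sink = ListSink()
copied = copy_records(source, sink)
print(copied, sink.rows[1])
```

Returning the count gives a cheap sanity check that the number of copied records matches the size of the source split before committing.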
5 changes: 5 additions & 0 deletions example/datasets/LLM/MKQA/example.py
@@ -0,0 +1,5 @@
from starwhale import dataset

ds = dataset("mkqa")
row = ds[10]
print(row.features)
1 change: 1 addition & 0 deletions example/datasets/LLM/MKQA/requirements.txt
@@ -0,0 +1 @@
datasets
