KaWAT contains a word analogy task for Indonesian. The raw data is stored under
the syntax and semantic directories for syntactic and semantic analogy
questions, respectively. To convert the raw data into Google's Analogy Task
format, build the dataset by invoking make (make sure Python 3.6 is in your
PATH). The build results will be stored under the build directory. Invoke
make help to see other available commands.
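As a rough illustration of the target format, the sketch below parses text laid out like Google's analogy question files, where a line starting with a colon names a section and each following line holds four whitespace-separated words. The function name parse_analogy_lines and the sample lines are assumptions made for this example; they are not part of KaWAT, and the sample is not taken from the KaWAT build output.

```python
from collections import defaultdict


def parse_analogy_lines(lines):
    """Group analogy questions by section name.

    Hypothetical helper, not part of KaWAT. A line starting with ':'
    opens a new section; every other non-empty line is expected to
    contain exactly four whitespace-separated words.
    """
    sections = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(":"):
            current = line[1:].strip()
            continue
        words = line.split()
        if current is None or len(words) != 4:
            raise ValueError("malformed line: {!r}".format(raw))
        sections[current].append(tuple(words))
    return dict(sections)


# Illustrative input only; real files would come from the build directory.
sample = [
    ": capital-common-countries",
    "Athens Greece Baghdad Iraq",
    "Athens Greece Bangkok Thailand",
]
questions = parse_analogy_lines(sample)
print(questions)
```

The same function could be applied to an open file object from the build directory, since iterating over a text file yields its lines. The sketch treats any question line that does not split into exactly four words as an error.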
This dataset is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
The code is licensed under the Apache License, Version 2.0.
If you use KaWAT in your work, please cite:
@article{kurniawan2019,
title={KaWAT: A Word Analogy Task Dataset for Indonesian},
url={http://arxiv.org/abs/1906.09912},
journal={arXiv:1906.09912 [cs]},
author={Kurniawan, Kemal},
year={2019},
month={Jun}
}