Commit

encoding must be UTF-8
fginter committed Apr 26, 2019
1 parent ce08840 commit a1e0659
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions _ja/index.md
@@ -16,7 +16,7 @@ udver: '2'
language phenomena in real texts.

* Many SUWs correspond to a single English word but they tend to be shorter than
-English counterparts. An example is "フランス 語" (French; French language).
+English counterparts. An example is "フランス 語" (French; French language).
For detailed definition please refer to [3] written in Japanese.

* The automatic tokenization accuracy is more than 98% on in-domain data (BCCWJ) [4].
@@ -33,9 +33,9 @@ In Proceedings of the Sixth International Conference on Language Resources and E
Kikuo Maekawa, Makoto Yamazaki, Toshinobu Ogiso, Takehiko Maruyama, Hideki Ogura, Wakako Kashino, Hanae Koiso, Masaya Yamaguchi, Makiro Tanaka, and Yasuharu Den
Language Resources and Evaluation Vol. 48 345-371, May 2014.

-[3] 『現代日本語書き言葉均衡コーパス』形態論情報規程集(上)(下)
-小椋 秀樹, 小磯 花絵, 冨士池 優美, 宮内 佐夜香, 小西 光, and 原 裕,
-独立行政法人国立国語研究所, 2011.
+[3] 『現代日本語書き言葉均衡コーパス』形態論情報規程集(上)(下)
+小椋 秀樹, 小磯 花絵, 冨士池 優美, 宮内 佐夜香, 小西 光, and 原 裕,
+独立行政法人国立国語研究所, 2011.

[4] Language Resource Addition: Dictionary or Corpus?,
Shinsuke Mori and Graham Neubig,