Commit fd6359d

Notes on Indonesian transformers based on some initial experiments on POS. Need to do constituency and/or NER as well
1 parent 00ca29c commit fd6359d

1 file changed: +15 -0 lines changed

stanza/utils/training/common.py

@@ -104,6 +104,21 @@ class Mode(Enum):
     # test: 2022-03-04 INFO: fi_turku 91.36
     "fi": "TurkuNLP/bert-base-finnish-cased-v1",
 
+    # Indonesian POS experiments: dev set of GSD
+    # python3 stanza/utils/training/run_pos.py id_gsd
+    # 89.95
+    # flax-community/indonesian-roberta-large
+    # 89.78 (!)
+    # flax-community/indonesian-roberta-base
+    # 90.14
+    # indolem/indobert-base-uncased
+    # 90.21
+    # cahya/bert-base-indonesian-1.5G
+    # 90.32
+    # cahya/roberta-base-indonesian-1.5G
+    # 90.40
+    "id": "cahya/roberta-base-indonesian-1.5G",
+
     # from https://github.com/idb-ita/GilBERTo
     # annoyingly, it doesn't handle cased text
     # supposedly there is an argument "do_lower_case"
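
The comments in this hunk record one dev-set score per candidate model, and the new "id" entry matches the highest of them. As a rough illustration only (not code from the commit), the comparison can be sketched as below. It assumes each score belongs to the line directly above it, so 89.95 is treated as the plain `run_pos.py id_gsd` run with no model named; `DEV_SCORES` and `best_model` are hypothetical names introduced for this sketch.

```python
# Dev-set POS scores on id_gsd, copied from the comments in the diff.
# Assumption: each score belongs to the line directly above it, so the
# first score (89.95) is the run_pos.py run with no model named.
DEV_SCORES = {
    "(run_pos.py id_gsd, no model named)": 89.95,
    "flax-community/indonesian-roberta-large": 89.78,
    "flax-community/indonesian-roberta-base": 90.14,
    "indolem/indobert-base-uncased": 90.21,
    "cahya/bert-base-indonesian-1.5G": 90.32,
    "cahya/roberta-base-indonesian-1.5G": 90.40,
}


def best_model(scores):
    """Return the (name, score) pair with the highest dev score."""
    return max(scores.items(), key=lambda item: item[1])


name, score = best_model(DEV_SCORES)
print(name, score)
```

Under that pairing assumption, the top entry is the model the commit sets for "id", and the large RoBERTa model is the only candidate that scores below the first run, which may be what the "(!)" is flagging.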
