Commit

resolve #213, resolve #217, contributes to #216 #219
leoalenc committed Mar 15, 2023
1 parent 67a4c78 commit a748467
Showing 4 changed files with 592 additions and 358 deletions.
32 changes: 16 additions & 16 deletions data/corpus/universal-dependencies/yrl_complin-ud-test.conllu
@@ -300,7 +300,7 @@
2 ti ti PART NEG Polarity=Neg 3 advmod _ TokenRange=4:6
3 asasá sasá VERB V Number=Sing|Person=1|VerbForm=Fin 0 root _ TokenRange=7:12
4 i i PRON PRON2 Case=Gen|PronType=Prs 5 expl _ TokenRange=13:14
5 puxí puxí ADV ADV _ 3 advmod _ TokenRange=15:19
5 puxí puxí ADV ADVA AdvType=Man 3 advmod _ TokenRange=15:19
6 mayé mayé ADV ADVLA AdvType=Man|PronType=Rel 3 advmod _ TokenRange=20:24
7 aintá aintá PRON PRON Number=Plur|Person=3|PronType=Prs 8 nsubj _ TokenRange=25:30
8 umbeú mbeú VERB V Person=3|VerbForm=Fin 6 advcl:relcl _ SpaceAfter=No|TokenRange=31:36
@@ -552,7 +552,7 @@
# sent_id = MooreFP1994:1:11:38
# text = Aikwé awá ururi indé u reyuri-putari tẽ ne rupí?
# text_eng = Was there anybody to bring you or did you yourself want to come?
# text_por = Havia alguém para trazer você ou você mesmo queria vir?
# text_por = Havia alguém para trazer você ou você queria mesmo vir?
# text_source = p. 112
# text_orig = aikwé áwa urúi ĩndé o reyúwi putái te nerupí...
1 Aikwé aikwé PART EXST PartType=Exs 3 advmod _ TokenRange=0:5
@@ -1467,7 +1467,7 @@
11 irumu irumu ADP ADP _ 10 case _ TokenRange=62:67
12 ukwawa kwawa VERB V Person=3|VerbForm=Fin 9 ccomp _ TokenRange=68:74
13 arama arama SCONJ SCONJ _ 12 mark _ TokenRange=75:80
14 mayesawa mayesawa ADV ADVR _ 12 ccomp _ TokenRange=81:89
14 mayesawa mayesawa ADV ADVRA AdvType=Man|PronType=Int 12 ccomp _ TokenRange=81:89
15 taá taá PART CQ PartType=Int 14 advmod _ TokenRange=90:93
16 kwá kwá PRON DEMX Deixis=Prox|Number=Sing|PronType=Dem 14 nsubj _ TokenRange=94:97
17 umbeú mbeú VERB V Person=3|VerbForm=Fin 16 acl:relcl _ TokenRange=98:103
@@ -3701,14 +3701,14 @@
# text = Ape tẽ uikú nhaã iwitera makití asú-putari wirandé.
# text_eng = Right there is the mountain to which I want to go tomorrow.
# text_por = Lá mesmo está a serra à qual eu quero ir amanhã.
# text_source =
# text_source = Avila 2021
# text_orig = Apé tẽ uikú nhaã iwitera makití asú-putari wirandé.
1 Ape ape ADV ADVD PronType=Dem 0 root _ TokenRange=0:3
2 tẽ tẽ PART FOC Foc=Yes|PartType=Emp 1 advmod _ TokenRange=4:6
3 uikú ikú AUX COP Person=3|VerbForm=Fin 1 cop _ TokenRange=7:11
4 nhaã nhaã DET DEMS Deixis=Remt|Number=Sing|PronType=Dem 5 det _ TokenRange=12:16
5 iwitera iwitera NOUN N Number=Sing 1 nsubj _ TokenRange=17:24
6 makití makití ADV ADVL PronType=Rel 7 advmod _ TokenRange=25:31
6 makití makití ADV ADVLC AdvType=Loc|PronType=Rel 7 advmod _ TokenRange=25:31
7-8 asú-putari _ _ _ _ _ _ _ TokenRange=32:42
7 asú sú VERB V Number=Sing|Person=1|VerbForm=Fin 5 acl:relcl _ _
8 putari putari AUX AUXN Compound=Yes|VerbForm=Inf 7 aux _ _
@@ -3718,12 +3718,12 @@
# sent_id = Avila2021:0:0:160
# text = Yamanú ramé, makití yasú?
# text_eng = When we die, where do we go?
# text_por = Quando morremos, para onde vamos? [corr. transl.]
# text_por = Quando morremos, para onde vamos? [Corr. transl.]
# text_source = Aguiar, 33, adap.
1 Yamanú manú VERB V Number=Plur|Person=1|VerbForm=Fin 5 advcl _ TokenRange=0:6
2 ramé ramé SCONJ SCONJ _ 1 mark _ SpaceAfter=No|TokenRange=7:11
3 , , PUNCT PUNCT _ 1 punct _ TokenRange=11:12
4 makití makití ADV ADVR _ 1 advmod _ TokenRange=13:19
4 makití makití ADV ADVRC AdvType=Man|PronType=Int 1 advmod _ TokenRange=13:19
5 yasú sú VERB V Number=Plur|Person=1|VerbForm=Fin 0 root _ SpaceAfter=No|TokenRange=20:24
6 ? ? PUNCT PUNCT _ 5 punct _ SpaceAfter=No|TokenRange=24:25

@@ -3734,7 +3734,7 @@
# text_source = Amorim, 182, adap.
1 Apekatú apekatú ADV ADV _ 8 advmod _ SpaceAfter=No|TokenRange=0:7
2 , , PUNCT PUNCT _ 5 punct _ TokenRange=7:8
3 makití makití ADV ADVL PronType=Rel 5 advmod _ TokenRange=9:15
3 makití makití ADV ADVLC AdvType=Loc|PronType=Rel 5 advmod _ TokenRange=9:15
4 kurasí kurasí NOUN N Number=Sing 5 nsubj _ TokenRange=16:22
5 uyenú yenú VERB V Person=3|VerbForm=Fin 1 acl:relcl _ SpaceAfter=No|TokenRange=23:28
6 , , PUNCT PUNCT _ 5 punct _ TokenRange=28:29
@@ -3810,7 +3810,7 @@
# text_source = Rodrigues, 149, adap.
1 ― ― PUNCT PUNCT _ 3 punct _ TokenRange=-1:0
2 Ti ti PART NEG PartType=Neg|Polarity=Neg 3 advmod _ TokenRange=1:3
3 makití makití ADV ADVL _ 0 root _ SpaceAfter=No|TokenRange=4:10
3 makití makití ADV ADVRC AdvType=Loc|PronType=Int 0 root _ SpaceAfter=No|TokenRange=4:10
4 . . PUNCT PUNCT _ 3 punct _ SpaceAfter=No|TokenRange=10:11

# sent_id = Avila2021:0:0:167
@@ -4608,7 +4608,7 @@
10 rumasá tumasá NOUN N Number=Sing|Rel=Cont 2 obl _ _
11 pe upé ADP ADP Clitic=Yes 10 case _ _
12 , , PUNCT PUNCT _ 15 punct _ TokenRange=58:59
13 masuí masuí ADV ADVL PronType=Rel 15 advmod _ TokenRange=60:65
13 masuí masuí ADV ADVLC AdvType=Loc|PronType=Rel 15 advmod _ TokenRange=60:65
14 kariwa-itá kariwa NOUN N Number=Plur 15 nsubj _ TokenRange=66:76
15 umusãi musãi VERB V Person=3|VerbForm=Fin 10 acl:relcl _ TokenRange=77:83
16 aintá aintá PRON PRON Number=Plur|Person=3|PronType=Prs 15 obj _ TokenRange=84:89
@@ -8056,7 +8056,7 @@
1 Se se PRON PRON2 Case=Gen|Number=Sing|Person=1|Poss=Yes|PronType=Prs 2 nmod:poss _ TokenRange=0:2
2 iwá-itá iwá NOUN N Number=Plur 0 root _ SpaceAfter=No|TokenRange=3:10
3 , , PUNCT PUNCT _ 5 punct _ TokenRange=10:11
4 masuí masuí ADV ADVL PronType=Rel 5 advmod _ TokenRange=12:17
4 masuí masuí ADV ADVLC AdvType=Loc|PronType=Rel 5 advmod _ TokenRange=12:17
5 usinhĩ sinhĩ VERB V Person=3|VerbForm=Fin 2 acl:relcl _ TokenRange=18:24
6 kurí kurí PART FUT Tense=Fut 5 advmod _ TokenRange=25:29
7 amú-itá amú PRON IND Number=Plur|PronType=Ind 5 nsubj _ TokenRange=30:37
@@ -8100,7 +8100,7 @@
1 Mikura mikura NOUN N Number=Sing 3 nsubj _ TokenRange=0:6
2 usú sú AUX AUXFR Person=3|VerbForm=Fin 3 aux _ TokenRange=7:10
3 uyenú yenú VERB V Person=3|VerbForm=Fin 0 root _ TokenRange=11:16
4 marupí marupí ADV ADVR PronType=Int 3 advmod _ TokenRange=17:23
4 marupí marupí ADV ADVRC AdvType=Loc|PronType=Int 3 advmod _ TokenRange=17:23
5 apigawa apigawa NOUN N Number=Sing 6 nsubj _ TokenRange=24:31
6 usasá sasá VERB V Person=3|VerbForm=Fin 4 acl:relcl _ TokenRange=32:37
7 arama arama SCONJ SCONJ _ 6 mark _ TokenRange=38:43
@@ -8141,7 +8141,7 @@
# text_source = Avila 2021
1 Nhaã nhaã DET DEMS Deixis=Remt|Number=Sing|PronType=Dem 2 det _ TokenRange=0:4
2 iwité iwité NOUN N Number=Sing 6 nsubj _ TokenRange=5:10
3 marupí marupí ADV ADVL AdvType=Loc|PronType=Rel 4 advmod _ TokenRange=11:17
3 marupí marupí ADV ADVLC AdvType=Loc|PronType=Rel 4 advmod _ TokenRange=11:17
4 yasasá sasá VERB V Number=Plur|Person=1|VerbForm=Fin 2 acl:relcl _ TokenRange=18:24
5 kwesé kwesé ADV ADV _ 4 advmod _ TokenRange=25:30
6 apekatú apekatú ADV ADVC AdvType=Loc 0 root _ TokenRange=31:38
@@ -8180,7 +8180,7 @@
3 umunhã munhã VERB V Person=3|VerbForm=Fin 0 root _ TokenRange=10:16
4 yepé yepé DET ART Definite=Ind|PronType=Art 5 det _ TokenRange=17:21
5 barraca barraca NOUN N Number=Sing 3 obj _ OrigLang=pt|TokenRange=22:29
6 mamé mamé ADV ADVL AdvType=Loc|PronType=Rel 8 advmod _ TokenRange=30:34
6 mamé mamé ADV ADVLC AdvType=Loc|PronType=Rel 8 advmod _ TokenRange=30:34
7 aintá aintá PRON PRON Number=Plur|Person=3|PronType=Prs 8 nsubj _ TokenRange=35:40
8 ukiri kiri VERB V Person=3|VerbForm=Fin 5 acl:relcl _ TokenRange=41:46
9 arama arama SCONJ SCONJ _ 8 mark _ SpaceAfter=No|TokenRange=47:52
@@ -9170,7 +9170,7 @@
4 piá piá NOUN N Number=Sing 2 nsubj _ SpaceAfter=No|TokenRange=17:20
5 , , PUNCT PUNCT _ 9 punct _ TokenRange=20:21
6 asuí asuí CCONJ CCONJ _ 9 cc _ TokenRange=22:26
7 maita maita ADV ADVR PronType=Int 9 advmod _ TokenRange=27:32
7 maita maita ADV ADVRA AdvType=Man|PronType=Int 9 advmod _ TokenRange=27:32
8 kurí kurí PART FUT Tense=Fut 9 advmod _ TokenRange=33:37
9 ambeú mbeú VERB V Number=Sing|Person=1|VerbForm=Fin 2 conj _ SpaceAfter=No|TokenRange=38:43
10 ? ? PUNCT PUNCT _ 2 punct _ SpaceAfter=No|TokenRange=43:44
@@ -9889,7 +9889,7 @@
3 umunhã munhã VERB V Person=3|VerbForm=Fin 0 root _ TokenRange=10:16
4 yepé yepé DET ART Definite=Ind|PronType=Art 5 det _ TokenRange=17:21
5 baraka baraka NOUN N Number=Sing 3 obj _ Orig=barraca|OrigLang=pt|TokenRange=22:28
6 mamé mamé ADV ADVL AdvType=Loc|PronType=Rel 8 advmod _ TokenRange=29:33
6 mamé mamé ADV ADVLC AdvType=Loc|PronType=Rel 8 advmod _ TokenRange=29:33
7 aintá aintá PRON PRON Number=Plur|Person=3|PronType=Prs 8 nsubj _ TokenRange=34:39
8 ukiri kiri VERB V Person=3|VerbForm=Fin 5 acl:relcl _ TokenRange=40:45
9 arama arama SCONJ SCONJ _ 8 mark _ SpaceAfter=No|TokenRange=46:51
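
Most hunks above retag adverbs so that the XPOS tag and the `AdvType` feature move together (for example `ADVLC` with `AdvType=Loc`, `ADVRA` with `AdvType=Man`). The following is a minimal sketch of how that pairing could be checked on a tab-separated CoNLL-U token line. The suffix-to-feature mapping and the function names are assumptions inferred from the tags visible in this diff, not taken from the project's own tooling.

```python
# Hypothetical mapping inferred from this diff: a final "A" in the adverb
# XPOS tag pairs with AdvType=Man, a final "C" with AdvType=Loc.
XPOS_SUFFIX_TO_ADVTYPE = {"A": "Man", "C": "Loc"}


def parse_feats(feats):
    """Turn a FEATS string such as 'AdvType=Loc|PronType=Rel' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def check_adverb(line):
    """Return a message if XPOS and AdvType disagree, otherwise None."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError("expected 10 tab-separated CoNLL-U columns")
    form, upos, xpos, feats = cols[1], cols[3], cols[4], cols[5]
    # Only adverb tags longer than the bare "ADV" carry a suffix to check.
    if upos != "ADV" or not xpos.startswith("ADV") or len(xpos) <= 3:
        return None
    expected = XPOS_SUFFIX_TO_ADVTYPE.get(xpos[-1])
    if expected is None:
        return None  # tags such as ADVL, ADVR, ADVD are outside the assumed mapping
    actual = parse_feats(feats).get("AdvType")
    if actual != expected:
        return f"{form}: XPOS {xpos} suggests AdvType={expected}, found {actual}"
    return None


consistent = "\t".join(["6", "makití", "makití", "ADV", "ADVLC",
                        "AdvType=Loc|PronType=Rel", "7", "advmod", "_",
                        "TokenRange=25:31"])
suspect = "\t".join(["4", "makití", "makití", "ADV", "ADVRC",
                     "AdvType=Man|PronType=Int", "1", "advmod", "_",
                     "TokenRange=13:19"])
print(check_adverb(consistent))
print(check_adverb(suspect))
```

Under the assumed mapping, the second line (token 4 of sentence Avila2021:0:0:160 as it appears after this commit) would be reported, since its `ADVRC` tag is paired with `AdvType=Man` while the other `ADVRC` tokens in the diff carry `AdvType=Loc`. Whether that is an annotation slip or intended is for the treebank maintainers to decide.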
44 changes: 44 additions & 0 deletions data/glossary.json
@@ -1934,6 +1934,16 @@
"pos": "s.",
"gloss": "macaco"
},
{
"lemma": "makaku",
"pos": "s.",
"gloss": "var. makaka"
},
{
"lemma": "pusanú",
"pos": "v.",
"gloss": "curar"
},
{
"lemma": "makaxera",
"pos": "s.",
@@ -2177,6 +2187,11 @@
"pos": "adj.",
"gloss": "cansado"
},
{
"lemma": "kukura",
"pos": "s.",
"gloss": "tipo de árvore"
},
{
"lemma": "maraari",
"pos": "v. 2ª cl.",
@@ -7404,11 +7419,40 @@
"pos": "v.",
"gloss": "amarrar-se"
},
{
"lemma": "sembika",
"pos": "adj.",
"gloss": "salgado"
},
{
"lemma": "tenundé",
"pos": "s.",
"gloss": "a frente, o que está adiante",
"rel": [
"renundé",
"senundé"
]
},
{
"lemma": "arukanga",
"pos": "s.",
"gloss": "1) costela; 2) lado; 3) beira, beirada"
},
{
"lemma": "pukwari",
"pos": "v.",
"gloss": "amarrar"
},
{
"lemma": "yatimana",
"pos": "v.",
"gloss": "rodear"
},
{
"lemma": "piripiriaka",
"pos": "s.",
"gloss": "tipo de erva"
},
{
"lemma": "yupukwáu",
"pos": "v.",
8 changes: 8 additions & 0 deletions data/glossary.txt
@@ -373,6 +373,8 @@ maité (v.) - pensar
setuna (v.) - cheirar
apisá-kwara (s.) - ouvido
makaka (s.) - macaco
makaku (s.) - var. makaka
pusanú (v.) - curar
makaxera (s.) - macaxera
kera (s.) - guerra, batalha, luta
makira (s.) - rede
@@ -421,6 +423,7 @@ manusawa (s.) - morte
manú (v.) - morrer
marã (adv. interr. caus.) - var. marama
maraari (adj.) - cansado
kukura (s.) - tipo de árvore
maraari (se) (v. 2ª cl.) - estar cansado, cansar-se
marama (adv. interr. caus.) - por quê? para quê?
maramunhangara (s.) - guerreiro
@@ -1417,7 +1420,12 @@ yupirú (v.) - começar
yupisirú (v.) - salvar-se
-yu- (pref.) 1. (refl. / recípr.) - se; um(uns) ao(s) outro(s); 2. part. de voz passiva
yupukwari (v.) - amarrar-se
sembika (adj.) - salgado
tenundé (renundé, senundé) (s.) - a frente, o que está adiante
arukanga (s.) - 1) costela; 2) lado; 3) beira, beirada
pukwari (v.) - amarrar
yatimana (v.) - rodear
piripiriaka (s.) - tipo de erva
yupukwáu (v.) - acostumar-se
yupukwá (v.) - var. yupukwáu
yupukwawa (v.) - var. yupukwáu
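
The entries added to `data/glossary.json` and `data/glossary.txt` mirror each other: `lemma (pos) - gloss`, with any `rel` forms listed in parentheses after the lemma. The sketch below shows how one text line could be derived from one JSON entry. The function name `render_entry` is hypothetical, and the sketch covers only the two entry shapes added in this commit, not every convention found elsewhere in the text glossary.

```python
import json


def render_entry(entry):
    """Render one glossary entry dict as a 'lemma (pos) - gloss' line."""
    head = entry["lemma"]
    rel = entry.get("rel")
    if rel:
        # Related forms follow the lemma in parentheses, comma-separated.
        head += " (" + ", ".join(rel) + ")"
    return f"{head} ({entry['pos']}) - {entry['gloss']}"


sample = json.loads("""
[
  {"lemma": "sembika", "pos": "adj.", "gloss": "salgado"},
  {"lemma": "tenundé", "pos": "s.",
   "gloss": "a frente, o que está adiante",
   "rel": ["renundé", "senundé"]}
]
""")

for entry in sample:
    print(render_entry(entry))
```

A round trip like this could also be used to spot entries present in one file but missing from the other, assuming both files are meant to stay in sync.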