add scripts for ten-fold cross-validation of parsing with gold tokenization and gold tags
improve scripts
add scripts for ten-fold cross-validation of raw-text parsing
Scripts for replicating the experiments reported in this article:
ALENCAR, Leonel Figueiredo de. A Universal Dependencies Treebank for Nheengatu. In: GAMALLO, Pablo; CLARO, Daniela; TEIXEIRA, António J. S.; REAL, Livy; GARCÍA, Marcos; OLIVEIRA, Hugo Gonçalo; AMARO, Raquel (Eds.). Proceedings of the 16th International Conference on Computational Processing of Portuguese, PROPOR 2024, Santiago de Compostela, Galicia/Spain, 12-15 March, 2024. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. v. 2, p. 37-54. Available at: https://aclanthology.org/2024.propor-2.8.
@inproceedings{DeAlencar2024a,
author = "de Alencar, Leonel Figueiredo",
editor = {Pablo Gamallo and Daniela Claro and Ant{\'{o}}nio J. S. Teixeira and Livy Real and Marcos Garc{\'{\i}}a and Hugo Gon{\c{c}}alo Oliveira and Raquel Amaro},
title = "A {U}niversal {D}ependencies Treebank for {N}heengatu",
booktitle = {Proceedings of the 16th International Conference on Computational Processing of Portuguese, {PROPOR} 2024, Santiago de Compostela, Galicia/Spain, 12-15 March, 2024},
pages = "37--54",
volume = {2},
publisher = {Association for Computational Linguistics},
year = {2024},
month = {3},
url = "https://aclanthology.org/2024.propor-2.8",
address = {Stroudsburg, PA, USA},
abstract="We present UD_Nheengatu-CompLin, the inaugural treebank for Nheengatu, an endangered Indigenous language of Brazil with limited digital resources. This treebank stands as the largest among Indigenous American languages in version 2.13 of the Universal Dependencies collection. The developmental version comprises 1,336 trees, encompassing 13,246 tokens and 13,374 words. In a 10-fold cross-validation experiment using UDPipe 1.2, parsing with gold tokenization and gold tags achieved a labeled attachment score (LAS) of 81.17 ± 1.02, outperforming Yauti, the rule-based analyzer employed for sentence annotation.",
isbn = {979-8-89176-062-2},
doi = "10.5281/zenodo.11372209"
}
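For orientation, the 10-fold cross-validation setup mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the scripts in this repository: the function names (`read_conllu_sentences`, `make_folds`, `summarize`), the round-robin fold assignment, and the use of the sample standard deviation for the "±" figure are all assumptions. Training and evaluating the parser on each fold (for example with UDPipe) is left out.

```python
import re
import statistics


def read_conllu_sentences(text):
    """Split CoNLL-U text into sentence blocks, which are separated by blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block for block in blocks if block.strip()]


def make_folds(sentences, k=10):
    """Yield (train, test) pairs; sentence i goes to the test set of fold i % k.

    Round-robin assignment is an assumption made for this sketch.
    """
    for fold in range(k):
        test = [s for i, s in enumerate(sentences) if i % k == fold]
        train = [s for i, s in enumerate(sentences) if i % k != fold]
        yield train, test


def summarize(scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Each `(train, test)` pair would be written to separate CoNLL-U files, a model trained on the first and scored on the second, and the ten per-fold LAS values passed to `summarize` to obtain a "mean ± deviation" figure.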