Required metadata attribute text_source missing #587

Open
1 task
leoalenc opened this issue Sep 21, 2024 · 1 comment
Assignees
Labels
corpus (This issue pertains to corpus data), invalid (This doesn't seem right), metadata (Improvements or explanations about metadata), UD Annotation (This issue relates to Universal Dependencies annotation)

Comments

@leoalenc
Contributor

  • add the required metadata attribute text_source, which is missing from the sentences below
Alencar2021:0:0:1
Alencar2021:0:0:2
Alencar2021:0:0:3
Alencar2021:0:0:4
Alencar2021:0:0:5
Alencar2021:0:0:6
Alencar2021:0:0:7
Alencar2021:0:0:8
Alencar2021:0:0:9
Navarro2016:0:0:401
Navarro2016:0:0:402
Navarro2016:0:0:403

@leoalenc leoalenc added the invalid, corpus, UD Annotation, and metadata labels Sep 21, 2024
@leoalenc leoalenc self-assigned this Sep 21, 2024
@heliolbs
Collaborator

heliolbs commented Sep 26, 2024

@leoalenc, among the functions I have been prototyping for a possible treebank verification module, there is one I wrote to check the required attributes in the sentence metadata. The function checks every sentence in the treebank against the following list of required attributes: sent_id, text, text_eng, text_por, text_source, text_annotator.

The function returned not only all the sentences from your comment above, but also the four below. One of them has only text_eng_ggl and no text_eng (which I understand is not a problem), and the other three have neither (would that be a problem?):

Sentence text_eng text_eng_ggl
Amorim1928:19:50:50
Hartt1938:0:0:63
Hartt1938:0:0:64
Hartt1938:0:0:288
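
A check like the one described above could be sketched roughly as follows. This is not the function from the comment, only a minimal illustration: it assumes the treebank is stored as CoNLL-U text, with metadata on comment lines of the form `# key = value` and sentences separated by blank lines. The names `parse_sentence_metadata` and `find_missing_attributes` and the `Demo:...` sentences are hypothetical; only the list of required attributes comes from the comment.

```python
import re

# Required attributes as listed in the comment above.
REQUIRED_ATTRIBUTES = (
    "sent_id", "text", "text_eng", "text_por", "text_source", "text_annotator",
)


def parse_sentence_metadata(conllu_text):
    """Yield one metadata dict per sentence block (hypothetical helper).

    Assumes CoNLL-U conventions: blocks separated by blank lines and
    metadata given on comment lines such as '# text_source = ...'.
    """
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        if not block.strip():
            continue
        metadata = {}
        for line in block.splitlines():
            if line.startswith("#") and "=" in line:
                key, _, value = line[1:].partition("=")
                metadata[key.strip()] = value.strip()
        yield metadata


def find_missing_attributes(conllu_text, required=REQUIRED_ATTRIBUTES):
    """Map each sentence to the required attributes it lacks or leaves empty."""
    report = {}
    for index, metadata in enumerate(parse_sentence_metadata(conllu_text), start=1):
        missing = [name for name in required if not metadata.get(name)]
        if missing:
            # Fall back to a positional label when sent_id itself is missing.
            sent_id = metadata.get("sent_id") or f"<sentence {index}>"
            report[sent_id] = missing
    return report


if __name__ == "__main__":
    # Made-up sample, not taken from the treebank.
    sample = (
        "# sent_id = Demo:0:0:1\n# text = a\n# text_eng = a\n# text_por = a\n"
        "# text_source = p. 1\n# text_annotator = X\n1\ta\t_\n\n"
        "# sent_id = Demo:0:0:2\n# text = b\n# text_eng_ggl = b\n# text_por = b\n"
        "# text_annotator = X\n1\tb\t_\n"
    )
    for sent_id, missing in find_missing_attributes(sample).items():
        print(sent_id, "is missing:", ", ".join(missing))
```

If a text_eng_ggl translation should count as an acceptable substitute for text_eng, that would be a policy decision to encode separately; this sketch treats every attribute in the list as strictly required.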
