Problem in the automatic analysis of partial reduplication in cases of "code mixing" #578

Open
1 task
leoalenc opened this issue Sep 18, 2024 · 1 comment
Labels: bug (Something isn't working); morphological parsing (Lemmatization and morphological analysis); tools (This issue relates to Python code); UD Annotation (This issue relates to Universal Dependencies annotation); unknown-words (How to deal with new words)

Comments

leoalenc commented Sep 18, 2024

  • the special tag =red for the automatic analysis of partial reduplication raises an error in cases of "code mixing"
# sent_id = Cruz2011:0:0:119
# text = Ti maã aganaganari.
# text_eng = TODO
# text_por = Não enganaria.
# text_source = Example No. 1029 Wr
# text_orig = ti=maã a-gana~ganai
# text_annotator = Leonel Figueiredo de Alencar
# inputline = Ti maã/cond aganaganari.
1	Ti	ti	PART	NEG	PartType=Neg|Polarity=Neg	0	advmod	_	TokenRange=0:2
2	maã	maã	PART	COND	Mood=Cnd|PartType=Mod	0	advmod	_	TokenRange=3:6
3	aganaganari	aganaganari	_	_	_	0	_	_	SpaceAfter=No|TokenRange=7:18
4	.	.	PUNCT	PUNCT	_	1	punct	_	SpaceAfter=No|TokenRange=18:19

Applying the function parseExample to the string 'Ti maã/cond aganaganari/=red:l|4.' or 'Ti maã/cond aganaganari/=red:l|4:o|pt:s|enganar.' raises this error:

Traceback (most recent call last):
  File "/usr/lib/python3.8/idlelib/run.py", line 559, in runcode
    exec(code, self.locals)
  File "<pyshell#57>", line 1, in <module>
  File "<pyshell#56>", line 3, in pz
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 2849, in parseExample
    output=handleSents(sents,pref,textid,index,sentid,annotator,metadata)
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 2720, in handleSents
    tk=parseSentence(sents[0])
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 2997, in parseSentence
    tk=mkConlluSentence(tokens)
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 2564, in mkConlluSentence
    new=handlePartialRedup(form,length)
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 2009, in handlePartialRedup
    handleOrig(new,lemma,orig, orig_form)
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 1961, in handleOrig
    _isInLexicon(lemma)
  File "/home/leonel/Dropbox/scripts/AnnotateConllu.py", line 1949, in _isInLexicon
    raise Exception(f"Lemma '{lemma}' not found in the lexicon")
Exception: Lemma 'ganari' not found in the lexicon
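
The traceback suggests that the lemma 'ganari' is what remains after the reduplicated part and the a- prefix (see text_orig) are removed from 'aganaganari', and that the lexicon lookup on this remainder is what fails. The following is only a minimal sketch of that reading, not the code in AnnotateConllu.py: the helpers `strip_partial_redup` and `resolve_lemma` and the `orig` parameter are hypothetical, and it assumes that the `o|pt` part of the second input string declares a Portuguese origin.

```python
from typing import Optional, Set


def strip_partial_redup(form: str, length: int) -> str:
    """Drop one copy of the first substring of `length` characters that is
    immediately repeated (hypothetical helper, not the project's code)."""
    for i in range(len(form) - 2 * length + 1):
        chunk = form[i:i + length]
        if form[i + length:i + 2 * length] == chunk:
            return form[:i] + form[i + length:]
    raise ValueError(f"No repeated substring of length {length} in '{form}'")


def resolve_lemma(base: str, lexicon: Set[str], orig: Optional[str] = None) -> str:
    """Require the base to be in the lexicon unless a foreign origin is
    declared (assumed behavior for code mixing, not the current one)."""
    if orig is None and base not in lexicon:
        raise LookupError(f"Lemma '{base}' not found in the lexicon")
    return base


# Reproduce the remainder reported in the exception message.
unreduplicated = strip_partial_redup("aganaganari", 4)
# Remove the a- prefix shown in text_orig (a-gana~ganai).
stem = unreduplicated[1:] if unreduplicated.startswith("a") else unreduplicated

# With a declared origin, the lookup is skipped instead of raising.
lemma = resolve_lemma(stem, lexicon=set(), orig="pt")
```

Under these assumptions, skipping the lexicon check whenever an origin language is given would let the borrowed stem through while keeping the check for native stems.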
leoalenc added the bug, tools, morphological parsing, UD Annotation, and unknown-words labels Sep 18, 2024
leoalenc self-assigned this Sep 18, 2024
@leoalenc

Related to #199 #155 #527 #512 #523.
