Erratic behaviour of the letter 'y' #14

Open
MdeHpy opened this issue Sep 11, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@MdeHpy

MdeHpy commented Sep 11, 2020

Hi again,

While using this wonderful implementation, I spotted some behaviour that some people might consider erratic.

Of course, it is understandable that handling this letter is a challenge, as it appears mostly in foreign terms. Nevertheless, for the cases I present here, I believe there is an easy, simple and non-debatable way to syllabify these words, even where the pronunciation is also borrowed from another language.

Note: DRAE stands for Diccionario de la Real Academia Española. Some of the terms have an alternative spelling adapted to Spanish, such as bypass (baipás) or curry (curri).

First, we have the errors raised when the 'y' sits between two consonants:

pylabeador.syllabify('Tytonidae') #Family of birds, could have used Tyto too
Traceback (most recent call last):
...
raise HyphenatorError("Nucleus expects a vowel!", word)
...

pylabeador.syllabify('bypass') #Appears in DRAE
raise HyphenatorError("Nucleus expects a vowel!", word)

pylabeador.syllabify('byroniano') #Appears in DRAE
raise HyphenatorError("Nucleus expects a vowel!", word)

On the other hand, we have the following case (which, I reckon, is open to debate):

pylabeador.syllabify('byte') #Appears in DRAE
raise HyphenatorError("Nucleus expects a vowel!", word)

One may argue that a single syllable, pronounced like 'bait' (in Spanish), is the correct result. My personal view is that this 'y' should behave like an 'i', so that in Spanish two syllables arise.
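
To make the "treat this 'y' like an 'i'" idea concrete, here is a minimal sketch of a pre-check that decides whether a given 'y' acts as a vowel. It is only a heuristic and is not taken from pylabeador: the helper name `y_is_vocalic` and the vowel set are assumptions made for illustration. It treats 'y' as a vowel when it ends the word or is followed by a consonant, and as a consonant when a vowel follows.

```python
# Hypothetical helper for illustration; not part of pylabeador.
VOWELS = frozenset("aeiouáéíóúü")


def y_is_vocalic(word: str, index: int) -> bool:
    """Return True if the letter at `index` is a 'y' that acts as a vowel.

    Heuristic (an assumption, not the library's rule): a 'y' is vocalic
    when it is the last letter or when the next letter is not a vowel.
    """
    lowered = word.lower()
    if lowered[index] != "y":
        return False
    following = lowered[index + 1 : index + 2]  # empty string at end of word
    return following == "" or following not in VOWELS


# Classify the first 'y' of each word mentioned in this report.
for word in ("Tytonidae", "bypass", "byroniano", "byte",
             "curry", "cónyuge", "coadyuvar"):
    position = word.lower().index("y")
    kind = "vowel" if y_is_vocalic(word, position) else "consonant"
    print(f"{word}: 'y' at index {position} treated as {kind}")
```

Under this heuristic, the four words that currently raise an error would have a vowel available for the nucleus, while the 'y' in 'cónyuge' and 'coadyuvar' would stay consonantal.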

Second, its behaviour when preceded by 'rr':

pylabeador.syllabify('curry') #Appears in DRAE
['cur', 'ry']

Which, I do not know, could be related to...
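
For context, Spanish syllabification normally keeps the digraph 'rr' (like 'ch' and 'll') inside a single syllable, which would suggest cu-rry rather than cur-ry. The sketch below shows one way to group such digraphs into single units before any splitting is attempted. The names `split_graphemes` and `DIGRAPHS` are hypothetical, and nothing here reflects how pylabeador actually tokenises words.

```python
from typing import List

# Digraphs that are assumed to stay together within one syllable.
DIGRAPHS = ("ch", "ll", "rr")


def split_graphemes(word: str) -> List[str]:
    """Split a word into letters, keeping known digraphs as one unit."""
    lowered = word.lower()
    units: List[str] = []
    i = 0
    while i < len(lowered):
        pair = lowered[i : i + 2]
        if pair in DIGRAPHS:
            units.append(pair)  # consume both letters of the digraph
            i += 2
        else:
            units.append(lowered[i])
            i += 1
    return units


print(split_graphemes("curry"))
```

With 'rr' handled as one unit, a syllabifier can no longer place a boundary between the two letters.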

Third:

pylabeador.syllabify('cónyuge') #Appears in DRAE, very common word
['có', 'nyu', 'ge']

Interestingly, the following word works fine, possibly because of the two strong vowels 'oa': the word is split so that each consonant (d and y) is paired with a vowel:

pylabeador.syllabify('coadyuvar') #Appears in DRAE
['co', 'ad', 'yu', 'var']
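
The cases in this report can be gathered into a small script so that they are easy to re-run after a fix. This is only a sketch: `run_cases` is a hypothetical helper that accepts any syllabifier callable, and it catches `Exception` broadly because the import path of `HyphenatorError` is not shown in the tracebacks above.

```python
from typing import Callable, Dict, Iterable, List, Union

# Words taken from this report.
REPORTED_WORDS = [
    "Tytonidae", "bypass", "byroniano", "byte",
    "curry", "cónyuge", "coadyuvar",
]


def run_cases(
    words: Iterable[str],
    syllabify: Callable[[str], List[str]],
) -> Dict[str, Union[List[str], str]]:
    """Run `syllabify` on each word, recording the result or the error."""
    results: Dict[str, Union[List[str], str]] = {}
    for word in words:
        try:
            results[word] = syllabify(word)
        except Exception as error:  # broad on purpose, see the note above
            results[word] = f"{type(error).__name__}: {error}"
    return results


if __name__ == "__main__":
    try:
        import pylabeador
    except ImportError:
        print("pylabeador is not installed; nothing to run")
    else:
        outcomes = run_cases(REPORTED_WORDS, pylabeador.syllabify)
        for word, outcome in outcomes.items():
            print(f"{word!r} -> {outcome}")
```

Passing the syllabifier in as an argument keeps the helper independent of the library, so the same list can be run against another implementation for comparison.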

I apologise for not providing better explanations of the observed behaviour. I hope these examples are helpful to you.

Thank you very much for your work,
M.

@jdevera
Owner

jdevera commented Sep 24, 2020

Wow, nice findings, which I somehow managed to miss notifications for. The handling of the letter Y is a rather flaky topic in the algorithm that I followed. This might actually warrant another attempt from me to contact the people behind that algorithm, who now run an online service that seems to get all these cases right: https://tulengua.es/silabas/

I hope this is not blocking your work, as I will not be able to work on this very soon.

@jdevera added the bug label Sep 24, 2020

@MdeHpy
Author

MdeHpy commented Sep 25, 2020

I just checked the online service that you pointed me to. While most of these cases are handled well, I still find that their tool splits "curry" as cur-ry. Consequently, an error in the underlying algorithm, and not only in the implementation, cannot be ruled out.

Don't worry about how long the fix takes. I use your wonderful tool as part of a personal hobby side project (non-profit, of course) for which I just need it to syllabify more or less correctly, so I can definitely carry on despite this issue. Nevertheless, I like to report errors so that the tool can be improved for future use, as well as to make the authors aware of possible bugs.

I understand that the 'y' case is very tricky, so thank you again for your work.
