Please file issues for insufficiencies of lingpy #16
Comments
True! The difference is that LingPy's I'm fully aware that what I use here is quite a dirty, ad-hoc way to do what I wanted, and it is meant to be only temporary (though, as the German saying goes, "nothing lasts longer than a short provisional solution"). I'll hopefully think of something better and suggest it to you at some point; I assume it would be an optional argument to |
Ah, I see, this is of course a feature rather than a bug: dots serve as vowel break markers, and I don't see why one would keep them, although the code could be modified to keep the dot. We even have new annotations that allow keeping the original material while converting parts of it, using a "source/target" annotation. This would allow marking a laryngeal in IE, h₂, as h₂/ə, meaning that lingpy will read it as schwa while the segment is still laryngeal 2. We now also use clear-cut orthography profiles to convert from orthography to IPA-like representations. As far as this repo is concerned, I think it would be useful to have a larger discussion on that, so you know where we are right now and can explain to us why you might want to diverge from that. In terms of implementation, the dot may be hard-coded, but one would need to look up the original code. In fact, you can adapt ipa2tokens to many of your needs, and I think the online tutorial where the function is described may even give further instructions. If not, let me know, and I'll explain some more about the basic ideas behind it. |
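To make the "source/target" idea concrete, here is a minimal sketch in plain Python. It is not lingpy's implementation: the helper names `split_annotation` and `readable_tokens` are hypothetical, and the only thing taken from the comment above is the convention that the part before the slash is the original segment and the part after it is what gets read.

```python
def split_annotation(segment, sep="/"):
    """Split a 'source/target' segment into (source, target).

    Hypothetical helper, not part of lingpy. A segment without a
    usable annotation is returned as its own source and target.
    """
    source, found, target = segment.partition(sep)
    if not found or not source or not target:
        return segment, segment
    return source, target


def readable_tokens(tokens):
    """Return the target side of every token, i.e. what would be read."""
    return [split_annotation(token)[1] for token in tokens]


# The laryngeal keeps its identity on the source side,
# while the target side is the schwa that gets read.
tokens = ["h₂/ə", "t", "e", "r"]
print(readable_tokens(tokens))
print([split_annotation(token)[0] for token in tokens])
```

Keeping both sides in one token means the original transcription is never lost, and any conversion can be undone by reading the source side again.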
BTW, on cldf, I recommend this page, as it is where I will develop the major specifications/recommendations, which are usually in line with what lingpy/edictor handle. |
Thanks! That's helpful. |
While reading the documentation of an extension of ipa2tokens in this repo, I just noticed that you assume that lingpy splits strings into segments that, once joined, are not identical with the input strings after whitespace removal. If this is really happening, it should be handled from within lingpy, and I would need some examples that trigger it to confirm. Note that you should make sure to normalize to one Unicode normalization form, as we do in lingpy, since this may cause differences (currently, you are not normalizing in the script!). I would not know of other reasons, but it would be extremely valuable to be told about those differences, so we can address them.
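A rough way to collect such examples is a round-trip check that normalizes both sides before comparing. The sketch below uses only the standard library; `roundtrip_ok` is a hypothetical helper, and the default form `"NFC"` is an assumption, so it should be set to whichever form the tokenizer in question actually uses.

```python
import unicodedata


def normalize(text, form="NFC"):
    """Bring text into a single Unicode normalization form."""
    return unicodedata.normalize(form, text)


def roundtrip_ok(original, tokens, form="NFC"):
    """Check that the joined tokens equal the input without whitespace.

    Both sides are normalized first, so a precomposed character and its
    decomposed equivalent do not count as a difference.
    """
    expected = "".join(normalize(original, form).split())
    actual = "".join(normalize("".join(tokens), form).split())
    return expected == actual


# Precomposed "é" in the input, decomposed "e" + combining acute in the
# tokens: equal after normalization, unequal in a raw comparison.
print(roundtrip_ok("t\u00e9 a", ["t", "e\u0301", "a"]))
print("t\u00e9a" == "".join(["t", "e\u0301", "a"]))

# A dropped break marker is a real difference and is reported as such.
print(roundtrip_ok("ta.na", ["t", "a", "n", "a"]))
```

Inputs for which `roundtrip_ok` returns `False` even after normalization are the cases worth reporting.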