Lemma of abbreviation #516

Open
nschneid opened this issue Dec 19, 2017 · 2 comments

@nschneid
Contributor

Spawned off of #513 and UniversalDependencies/UD_English-EWT#40.

I proposed:

Another issue relevant here is abbreviations. For uncommon abbreviations/shortened forms (like w for with, btwn for between, thru for through), I'm inclined to say we should use the canonical spelling in the lemma and apply the feature Abbr=Yes. For common abbreviations like vs. for versus and etc. for et cetera, perhaps we should keep the surface form in the lemma.

There has been further discussion about single-token abbreviations that would expand to multiple words, and whether to expand frequent single-word abbreviations.
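
To make the proposal concrete, here is a minimal sketch of how the rule could be applied when writing CoNLL-U rows. The lookup tables and the function names `lemma_and_feats` and `conllu_line` are hypothetical and only cover the examples above. Two points are assumptions, not settled policy: whether common abbreviations also get Abbr=Yes, and lowercasing the form as a stand-in lemma for everything else.

```python
# Hypothetical lookup tables; entries are taken from the examples in the proposal.
UNCOMMON_ABBR = {"w": "with", "btwn": "between", "thru": "through"}
COMMON_ABBR = {"vs.", "etc."}


def lemma_and_feats(form):
    """Return (lemma, feats) for one token under the proposed policy."""
    key = form.lower()
    if key in UNCOMMON_ABBR:
        # Uncommon abbreviation: canonical spelling in the lemma, plus Abbr=Yes.
        return UNCOMMON_ABBR[key], "Abbr=Yes"
    if key in COMMON_ABBR:
        # Common abbreviation: keep the surface form as the lemma.
        # Assumption: Abbr=Yes is applied here too; the proposal does not say.
        return key, "Abbr=Yes"
    # Placeholder only: real lemmatization is out of scope for this sketch.
    return key, "_"


def conllu_line(token_id, form):
    """Build a 10-column CoNLL-U row, leaving unrelated columns as '_'."""
    lemma, feats = lemma_and_feats(form)
    # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    cols = [str(token_id), form, lemma, "_", "_", feats, "_", "_", "_", "_"]
    return "\t".join(cols)


if __name__ == "__main__":
    for i, tok in enumerate(["thru", "vs.", "etc."], start=1):
        print(conllu_line(i, tok))
```

Single-token abbreviations that expand to several words are not covered by this sketch, since that is exactly the open question.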

@dan-zeman
Member

Possibly related old issues:
#112
#181

@dan-zeman dan-zeman added this to the v2.2 milestone Apr 24, 2018
@sanjmeh
sanjmeh commented May 12, 2018

I want to reiterate this problem of short abbreviation tagging.
The classic example is the short form vs., or just v., used in most legal text instead of the full word versus.
When annotating text with the udpipe english_ewt model, the period is kept inside the token, but the token still isn't lemmatized to versus. The english_partut model instead treats the period as a separate token and abruptly ends the sentence. So we have a problem here that may be serious enough for legal text.

@dan-zeman dan-zeman modified the milestones: v2.2, v2.4 Nov 13, 2018
@dan-zeman dan-zeman modified the milestones: v2.4, v2.5 Oct 6, 2019
@dan-zeman dan-zeman modified the milestones: v2.5, v2.6 Nov 9, 2019
@dan-zeman dan-zeman modified the milestones: v2.6, v2.7 May 14, 2020
@dan-zeman dan-zeman modified the milestones: v2.7, v2.8 Nov 14, 2020
@dan-zeman dan-zeman modified the milestones: v2.8, v2.9 Jun 17, 2021
@dan-zeman dan-zeman modified the milestones: v2.9, v2.11 Jun 13, 2022
@dan-zeman dan-zeman modified the milestones: v2.11, v2.13 May 31, 2023
@dan-zeman dan-zeman modified the milestones: v2.13, v2.14 Nov 15, 2023
@dan-zeman dan-zeman modified the milestones: v2.14, v2.15 May 15, 2024
@dan-zeman dan-zeman removed this from the v2.15 milestone Nov 16, 2024
@dan-zeman dan-zeman added this to the v2.16 milestone Nov 16, 2024