Resolving adposition spelling variants #51

nschneid · 2018-06-22T21:33:03Z

Each adposition has a citation form and may have variants. For example, the citation form for all possessives is 's, and possessive pronouns need to be linked to this. "Toward"/"towards" and "out of"/"outta" may be considered conventionalized variant spellings. Moreover, annotated sentences may have adpositions with nonstandard spellings or capitalization. The p Markdown macro thus needs to be able to link to an adposition whose canonical name is different from the one used in the sentence.

Proposed solution:

1. Extend the p macro to include a display spelling that differs from the canonical lemma: [p my en/'s] or [p my en/'s Possessor]. [p en/'s] would continue to work and be equivalent to [p 's en/'s].
2. To avoid verbosity in the Markdown for standard possessive pronouns and other spelling variants, [p en/my] and similar should, in the absence of a matching citation form, search for a match in the other_forms field. If exactly one is found, the link will point to that adposition.
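
The fallback lookup proposed above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the Adposition dataclass, the resolve_adposition function, and the in-memory inventory are hypothetical stand-ins for whatever model and query layer the site actually uses. Only the other_forms field name and the lang/form reference shape come from the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Adposition:
    # Hypothetical stand-in for the real adposition record.
    lang: str
    name: str  # citation form, e.g. "'s"
    other_forms: List[str] = field(default_factory=list)


def resolve_adposition(ref: str, adpositions: List[Adposition]) -> Optional[Adposition]:
    """Resolve a reference such as "en/my" to an adposition.

    A matching citation form wins. Otherwise other_forms is searched,
    and a result is returned only if exactly one adposition matches.
    """
    lang, _, form = ref.partition("/")
    same_lang = [a for a in adpositions if a.lang == lang]
    for adp in same_lang:
        if adp.name == form:
            return adp
    matches = [a for a in same_lang if form in a.other_forms]
    return matches[0] if len(matches) == 1 else None


# Illustrative inventory; the variant lists are assumptions.
inventory = [
    Adposition("en", "'s", ["my", "your", "his", "her", "its", "our", "their"]),
    Adposition("en", "toward", ["towards"]),
]
```

With this inventory, "en/my" resolves to the 's entry, while an unknown or ambiguous form resolves to nothing, so the macro could report an error rather than guess.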

ablodge · 2018-07-02T19:47:54Z

1. is accomplished by the macro pspecial, as in [pspecial en/'s my Possessor]. I added a new macro to avoid further overloading p.
2. is a bit harder, but is currently implemented as the function Adposition.normalized_adp() with hardcoded lists of standard spelling variants. (I have an implementation of the same function using other_forms in comments.) The trouble is that the list of spelling variants has to be hardcoded somewhere before Adpositions and PTokenAnnotations can be imported. Otherwise the script won't know where to send each Adposition foreign key.
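
For comparison, a hardcoded normalization of the kind described here might look like the sketch below. The table contents, the lowercasing step, and the standalone function name normalize_adp are assumptions for illustration; this is not the actual body of Adposition.normalized_adp(). The point is that a static table needs no database rows, so it can run before any adpositions exist.

```python
# Hypothetical variant table mapping a spelling to its citation form.
SPELLING_VARIANTS = {
    "towards": "toward",
    "outta": "out of",
    "my": "'s",
    "your": "'s",
    "his": "'s",
    "her": "'s",
    "its": "'s",
    "our": "'s",
    "their": "'s",
}


def normalize_adp(form: str) -> str:
    """Map a token spelling to a citation form using the static table.

    Lowercasing is an assumption made here to absorb nonstandard
    capitalization; unknown spellings are returned lowercased.
    """
    key = form.lower()
    return SPELLING_VARIANTS.get(key, key)
```

An import script could call such a function on each token to decide which adposition a foreign key should point to.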

nschneid · 2018-07-03T00:53:14Z

OK, I guess my assumption was that adpositions with other_forms would be manually created before importing any tokens. Should we support deleting an adposition after it's been created and merging its tokens with another adposition?

ablodge · 2018-07-03T14:43:54Z

I can do it by creating them manually. Is that convenient for Hebrew and other languages where adpositions have more than one variant?

nschneid · 2018-07-03T15:13:36Z

If you're talking about pronominal inflections in Hebrew, those are not reflected in the lemma, so we're OK as long as the text has been morphologically processed.

In English we can't rely on lemmas for possessives because the current UD policy is super weird: UniversalDependencies/docs#517

ablodge added a commit that referenced this issue Jul 2, 2018

#51 add adposition spelling variants

216e1ed

ablodge added a commit that referenced this issue Jul 2, 2018

#51 update

0ed4264

ablodge closed this as completed Jul 2, 2018

ablodge mentioned this issue Jul 3, 2018

Implement Adposition spelling variants using field other_forms #57

Closed
