Question particles #458

heacu · 2017-05-23T11:16:52Z

It seems that no documentation on question particles reached v2, contrary to the final comment closing #178. I'd like to reopen the question of annotating question particles. See also #454 on the documentation of questions.

I am working on Tibetan, which has dedicated particles that occur in both content and Y/N questions. In our existing scheme we tag them as cv.ques, which means "question converb". In UDv2 we can tag them as PART, but with what feature? PronType=Int seems wrong since these are particles and not pronouns.

One possibility, at least for polar questions, would be to add a new value under the Polarity feature. For example, Polarity=Xor for "exclusive disjunction", which presents a pair of alternatives only one of which is correct. This same approach might not work for content question particles, which may differ from, or be the same as, the Y/N particle in a given language.
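To make the proposal concrete, here is a minimal sketch of how such a feature could be read back out of a CoNLL-U file. Everything in the sample is an assumption for illustration: the word forms are placeholders rather than real Tibetan, Polarity=Xor is only the value proposed above and not an existing UD value, and the `discourse` relation on the particle is a guess, since the thread has not settled on a relation.

```python
# Hypothetical CoNLL-U fragment. The forms W1, W2 and Q are placeholders,
# and Polarity=Xor is the value proposed in this issue, not part of UD.
SAMPLE = """\
1\tW1\tw1\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tW2\tw2\tVERB\t_\t_\t0\troot\t_\t_
3\tQ\tq\tPART\t_\tPolarity=Xor\t2\tdiscourse\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'A=1|B=2' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def question_particles(conllu, feature="Polarity", value="Xor"):
    """Return the forms of PART tokens that carry the given feature value."""
    found = []
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if cols[3] == "PART" and parse_feats(cols[5]).get(feature) == value:
            found.append(cols[1])
    return found


print(question_particles(SAMPLE))
```

The same function would work unchanged for any other feature name that ends up being chosen, since the feature and value are parameters.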

What are others here doing?

heacu · 2017-05-23T11:19:42Z

Incidentally, WALS has a chapter on polar questions: http://wals.info/chapter/116

jnivre · 2017-05-23T12:16:28Z

Thanks for reopening this discussion. I agree we need to do something, perhaps not only about questions but about sentence mood in general.

amir-zeldes · 2017-05-23T17:14:42Z

I agree with @jnivre, I also think it's more of a sentence property. For example, English marks polar questions with auxiliary inversion, so you wouldn't want to put the polarity annotation on any particular word in that case. I've been using a sentence annotation for sentence mood (roughly following the SPAAC scheme) in an English corpus here for that reason (using Stanford Dependencies, not UD, but the principle is the same):

https://github.com/amir-zeldes/gum/blob/master/dep/GUM_interview_brotherhood.conll10

If you're looking at an isolated particle, and not something like a morphological feature on predicates, then maybe it can be seen as syntactically just a particle, and the resulting semantics for the whole sentence are polar.

heacu · 2017-05-23T17:58:08Z

Well, I am talking about isolating-style languages with dedicated question particles. There sure are a lot of universal features, so unless there are good criteria for excluding such features from one's analysis, I think treating such phenomena for such languages as sentence-level properties would be missing a trick. I think we may need to hear from people working on languages such as Mandarin Chinese.


jnivre · 2017-05-23T22:05:02Z

I didn't mean to imply that this was only a sentence level issue. For languages that use question particles, we definitely need appropriate features. But we should also think about how to capture sentence mood for languages that rely on other strategies such as subject-verb inversion.

ermanh · 2017-05-24T04:21:49Z

I'm part of a team working on Mandarin and Cantonese, but we haven't moved on to features yet, so perhaps another team(?) working on Chinese might be able to chime in. Unless I'm mistaken though, Japanese has question particles, too, and there seems to be an active Japanese team(?).

On the subject of sentence-level particles and POS/relation, we have "sentence(/utterance)-final particles" in Mandarin and Cantonese, and use PART and discourse:sp (sp for sentence particle). In these two languages at least, they are fixed in final position unlike adverbs or auxiliaries, and their functions range from interrogative mood to epistemic modality, speech act, and pragmatic deixis. We put them under discourse under a broader interpretation of the word -- it's stretching it for sure, and not a full overlap, but we're not sure there's a better alternative at the moment.

Japanese v1 appears to use PART as well for their final particles, but they use the aux relation.
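As a rough illustration of the PART plus discourse:sp convention described above for Mandarin and Cantonese, the sketch below picks out tokens with that tag and relation and checks that they are sentence-final, allowing only punctuation or further sentence particles after them. The sample sentence and its analysis are assumptions made up for this example, not taken from any released treebank.

```python
# Illustrative Mandarin sentence ("Are you going?"); the analysis is an
# assumption for this sketch, not copied from an existing treebank.
SAMPLE = """\
# text = 你去吗？
1\t你\t你\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t去\t去\tVERB\t_\t_\t0\troot\t_\t_
3\t吗\t吗\tPART\t_\t_\t2\tdiscourse:sp\t_\t_
4\t？\t？\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def final_particles(sentence):
    """Return forms of PART/discourse:sp tokens in sentence-final position.

    A particle counts as final if everything after it is punctuation or
    another sentence particle (particles can stack at the end of a clause).
    """
    rows = [
        line.split("\t")
        for line in sentence.splitlines()
        if line and not line.startswith("#")
    ]
    result = []
    for i, cols in enumerate(rows):
        is_sp = cols[3] == "PART" and cols[7] == "discourse:sp"
        tail_ok = all(
            r[3] == "PUNCT" or r[7] == "discourse:sp" for r in rows[i + 1:]
        )
        if is_sp and tail_ok:
            result.append(cols[1])
    return result


print(final_particles(SAMPLE))
```

A check like this could also double as a sanity test on the annotation, flagging any discourse:sp token that is not in final position.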

amir-zeldes · 2017-05-24T14:17:32Z

@torma I definitely don't mind having more morphological features. I think for question particles in many languages this could be done mostly automatically (based on word forms and POS tags, or for Mandarin just the relevant character), but if it's done for sentences in some languages, it may make sense to standardize the existence of a sentence level annotation across languages for comparability. But this may be part of a larger conversation regarding where to annotate what in multiple languages.

martinpopel · 2017-05-25T09:37:10Z

My two cents:

I guess what is needed in most cases is not a sentence-level annotation, but a clause-level annotation (obviously: stored in the head word). For example, in the sentence 'He asked "How are you?" and didn't wait for a reply.' we cannot mark the whole sentence as interrogative.
(Even if a sentence-level annotation is needed for some phenomena, we would need a better CoNLL-U specification for storing arbitrary/predefined key-value pairs in comments. This is a technical issue, so I don't want to go into the details here.)
The FEATS column was originally intended for (a) "inflectional" features present in the morphology of a given word form (possibly disambiguated), and (b) "lexical" features inherent in a given lemma (I guess this is the case of the question particles discussed above). There is a third possibility: (c) grammatical features which are not present in the morphology of a given word, but can be inferred from the context, for example the periphrastic tense of complex verb forms, or the interrogative "mood" of verbs in polar questions in languages which mark it only with word order. I am not against option (c), but I think it fits a deeper layer of language description, and it should be discussed first whether to allow such features in current UD.
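For the sentence-level option mentioned in the parenthetical above, one possible shape is a `# key = value` comment line, following the pattern that CoNLL-U already uses for sent_id and text. The sketch below reads such comments into a dictionary. The key name `s_type` and the value `interrog` are hypothetical; the CoNLL-U specification does not define them.

```python
import re

# Hypothetical sentence block; "s_type = interrog" is an assumed key/value,
# modelled on the existing "sent_id" and "text" comment lines.
SAMPLE = """\
# sent_id = example-1
# text = Are you ready?
# s_type = interrog
1\tAre\tbe\tAUX\t_\t_\t3\tcop\t_\t_
2\tyou\tyou\tPRON\t_\t_\t3\tnsubj\t_\t_
3\tready\tready\tADJ\t_\t_\t0\troot\t_\t_
4\t?\t?\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""

# Matches comment lines of the form "# key = value".
COMMENT_RE = re.compile(r"^#\s*(\S+)\s*=\s*(.*)$")


def sentence_metadata(block):
    """Collect '# key = value' comment lines of one sentence into a dict."""
    meta = {}
    for line in block.splitlines():
        match = COMMENT_RE.match(line)
        if match:
            meta[match.group(1)] = match.group(2).strip()
    return meta


print(sentence_metadata(SAMPLE))
```

Note that this only covers whole sentences; it does not address the clause-level case, where the annotation would have to live on the head word of the clause.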

jnivre · 2017-05-25T09:55:36Z

Yes, we considered adding a notion of "syntactic" or "phrase-level" features for v2, but in the end we decided against it because (a) it would have added more complexity to CoNLL-U and (b) we didn't have enough convincing use cases for it. We may have to revisit this if we get more convincing use cases. The other option would be to use subtyping on dependency relations, like "root:decl", "root:interrog", ..., "ccomp:decl", "ccomp:interrog", but I am not sure we want to go down this route.
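A small sketch of what the subtyping option would look like to a consumer of the data: the relation is split at the colon into a universal part and a subtype, and the subtype is read as the clause type. The subtypes `decl` and `interrog` are the ones floated above and are not part of the guidelines; the token list is a made-up example ("He asked whether she left.") reduced to form and relation.

```python
# Relations whose subtype is interpreted as a clause type in this sketch.
CLAUSAL = {"root", "ccomp"}

# Made-up analysis of "He asked whether she left." as (form, deprel) pairs;
# the ":decl" and ":interrog" subtypes are hypothetical.
TOKENS = [
    ("He", "nsubj"),
    ("asked", "root:decl"),
    ("whether", "mark"),
    ("she", "nsubj"),
    ("left", "ccomp:interrog"),
    (".", "punct"),
]


def clause_types(tokens):
    """Map each clause head to (universal relation, clause-type subtype)."""
    result = {}
    for form, deprel in tokens:
        base, _, subtype = deprel.partition(":")
        if base in CLAUSAL and subtype:
            result[form] = (base, subtype)
    return result


print(clause_types(TOKENS))
```

Tools that only care about the universal relation can keep using the part before the colon, which is one argument for this route; the cost is a multiplication of relation labels.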

amir-zeldes · 2017-05-25T13:43:41Z

@martinpopel that argument makes sense, I agree. This may be off topic, but while we're on the issue of subtyping root/clause labels, this might be another argument for allowing it:

I very often have utterances that are just a vocative NP (e.g. an utterance such as "Mike!"). In these cases UD validation tells me the root label has to be root, but I always feel uneasy that this isn't vocative. Should there be root:vocative? I'm happy to open another issue if this sounds relevant.

nschneid · 2017-05-25T13:57:37Z

Another situation where I was tempted to subtype root: sentences truncated due to length limits of the medium, often indicated with an ellipsis: "I was asking him whether...". This situation could justify root:incomplete or similar, under the logic that no sentence type has been established because there is not a complete sentence.

It would also be worth considering what to do with fragmentary utterances in dialogue that are perfectly natural linguistically. ("Where were you?" "The store." / "At the store.")

nschneid · 2017-05-25T14:00:45Z

Related work, reposting from #178:

Friedrich (@annefried) et al., Situation entity types: automatic classification of clause-level aspect
Zeldes (@amir-zeldes) & Simonson (@thedansimonson), Different Flavors of GUM: Evaluating Genre and Sentence Type Effects on Multilayer Corpus Annotation Quality

dan-zeman · 2017-05-25T14:37:36Z

Hi all,

I think this is actually about ellipsis in UD and whether we want to annotate it in other cases than the gapping currently supported by the guidelines.

I also think this thread has moved far from the original topic of Question particles so if you wish to discuss it further, please create a new issue and copy the relevant points there.

amir-zeldes · 2017-05-25T15:33:18Z

OK, thanks, I've opened an issue for the vocative question here: #459

If there's interest in discussing sentence types, we can open another issue or talk about it here (it seems to have come up regarding question particles already in #178)

dan-zeman added the labels standard needed, UPOS (Universal part-of-speech tags: definitions and examples), and dependencies (universal labels) on Apr 24, 2018

dan-zeman added this to the v2.2 milestone Apr 24, 2018

dan-zeman modified the milestones: v2.2, v2.4 Nov 13, 2018

dan-zeman modified the milestones: v2.4, v2.5 Oct 6, 2019

dan-zeman modified the milestones: v2.5, v2.6 Nov 9, 2019

dan-zeman modified the milestones: v2.6, v2.7 May 14, 2020

dan-zeman modified the milestones: v2.7, v2.8 Nov 14, 2020

dan-zeman mentioned this issue Nov 14, 2020

Question particle and deprel #738

Open

dan-zeman modified the milestones: v2.8, v2.9 Jun 17, 2021

dan-zeman modified the milestones: v2.9, v2.11 Jun 13, 2022

dan-zeman modified the milestones: v2.11, v2.13 May 31, 2023

dan-zeman modified the milestones: v2.13, v2.14 Nov 15, 2023

dan-zeman modified the milestones: v2.14, v2.15 May 15, 2024

dan-zeman modified the milestones: v2.15, v2.16 Nov 16, 2024
