Question particles #458
Incidentally, WALS has a chapter on polar questions: http://wals.info/chapter/116 |
Thanks for reopening this discussion. I agree we need to do something, perhaps not only about questions but about sentence mood in general. |
I agree with @jnivre, I also think it's more of a sentence property. For example, English marks polar questions with auxiliary inversion, so you wouldn't want to put the polarity annotation on any particular word in that case. I've been using a sentence annotation for sentence mood (roughly following the SPAAC scheme) in an English corpus here for that reason (using Stanford Dependencies, not UD, but the principle is the same): https://github.com/amir-zeldes/gum/blob/master/dep/GUM_interview_brotherhood.conll10 If you're looking at an isolated particle, and not something like a morphological feature on predicates, then maybe it can be seen as syntactically just a particle, and the resulting semantics for the whole sentence are polar. |
Well, I am talking about isolating style languages with dedicated question
particles. There sure are a lot of universal features, so unless there are
good criteria for excluding such features from one's analysis I think
treating such phenomena for such languages as sentence level properties
would be missing a trick.
I think we may need to hear from people working on languages such as
Mandarin Chinese.
|
I didn't mean to imply that this was only a sentence level issue. For languages that use question particles, we definitely need appropriate features. But we should also think about how to capture sentence mood for languages that rely on other strategies such as subject-verb inversion. |
I'm part of a team working on Mandarin and Cantonese, but we haven't moved on to features yet, so perhaps another team(?) working on Chinese might be able to chime in. Unless I'm mistaken though, Japanese has question particles, too, and there seems to be an active Japanese team(?). On the subject of sentence-level particles and POS/relation, we have "sentence(/utterance)-final particles" in Mandarin and Cantonese, and tag them PART; Japanese v1 appears to use PART as well for its final particles, but with the aux relation. |
@torma I definitely don't mind having more morphological features. I think for question particles in many languages this could be done mostly automatically (based on word forms and POS tags, or for Mandarin just the relevant character), but if it's done for sentences in some languages, it may make sense to standardize the existence of a sentence level annotation across languages for comparability. But this may be part of a larger conversation regarding where to annotate what in multiple languages. |
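To make the "mostly automatic" idea above concrete, here is a minimal sketch that adds a feature to PART tokens whose word form is a known question particle. It is only an illustration: the feature name `PartType=Int` is a placeholder (this thread has not agreed on one), the particle inventory holds only Mandarin 吗/嗎, and the function name is made up for the example.

```python
# Hypothetical feature name; nothing in this thread settles on one.
QUESTION_FEATURE = "PartType=Int"
# Assumed particle inventory: Mandarin 吗 (simplified) and 嗎 (traditional).
QUESTION_FORMS = {"吗", "嗎"}


def add_question_feature(conllu_line: str) -> str:
    """Add QUESTION_FEATURE to a CoNLL-U token line (no trailing newline)
    if it is a PART token whose FORM is a known question particle.
    Any other line is returned unchanged."""
    if not conllu_line.strip() or conllu_line.startswith("#"):
        return conllu_line  # blank line or sentence-level comment
    cols = conllu_line.split("\t")
    if len(cols) != 10:
        return conllu_line  # not a well-formed token line
    form, upos, feats = cols[1], cols[3], cols[5]
    if upos != "PART" or form not in QUESTION_FORMS:
        return conllu_line
    existing = [] if feats == "_" else feats.split("|")
    if QUESTION_FEATURE not in existing:
        existing.append(QUESTION_FEATURE)
    # Keep the FEATS column sorted alphabetically, ignoring case.
    cols[5] = "|".join(sorted(existing, key=str.lower))
    return "\t".join(cols)
```

Matching on both UPOS and form avoids touching homographs that carry a different tag; a real pipeline would need a per-language inventory and probably manual review.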
My two cents:
|
Yes, we considered adding a notion of "syntactic" or "phrase-level" features for v2, but in the end we decided against it because (a) it would have added more complexity to CoNLL-U and (b) we didn't have enough convincing use cases for it. We may have to revisit this if we get more convincing use cases. The other option would be to use subtyping on dependency relations, like "root:decl", "root:interrog", ..., "ccomp:decl", "ccomp:interrog", but I am not sure we want to go down this route. |
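For comparison, the two encodings discussed here (a sentence-level annotation versus a subtyped root relation such as "root:interrog") can both be read with a few lines of code. This is a rough sketch under stated assumptions: the comment key `s_mood` is invented for the example and is not part of CoNLL-U or the UD guidelines, and the function name is likewise hypothetical.

```python
from typing import Iterable, Optional

MOOD_KEY = "s_mood"  # hypothetical comment key, not defined by UD


def sentence_mood(lines: Iterable[str]) -> Optional[str]:
    """Return the mood of one CoNLL-U sentence, or None if it is unmarked.

    Looks for a '# s_mood = ...' comment or a subtyped root relation
    like 'root:interrog', whichever comes first in the sentence.
    """
    for line in lines:
        if line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep and key.strip() == MOOD_KEY:
                return value.strip()
            continue
        cols = line.split("\t")
        if len(cols) == 10:
            base, sep, subtype = cols[7].partition(":")
            if base == "root" and sep:
                return subtype
    return None
```

Either encoding carries the same information for main clauses; the subtyping route would additionally extend to embedded clauses ("ccomp:interrog"), at the cost of multiplying relation labels.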
@martinpopel that argument makes sense, I agree. This may be off topic, but while we're on the issue of subtyping root/clause labels, this might be another argument for allowing it: I very often have utterances that are just a vocative NP (e.g. an utterance such as "Mike!"). In these cases UD validation tells me the root label has to be |
Another situation where I was tempted to subtype It would also be worth considering what to do with fragmentary utterances in dialogue that are perfectly natural linguistically. ("Where were you?" "The store." / "At the store.") |
Related work, reposting from #178: |
Hi all, I think this is actually about ellipsis in UD and whether we want to annotate it in other cases than the gapping currently supported by the guidelines. I also think this thread has moved far from the original topic of Question particles so if you wish to discuss it further, please create a new issue and copy the relevant points there. |
It seems that no documentation on question particles reached v2, contrary to the final comment closing #178. I'd like to reopen the question of annotating question particles. See also #454 on the documentation of questions.
I am working on Tibetan, which has dedicated particles that occur in both content and Y/N questions. In our existing scheme we tag them as cv.ques which means question converb. In UDv2 we can tag them as PART but with what feature? PronType=Int seems wrong since these are particles and not pronouns.
One possibility, at least for polar questions, would be to add a new value under the Polarity feature. For example, Polarity=Xor for "exclusive disjunction", which presents a pair of alternatives only one of which is correct. This same approach might not work for content question particles, which may differ from, or be the same as, the y/n particle in a given language.
What are others here doing?