New enhanced dependencies - Propagation of nsubj for ccomp and advcl in pro-drop languages #1038

Open
MagaliDuran opened this issue Jun 10, 2024 · 3 comments

Comments

@MagaliDuran

The propagation of ccomp and advcl subjects is not part of the officially approved guidelines. However, in pro-drop languages such as Portuguese (which allow the subject to be omitted because person is marked on the verb form), the dependents of ccomp and advcl may have an elliptical subject that can be recovered from the head of the ccomp or advcl relation.
The annotation of these subjects in the enhanced dependencies is of great importance, especially in order not to interrupt chains of propagation of subjects as in the example:

Portuguese (1 explicit subject, 5 enhanced subjects):
Ele disse que aposentará em 2025 e que pretende viajar muito enquanto estiver saudável e tiver dinheiro.
=> “Ele” is the explicit nsubj of “disse” and the enhanced subject of “aposentará”, “pretende”, “viajar” (xsubj), “saudável” and “tiver”.

English (4 explicit subjects, 2 enhanced subjects):
He said that he will retire in 2025 and that he intends to travel a lot as long as he is healthy and has money.
=> “He” is the explicit nsubj of “said”, “retire”, “intends”, “healthy” and the enhanced subject of “travel” (xsubj) and “has”.

However, if we don't propagate the subject of ccomp, the Portuguese sentence won't have a subject available for the approved enhanced relations (conj dependents and xcomp subject). Furthermore, if we don't propagate the subject of advcl, the Portuguese sentence won't have six subjects, as the equivalent English sentence does.
This puts pro-drop languages at a disadvantage relative to languages that don't allow subject ellipsis.

Note: we have already outlined the rules for automatically propagating the subject of ccomp and advcl. Propagation will only occur if:

  • the head of ccomp or advcl has a subject (explicit or assigned by another enhanced dependency);
  • the dependent of ccomp or advcl does not have its own subject;
  • the dependent of ccomp or advcl is not an impersonal verb;
  • the dependent of ccomp or advcl has Person and Number features with values identical to those of its head (these features may be on an aux, aux:pass or cop dependent, depending on the construction).
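
As a rough illustration of the four conditions above, here is a minimal Python sketch. It is not the actual implementation: the `Token` class, the `impersonal` flag and the function names are hypothetical, and the example is a shortened variant of the Portuguese sentence ("Ele disse que aposentará enquanto estiver saudável") with hand-assigned heads and features. It only handles ccomp and advcl, so the approved conj and xcomp propagation would have to run alongside it.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    id: int
    form: str
    head: int                  # id of the basic head (0 = root)
    deprel: str
    feats: dict = field(default_factory=dict)
    impersonal: bool = False   # hypothetical flag, not a UD feature


FUNCTION_RELS = {"aux", "aux:pass", "cop"}


def agreement(tok, tokens):
    """Return (Person, Number) from an aux/aux:pass/cop child, else from tok."""
    carriers = [t for t in tokens
                if t.head == tok.id and t.deprel in FUNCTION_RELS]
    for cand in carriers + [tok]:
        if "Person" in cand.feats and "Number" in cand.feats:
            return cand.feats["Person"], cand.feats["Number"]
    return None


def propagate_subjects(tokens):
    """Map predicate id -> subject id for subjects added by propagation."""
    by_id = {t.id: t for t in tokens}
    # Start from the explicit subjects found in the basic tree.
    subj = {t.head: t.id for t in tokens if t.deprel.startswith("nsubj")}
    explicit = set(subj)
    changed = True
    while changed:  # repeat so that chains of propagation are not interrupted
        changed = False
        for t in tokens:
            if t.deprel not in ("ccomp", "advcl"):
                continue
            if t.head not in subj:             # condition 1: head has a subject
                continue
            if t.id in subj or t.impersonal:   # conditions 2 and 3
                continue
            agr = agreement(t, tokens)
            if agr is None or agr != agreement(by_id[t.head], tokens):
                continue                       # condition 4: Person/Number match
            subj[t.id] = subj[t.head]
            changed = True
    return {pred: s for pred, s in subj.items() if pred not in explicit}


third_sing = {"Person": "3", "Number": "Sing"}
sentence = [
    Token(1, "Ele", 2, "nsubj", dict(third_sing)),
    Token(2, "disse", 0, "root", dict(third_sing)),
    Token(3, "que", 4, "mark"),
    Token(4, "aposentará", 2, "ccomp", dict(third_sing)),
    Token(5, "enquanto", 7, "mark"),
    Token(6, "estiver", 7, "cop", dict(third_sing)),
    Token(7, "saudável", 4, "advcl", {"Number": "Sing"}),
]
enhanced = propagate_subjects(sentence)
print(enhanced)  # {4: 1, 7: 1}
```

Here "Ele" (token 1) reaches "saudável" (token 7) only because it was first propagated to "aposentará" (token 4), which is the chain effect described above. The Person and Number of "saudável" are read from its copula "estiver".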

Could you please consider approving these enhanced relations for pro-drop languages?

@nschneid
Contributor

Thanks for bringing this up. I think it's an important discussion to have.

It is also relevant to ask whether in non-pro-drop languages like English, arguments should be propagated to advcl adjuncts. I am using "__" to stand for the inferable subject ("he was" could be inserted there):

  • While __ reading, John fell asleep.
  • John fell asleep while __ reading.
  • While __ away for the winter, John visited 3 countries.
  • John visited 3 countries while __ away for the winter.

There are participial adjuncts that are not strictly required to correspond to the matrix subject though (sometimes these are prescriptively frowned upon as "dangling participles", but with enough context they are understandable):

  • From __ eating so fast, John was afraid that Mary would choke.

Note that the matrix subject may be implicit, as in imperatives, in which case there would be nothing to propagate:

  • While __ away for the winter, make sure to secure the house against bears.

@dan-zeman
Member

I definitely agree that the enhanced UD guidelines are worth revisiting, as they currently feel like an arbitrary selection of items, while other similar phenomena are left out. Some time ago I actually wrote a long proposal for enhanced enhanced dependencies, but we have not had time to put it on the agenda of the core group.

I think there are separate questions here. First, if we do argument propagation in control verb structures (xcomp) and relative clauses, should we do it also for other constructions, such as advcl and participles? I think we should, and this would not be specific to pro-drop languages. I am less sure about ccomp because there it is about coreference which may be clear from the semantic context but does not follow from syntax.

The other question is what to do if the shared argument does not have its node because it is only expressed by verbal morphology. Here I think we can expand the usage of empty/abstract nodes. Yes, UD has the slogan "do not annotate what is not there", especially referring to dropped pronominal subjects, but that slogan holds for the basic representation, not for the enhanced graph (otherwise we could not use abstract nodes at all, while we are currently using them for gapped predicates).

@Stormur
Contributor

Stormur commented Jun 17, 2024

Maybe it is coreference more generally that could be engineered in some way into the enhanced guidelines? With a possibility for "external coreference" for ccomp (maybe simply left unspecified), as opposed to "necessary coreference" pointing to another element (e.g. the finite predicate) in the sentence.

I would be quite opposed to extra empty nodes, even in these cases. I think they might not even be necessary, sometimes even confusing, and I envision two scenarios:

  1. the subject is expressed morphologically somewhere: the "coreference arrow" points to the head of the involved syntagm (mostly, I would say, a verb as head of a predicate, if the subject is implicit and/or a pronoun, or to the lexical subject)
  2. the subject is not expressed at any level (this happens systematically for languages like e.g. Mongolian): the "coreference arrow" points to the head of the "finite" predicate if any, but else here subject annotation becomes a matter external to morphosyntax, so there could also be no annotation

@dan-zeman dan-zeman modified the milestones: v2.15, v2.16 Nov 16, 2024