Ancient Greek Relation Subtypes #958

Open
mr-martian opened this issue Jul 22, 2023 · 18 comments

@mr-martian
Contributor

Currently, Ancient Greek has the following subtypes enabled:

advcl:cmp, advmod:emph, aux:pass, csubj:pass, flat:foreign, flat:name, nsubj:outer, nsubj:pass, obl:agent, obl:arg

In PTNK, I have additionally made use of the following:

   2647 nmod:poss
    468 acl:relcl
     91 obl:tmod
     32 obl:npmod
     12 cc:preconj
      2 cop:outer
      2 advcl:relcl

Should I document these or should I reduce some or all of them to the non-subtyped relation?
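
For anyone wanting to reproduce a tally like the one above on their own treebank, here is a minimal sketch. It assumes plain CoNLL-U input with ten tab-separated columns and DEPREL in the eighth; the function name `count_subtyped_relations` and the toy sentence are hypothetical and not taken from PTNK.

```python
from collections import Counter

def count_subtyped_relations(conllu_text):
    """Return a Counter of DEPREL values that contain a subtype (a ':')."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # sentence boundary or comment line
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        if ":" in cols[7]:
            counts[cols[7]] += 1
    return counts

# Hypothetical toy sentence; forms and lemmas are placeholders.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tw1\tw1\tVERB\t_\t_\t0\troot\t_\t_",
    "2\tw2\tw2\tNOUN\t_\tCase=Acc\t1\tobj\t_\t_",
    "3\tw3\tw3\tNOUN\t_\tCase=Gen\t2\tnmod:poss\t_\t_",
    "4\tw4\tw4\tNOUN\t_\tCase=Gen\t2\tnmod:poss\t_\t_",
    "5\tw5\tw5\tNOUN\t_\tCase=Acc\t1\tobl:tmod\t_\t_",
])

# Print in the same "count relation" layout as the listing above.
for deprel, n in count_subtyped_relations(SAMPLE).most_common():
    print(f"{n:7d} {deprel}")
```

The sketch only looks at the basic DEPREL column; enhanced dependencies in DEPS are ignored.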

@dan-zeman
Member

@daghaug @gcelano

@Stormur
Contributor

Stormur commented Jul 24, 2023

If I can add my 2 cents, however coming from my experience with Latin (if harmonisation with it has some importance), I could comment:

  • nmod:poss: we tried to apply it, but it is extremely difficult to do so and one wonders about its significance (so, it is not used currently in Latin)
  • acl:relcl: this is de facto mandatory
  • obl:tmod: I think this is useful and it can be defined reasonably (however, one has to think about its possible conflict with obl:arg). We are now using it regularly.
  • obl:npmod: we are not using it and I sincerely do not think it makes sense, especially not in a language with case inflection. Also, from the little I glimpse, sometimes it seems to be rather an advmod/advcl.
  • cc:preconj: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens. Latin is not using it (for the same reason that it is not using features such as AdpType).
  • advcl:relcl: this absolutely needs a documentation defining it, because as of now we have two different relations using this label. In Latin it is for free relatives; in English it is for "sentence relatives", which Latin currently treats by means of advcl:pred.

@nschneid
Contributor

  • advcl:relcl: this absolutely needs a documentation defining it, because as of now we have two different relations using this label. In Latin it is for free relatives; in English it is for "sentence relatives", which Latin currently treats by means of advcl:pred.

English also uses advcl:relcl for free relatives where the WH word is an adverb, e.g. "I looked where you were sitting": advcl:relcl(where, sitting).

@Stormur
Contributor

Stormur commented Jul 24, 2023

So there are two concurring uses of advcl:relcl? And what for "non-adverbial" relative words?

@mr-martian
Contributor Author

So, how I'm currently using them:

  • nmod:poss this is essentially "is the dependent Case=Gen or Poss=Yes?" which, yeah, not that helpful
  • obl:npmod my starting point was word aligning with the Hebrew treebank (which uses this for the infinitive absolute) and projecting, so this relation is present in places where the Septuagint copies the Hebrew construction of reduplicating the verb
  • cc:preconj this is currently used for τε, but not for sentence-initial conjunctions, so I agree it's probably not useful
  • advcl:relcl I copied the English usage on this one
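
The nmod:poss criterion described in the first bullet can be written down as a small predicate. This is only a sketch of that stated rule of thumb, assuming FEATS strings in the usual `Name=Value|Name=Value` form; the helper names `parse_feats` and `is_poss_candidate` are hypothetical.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict ('_' means no features)."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def is_poss_candidate(deprel, feats):
    """True if the token is an nmod whose FEATS have Case=Gen or Poss=Yes."""
    if deprel.split(":", 1)[0] != "nmod":
        return False
    f = parse_feats(feats)
    return f.get("Case") == "Gen" or f.get("Poss") == "Yes"

print(is_poss_candidate("nmod", "Case=Gen|Gender=Masc|Number=Sing"))  # True
print(is_poss_candidate("nmod", "Case=Dat|Number=Sing"))              # False
print(is_poss_candidate("obl", "Case=Gen|Number=Sing"))               # False
```

Because the predicate depends only on the relation and the morphological features, anything it finds is already recoverable from the unsubtyped annotation, which is the redundancy being discussed here.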

@nschneid
Contributor

So there are two concurring uses of advcl:relcl? And what for "non-adverbial" relative words?

I've updated the docs to explain this more clearly: https://github.com/UniversalDependencies/docs/blob/pages-source/_en/dep/advcl-relcl.md

(The page on the site isn't updating for some reason)

@amir-zeldes
Contributor

obl:npmod: we are not using it and I sincerely do not think it makes sense

It is an odd label linguistically, to be sure, but if you want to use obl:tmod, then I think you will probably need obl:npmod as well. The tmod label is used for temporal noun phrases used adverbially, as in 1. When a similar phrase describes a non-temporal quantity, you need some kind of label, and that's what obl:npmod does:

  1. Let's meet next week/obl:tmod
  2. Let's meet the way/obl:npmod we planned originally

It has been pointed out that obl:tmod isn't really a syntactic category but more of a semantic subtype, so in a way obl:npmod subsumes it and I suppose it would basically cover accusativus graecus.

cc:preconj: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens

I think this is not 100% true, but realistically you are right that it is mostly predictable. Hypothetically you could get something like "I arrived and/cc then both/cc:preconj danced and/cc sang", where it's not totally obvious what would be cc:preconj. That said, even when it is trivial, it's sometimes nice to be able to easily find all cases that have a cc:preconj, and it's easy enough to do, so why not?

nmod:poss this is essentially "is the dependent Case=Gen or Poss=Yes?" which, yeah, not that helpful

That may be true, but it might still be nice for comparability to other languages which use nmod:poss.
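
To make the predictability question about cc:preconj concrete, here is a rough sketch of a structural heuristic. Everything in it is an assumption for illustration: the function `classify_cc`, the tuple layout `(id, form, head, deprel)`, and in particular the tree given for the hypothetical sentence, which is one plausible analysis and not taken from any treebank. The idea is that an ordinary cc attaches to a non-first conjunct, while a preconjunction attaches to a first conjunct, i.e. a head that itself has conj dependents.

```python
def base(deprel):
    """Strip any subtype: 'cc:preconj' -> 'cc'."""
    return deprel.split(":", 1)[0]

def classify_cc(tokens):
    """Guess, from structure alone, which cc dependents are preconjunctions.

    tokens: list of (id, form, head, deprel) tuples for one sentence.
    Returns {id: 'cc' | 'preconj' | 'ambiguous'} for every cc dependent.
    """
    deprel_of = {tid: rel for tid, _, _, rel in tokens}
    first_conjuncts = {head for _, _, head, rel in tokens if base(rel) == "conj"}
    guesses = {}
    for tid, _, head, rel in tokens:
        if base(rel) != "cc":
            continue
        if head not in first_conjuncts:
            guesses[tid] = "cc"         # head opens no coordination
        elif base(deprel_of.get(head, "")) == "conj":
            guesses[tid] = "ambiguous"  # head both closes one coordination and opens another
        else:
            guesses[tid] = "preconj"
    return guesses

# "both danced and sang" (assumed tree)
simple = [(1, "both", 2, "cc:preconj"), (2, "danced", 0, "root"),
          (3, "and", 4, "cc"), (4, "sang", 2, "conj")]

# "I arrived and then both danced and sang" (assumed tree)
nested = [(1, "I", 2, "nsubj"), (2, "arrived", 0, "root"),
          (3, "and", 6, "cc"), (4, "then", 6, "advmod"),
          (5, "both", 6, "cc:preconj"), (6, "danced", 2, "conj"),
          (7, "and", 8, "cc"), (8, "sang", 6, "conj")]

print(classify_cc(simple))
print(classify_cc(nested))
```

Under these assumptions the simple case is resolved from structure alone, whereas in the nested case both "and" and "both" hang off "danced" and the heuristic cannot tell them apart without the subtype.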

@Stormur
Contributor

Stormur commented Jul 24, 2023

So, how I'm currently using them:

  • nmod:poss this is essentially "is the dependent Case=Gen or Poss=Yes?" which, yeah, not that helpful

I thought of it in part as the difference between subjective/objective genitive (e.g., for the wider public, amor matris 'the love for the mother vs. the love from the mother', both expressed by the genitive), but then I am not sure we can label the subjective one as "possessive"; probably this pertains at some level of reference annotation? Given an nmod relation, the feature Poss=Yes should not change this picture.


  • obl:npmod my starting point was word aligning with the Hebrew treebank (which uses this for the infinitive absolute) and projecting, so this relation is present in places where the Septuagint copies the Hebrew construction of reduplicating the verb

But then, is it still related to Latin? 🤔


obl:npmod: we are not using it and I sincerely do not think it makes sense

It is an odd label linguistically, to be sure, but if you want to use obl:tmod, then I think you will probably need obl:npmod as well. The tmod label is used for temporal noun phrases used adverbially, as in 1. When a similar phrase describes a non-temporal quantity, you need some kind of label, and that's what obl:npmod does:

1. Let's meet next week/obl:tmod

2. Let's meet the way/obl:npmod we planned originally

It has been pointed out that obl:tmod isn't really a syntactic category but more of a semantic subtype, so in a way obl:npmod subsumes it and I suppose it would basically cover accusativus graecus.

It is, as many others, and for this reason it appears only as a subtype. Many subtypes (most?) are semantic, even relcl is in some sense (there just happen to be a reference to something in the matrix clause).

We are using it "transversally", so it also appears for advmod.

I do not think that tmod and npmod are related exactly for this reason: with regard to "adverbiality", this is already subsumed under UD's use of the oblique obl relation; so tmod is purely semantic, or let's say lexical, in that it depends either on the word (e.g. semper 'always') or on the predicate (e.g. vivo 'to live' with some argument denoting an event). I am not sure why it should cover accusativus graecus if this is already covered by obl (in its current interpretation) and if the purely syntactical fact of not being introduced by an element like an adposition is self-evident: what I mean is that a simple treebank query directly retrieves such cases.

In the example

2. Let's meet the way/obl:npmod we planned originally

I do not see what it is adding. It is already obl, and the fact it appears as such without a preposition is probably lexically determined, so maybe it should be annotated at a token level. If np stays for noun phrase, it is stating the obvious, as an oblique is already intended to be one.
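
As an illustration of the kind of treebank query meant above, the following sketch lists obl dependents that have no case child. It is only a sketch under stated assumptions: the function name `bare_obliques`, the tuple layout `(id, form, head, deprel)`, and the toy tree are hypothetical, and a real query would run over parsed CoNLL-U rather than hand-written tuples.

```python
def base(deprel):
    """Strip any subtype: 'obl:tmod' -> 'obl'."""
    return deprel.split(":", 1)[0]

def bare_obliques(tokens):
    """Return ids of obl dependents that have no `case` child.

    tokens: list of (id, form, head, deprel) tuples for one sentence.
    """
    heads_with_case = {head for _, _, head, rel in tokens if base(rel) == "case"}
    return [tid for tid, _, _, rel in tokens
            if base(rel) == "obl" and tid not in heads_with_case]

# Toy tree for "meet next week in the morning" (assumed analysis).
toy = [(1, "meet", 0, "root"),
       (2, "next", 3, "amod"),
       (3, "week", 1, "obl:tmod"),
       (4, "in", 6, "case"),
       (5, "the", 6, "det"),
       (6, "morning", 1, "obl")]

print(bare_obliques(toy))  # [3]
```

The check uses nothing but the tree shape, so it finds prepositionless obliques whether or not they carry a subtype.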


cc:preconj: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens

I think this is not 100% true, but realistically you are right that it is mostly predictable. Hypothetically you could get something like "I arrived and/cc then both/cc:preconj danced and/cc sang", where it's not totally obvious what would be cc:preconj. That said, even when it is trivial, it's sometimes nice to be able to easily find all cases that have a cc:preconj, and it's easy enough to do, so why not?

Hm... this might be one further reason to tinker with UD's annotation of co-ordinations 🤔 I admit this still does not convince me totally about the usefulness of this subrelation instead of moot redundancy for a very functional relation...


nmod:poss this is essentially "is the dependent Case=Gen or Poss=Yes?" which, yeah, not that helpful

That may be true, but it might still be nice for comparability to other languages which use nmod:poss.

True, but then we need a clear definition which as of now does not seem to be there. There is probably also an overlap with det... or also just with the fact of a PronType=Prs depending as nmod?

@Stormur
Contributor

Stormur commented Jul 24, 2023

So there are two concurring uses of advcl:relcl? And what for "non-adverbial" relative words?

I've updated the docs to explain this more clearly: https://github.com/UniversalDependencies/docs/blob/pages-source/_en/dep/advcl-relcl.md

(The page on the site isn't updating for some reason)

I am now wondering if these are not or are indeed two different phenomena. I am sincerely confused.

... but is the subclause in I looked where you were sitting not rather an object of the main verb? I would instead think of something like Go back whence you came (correct?).

@nschneid
Contributor

An adverb can't be a direct object in UD, right? I think an obj has to be a nominal.

(I agree the location phrase is a complement/argument of "look" here, but that's not what UD cares about.)

@amir-zeldes
Contributor

It is already obl

Yes, obl:npmod and obl:tmod are subtypes of obl, so that part is natural. In many datasets, including English but also others such as Hebrew or Coptic, the plain obl is used specifically for prepositional phrases. I suspect it was originally a conversion remnant from Stanford Dependencies, which distinguished prep from npadvmod and tmod. These became the prototypes for nmod/obl, obl:npmod and obl:tmod.

Of course, the subtypes are totally optional, but that is the background for why all adverbial NPs (usually with some kind of spatiotemporal or extent semantics) have a subtype in languages that use them. So if you are using :tmod, I would also expect to see :npmod for non-temporal phrases. TBH if I were designing UD from scratch I would have just called such NPs advmod too, since that is essentially what accusativus graecus is, but advmod is prohibited on things not tagged ADV, so we have to use some kind of obl relation - the subtype is just to keep them separate from PP modifiers.

@Stormur
Contributor

Stormur commented Jul 25, 2023

An adverb can't be a direct object in UD, right? I think an obj has to be a nominal.

(I agree the location phrase is a complement/argument of "look" here, but that's not what UD cares about.)

I was perhaps confused by the fact that look is intransitive in English. But I missed the more important fact that where is "promoted" in the matrix clause. But if this is the case, I do not understand why, keeping advmod(look,where), you were sitting is not just acl:relcl as the "expansion" of where.

Probably I see where this is coming from: an ADV entails an advcl (propositional) and not an acl. But I do not know if this is not accepted by UD/the validator (and actually, this is one further case showing that where is not an "adverb", but a kind of pro-form). Still, another annotation strategy solving it would be to have where you were sitting as a whole as advcl:relcl of look, and then this use of advcl:relcl would be the same as for Latin. But I know the treebanks treat "free relatives" differently.


So if you are using :tmod, I would also expect to see :npmod for non-temporal phrases

Sorry if I am firm about this, but no. There is no logical relation. This all comes from some language-specific logics projected universally. Especially for Latin and Ancient Greek (and many other languages), there is nothing special about prepositionless arguments, as prepositions are just in alternation with Case.

I understand where this comes from, but I see (universally) more sense in a (hypothetic) semantic obl:manner for the way rather than a mechanical obl:npmod.

As for accusativus graecus, one might still envision an adv* annotation, but with advcl, maybe as advcl:pred (by the way, I personally think it is still left to be convincingly proven that accusativus graecus is really an adverbial rather than a second object... but this is another story). Anyway, the relation obl already means (or at least covers) something like "nominal adverbial": then the more meaningful subtype to be used here is arg, to keep track of a parallel complement/adjunct distinction, if this is what an advmod label would imply.

@amir-zeldes
Contributor

more sense in a (hypothetic) semantic obl:manner for the way rather than a mechanical obl:npmod.

Sure, that would be perfectly logical and seems fine to me. npmod is just the underspecified one (not saying if it's manner, or extent or something else). I don't much like the label either (no NPs in dependencies), it's just a legacy thing from SD.

As for accusativus graecus, one might still envision an adv* annotation, but with advcl, maybe as advcl:pred

Not if it's not a clause - then I would have expected (and wanted) advmod, but that is forbidden for nouns, and I lost that battle long ago ;)

there is nothing special about prepositionless arguments ... Anyway, the relation obl already means (or at least covers) something like "nominal adverbial"

Yes, that's all correct and UD takes that position explicitly in having obl be the main label for cases with and without prepositions. It's just that in some languages maintainers like to make that distinction, so they use subtypes - these are in no way mandatory. I think if you are using "tmod" also for adverbs and phrases with prepositions, and not using a subtype for other domains, it just ends up being different from how other languages use that subtype. But maybe that's OK - I was just pointing it out, since that subtype comes from UD English and is used differently there.

@Stormur
Contributor

Stormur commented Jul 25, 2023

But maybe that's OK - I was just pointing it out, since that subtype comes from UD English and is used differently there.

Hm, I have to look into it. But reading from the scant documentation, we seem to be in line. I do not see differences... it is simply independent from adpositions, even in English (judging from the examples in the documentation). tmod itself as a label might come from UD English, but "time complements" are universal...

We also use lmod. I fear other domains would be less defined and more problematic than these ones. Besides, I have not noticed attested relation subtypes for them, apart from subsubtypes of time and place.


As for accusativus graecus, one might still envision an adv* annotation, but with advcl, maybe as advcl:pred

Not if it's not a clause - then I would have expected (and wanted) advmod, but that is forbidden for nouns, and I lost that battle long ago ;)

It might be a nominal clause. But I agree that it would be a lectio difficillima (a 'very difficult interpretation'), not even truly justified. So, currently, obl still is the best (traditional) option.

@amir-zeldes
Contributor

it is simply independent from adpositions, even in English

I think that might be ambiguous - just to clarify, in UD English and related datasets following its practices, :tmod only occurs when there is no preposition

So, currently, obl still is the best (traditional) option.

Agreed!

@mr-martian
Contributor Author

I don't think this is actually resolved. I've been stripping subtypes from my treebank in the process of pushing to the UD repo, but I'd still like to actually include them.
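
For reference, the stripping step mentioned here could look roughly like the sketch below. It is not the script actually used for the treebank: the function name `strip_subtypes` is hypothetical, the input is assumed to be plain CoNLL-U, and the set of kept labels is simply the list of currently enabled Ancient Greek subtypes from the top of this issue.

```python
# Subtypes listed at the top of this issue as currently enabled for Ancient Greek.
ENABLED = frozenset({
    "advcl:cmp", "advmod:emph", "aux:pass", "csubj:pass", "flat:foreign",
    "flat:name", "nsubj:outer", "nsubj:pass", "obl:agent", "obl:arg",
})

def strip_subtypes(conllu_text, keep=ENABLED):
    """Reduce every DEPREL not in `keep` to its universal base relation."""
    out = []
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        if len(cols) == 10 and not line.startswith("#"):
            deprel = cols[7]
            if ":" in deprel and deprel not in keep:
                cols[7] = deprel.split(":", 1)[0]  # e.g. nmod:poss -> nmod
            line = "\t".join(cols)
        out.append(line)
    return "\n".join(out)

# Hypothetical token lines; forms and lemmas are placeholders.
before = "\n".join([
    "# sent_id = toy-2",
    "1\tw1\tw1\tNOUN\t_\tCase=Nom\t2\tnsubj:pass\t_\t_",
    "2\tw2\tw2\tVERB\t_\tVoice=Pass\t0\troot\t_\t_",
    "3\tw3\tw3\tNOUN\t_\tCase=Acc\t2\tobl:tmod\t_\t_",
    "4\tw4\tw4\tNOUN\t_\tCase=Gen\t3\tnmod:poss\t_\t_",
])

print(strip_subtypes(before))
```

Only the basic DEPREL column is rewritten; if the DEPS column carries enhanced relations with subtypes, it would need separate handling.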

mr-martian reopened this Nov 15, 2023
dan-zeman modified the milestones: v2.13, v2.14 Nov 15, 2023

@dan-zeman
Member

OK, but then it needs a new milestone. v2.13 is over.

dan-zeman modified the milestones: v2.14, v2.15 May 15, 2024

@dan-zeman
Member

I don't think this is actually resolved. I've been stripping subtypes from my treebank in the process of pushing to the UD repo, but I'd still like to actually include them.

@mr-martian If the maintainers of the other Ancient Greek treebanks did not say anything against your subtypes during the 16 months, then I think we can conclude that they do not object :-) The problem may be updating the other treebanks to have the subtypes too. Maybe not for Perseus; but PROIEL can be overwritten in the future with a new conversion from the upstream dataset. But I think you can document the subtypes so that you can use them. (And then close this issue.)

dan-zeman modified the milestones: v2.15, v2.16 Nov 16, 2024