Middle Persian: copula dependents disallowed #1054

Closed
bulbulistan opened this issue Sep 5, 2024 · 2 comments

Comments

@bulbulistan
Contributor

In UD, a main verb in a non-finite form (participle etc.) combined with an analytic form of the auxiliary is analysed as a flat structure, e.g.:

| Form | Word | ID | Head |
| --- | --- | --- | --- |
| participle VERB | grift | 1 | 0 |
| participle AUX | ēstād | 2 | 1 |
| person marker of AUX, i.e. COP | hēnd | 3 | 1 |

grift ēstād hēnd "had been[3pl] taken".

From a Middle Persian perspective, it is clear that the copula is the person marker of the auxiliary. So the better annotation, which we have applied so far, is structured, cf.:

| Form | Word | ID | Head |
| --- | --- | --- | --- |
| participle VERB | grift | 1 | 0 |
| participle AUX | ēstād | 2 | 1 |
| person marker of AUX, i.e. COP | hēnd | 3 | 2 |
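
To make the difference between the two analyses concrete, here is a minimal Python sketch that turns the structured analysis into the flat one by re-attaching any child of an auxiliary or copula to that token's own head. The `Token` class, the `flatten` helper, and the relation labels `aux` and `cop` are illustrative assumptions; the issue only gives word forms, IDs and heads.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int
    form: str
    head: int
    deprel: str  # assumed label; not stated in the issue


# Relations treated as "functional" in this sketch (an assumption).
FUNCTION_DEPRELS = {"aux", "cop"}


def flatten(tokens):
    """Return copies of the tokens in which no aux/cop token has children."""
    by_id = {t.id: Token(t.id, t.form, t.head, t.deprel) for t in tokens}
    changed = True
    while changed:
        changed = False
        for tok in by_id.values():
            parent = by_id.get(tok.head)  # None when the head is 0 (root)
            if parent is not None and parent.deprel in FUNCTION_DEPRELS:
                # Lift the token to its parent's head, making it a sister.
                tok.head = parent.head
                changed = True
    return [by_id[i] for i in sorted(by_id)]


# The structured analysis from the second table: hēnd depends on ēstād.
structured = [
    Token(1, "grift", 0, "root"),
    Token(2, "ēstād", 1, "aux"),
    Token(3, "hēnd", 2, "cop"),
]

flat = flatten(structured)
print([(t.form, t.head) for t in flat])
```

On this input the only change is that hēnd moves from head 2 to head 1, which reproduces the first table. The input list itself is left untouched because the function works on copies.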

For now, this results in a consistency problem with UD. We would like to know whether we can keep our annotation or not.
Thank you!

@amir-zeldes
Contributor

Generally speaking, functional dependents like auxiliaries should not have children in UD, so copulas and auxiliaries are attached as sisters to the predicate. This holds in a wide range of UD languages where it is morphologically clear that this is not the real constituent structure, but the same can be said about the fact that the participle VERB is the root, rather than a dependent of a finite auxiliary. So in a sense, in for a penny, in for a pound! 😅
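
As a rough illustration of that rule, the following Python sketch lists every token attached to an auxiliary or copula. It is a simplified check and not the logic of the official UD validator; the function name, the tuple layout `(id, form, head, deprel)` and the relation labels `aux` and `cop` for the Middle Persian tokens are assumptions made for the example.

```python
# Relations treated as functional in this simplified check (an assumption).
FUNCTION_DEPRELS = {"aux", "cop"}


def function_words_with_children(rows):
    """Return (function_word_id, child_id) pairs for rows of
    (id, form, head, deprel) tuples."""
    deprel_of = {tid: deprel for tid, _form, _head, deprel in rows}
    offenders = []
    for tid, _form, head, _deprel in rows:
        # A token is flagged when its head carries a functional relation.
        if deprel_of.get(head) in FUNCTION_DEPRELS:
            offenders.append((head, tid))
    return offenders


structured_rows = [
    (1, "grift", 0, "root"),
    (2, "ēstād", 1, "aux"),
    (3, "hēnd", 2, "cop"),  # attached to the auxiliary
]
flat_rows = [
    (1, "grift", 0, "root"),
    (2, "ēstād", 1, "aux"),
    (3, "hēnd", 1, "cop"),  # attached to the predicate
]

print(function_words_with_children(structured_rows))
print(function_words_with_children(flat_rows))
```

The structured analysis yields one flagged pair (token 3 under token 2), while the flat analysis yields none.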

For comparison, here is how UD_English analyzes perfect progressives ("have been going"). It's clear that there is no such thing as "have going", and that the structure is actually "have been" + "been going", but as part of UD's commitment to lexico-centrism, which promotes cross-linguistic comparability by making lexical predicates the heads, we get:

...
11	and	and	CCONJ	CC	_	15	cc	15:cc	Discourse=context-background:24->23:0:ref-dem-212-214,227-228;elaboration-additional:24->23:0:0:orp-and-221
12	I	I	PRON	PRP	Case=Nom|Number=Sing|Person=1|PronType=Prs	15	nsubj	15:nsubj	Entity=(3-person-giv:act-cf1*-1-ana)
13	have	have	AUX	VBP	Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin	15	aux	15:aux	_
14	been	be	AUX	VBN	Tense=Past|VerbForm=Part	15	aux	15:aux	_
15	working	work	VERB	VBG	Tense=Pres|VerbForm=Part	9	conj	9:conj:and	MSeg=work-ing
16	on	on	ADP	IN	_	18	case	18:case	_
17	this	this	DET	DT	Number=Sing|PronType=Dem	18	det	18:det	Entity=(5-abstract-giv:act-cf2-2-coref
18	line	line	NOUN	NN	Number=Sing	15	obl	15:obl:on	Entity=5)
19	since	since	ADP	IN	_	20	case	20:case	_
20	2019	2019	NUM	CD	NumForm=Digit|NumType=Card	15	obl	15:obl:since	Entity=(18-time-new-cf3-1-sgl)|SpaceAfter=No|XML=<date when:::"2019"></date>
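
For readers less used to the format: in CoNLL-U the tab-separated columns are ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS and MISC, so the head is the seventh column. A small Python sketch (the `heads` helper is hypothetical, and only three of the lines above are reproduced) that pulls out the attachment of each token:

```python
def heads(conllu_text):
    """Map token ID -> (form, head, deprel) for plain integer-ID tokens."""
    result = {}
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        result[int(cols[0])] = (cols[1], int(cols[6]), cols[7])
    return result


# Tokens 13-15 copied from the excerpt above.
sample = "\n".join([
    "13\thave\thave\tAUX\tVBP\tMood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin\t15\taux\t15:aux\t_",
    "14\tbeen\tbe\tAUX\tVBN\tTense=Past|VerbForm=Part\t15\taux\t15:aux\t_",
    "15\tworking\twork\tVERB\tVBG\tTense=Pres|VerbForm=Part\t9\tconj\t9:conj:and\tMSeg=work-ing",
])

for token_id, (form, head, deprel) in heads(sample).items():
    print(token_id, form, head, deprel)
```

Both "have" and "been" come out with head 15 ("working") and relation `aux`, i.e. they are sisters under the lexical verb rather than a chain.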

This may be unsatisfactory if you look just at English, but it makes it much easier to have a uniform scheme and comparable argument structure across a wide range of languages. It sounds like the situation for Middle Persian is the same - it's odd if you look just at that language, but I think it makes it easier to align with other Indo-Iranian languages, or totally unrelated ones.

@dan-zeman added this to the v2.15 milestone Sep 6, 2024
@bulbulistan
Contributor Author

Thank you @amir-zeldes! There are good reasons to consider Middle Persian different, at least in some aspects, but we will stick to the general rules.
