layout: base
title: Old Guaraní UD
udver: 2

UD for Old Guaraní

Tokenization and Word Segmentation

  • Words are delimited by whitespace characters.
  • According to typographical rules, many punctuation marks are attached to a neighboring word. They are treated as separate tokens (words).
  • There are no adjectives in Old Guaraní. Modification is made by composition, juxtaposing lexical roots, so when one lexical root modifies another, a new word results, as in kuɲãporaŋ(a) 'beautiful woman' (kuɲã 'woman' + poraŋ-(a) 'beauty'). Such words are sometimes treated as multiword tokens and sometimes as single words.
  • Some compound words from Portuguese are written as one word, e.g. santaCruz 'holy cross', espíritoSanto 'holy spirit'.

Morphology

Tags

  • Tupinambá uses 16 of the 17 universal POS categories. ADJ is not used since there is no separate class of adjectives. Stative verbs and possessed nouns behave alike, so they cannot be distinguished morphologically (I am ugly / I have ugliness; ugly boy / boy with ugliness / (the) boy has ugliness). The fundamental distinction in Tupinambá is that between predicates and arguments (potentially referring expressions), reflected in the NOUN or VERB tags. Lexical roots in Tupinambá are predicates, which require function-indicating morphology in order to function as arguments (Croft 2021).

Mapping UPOS to XPOS Old Guaraní

UPOS XPOS
ADV adv
INTJ intj
NOUN n
PROPN ppn
VERB v, vi, vt, vd
ADP pp
AUX aux
CCONJ cc
DET det
NUM num
PART pcl
PRON pro
SCONJ sc
PUNCT punct
SYM sym
X x
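
The mapping above can also be written as a lookup table, for example to check that the XPOS tag of a token agrees with its UPOS tag. The following is a minimal sketch in Python. The names `UPOS_TO_XPOS`, `XPOS_TO_UPOS` and `xpos_matches_upos` are hypothetical and are not part of any UD tool; only the tag values come from the table.

```python
# UPOS -> XPOS tags, copied from the mapping table above.
# The variable and function names are illustrative assumptions.
UPOS_TO_XPOS = {
    "ADV": ["adv"],
    "INTJ": ["intj"],
    "NOUN": ["n"],
    "PROPN": ["ppn"],
    "VERB": ["v", "vi", "vt", "vd"],
    "ADP": ["pp"],
    "AUX": ["aux"],
    "CCONJ": ["cc"],
    "DET": ["det"],
    "NUM": ["num"],
    "PART": ["pcl"],
    "PRON": ["pro"],
    "SCONJ": ["sc"],
    "PUNCT": ["punct"],
    "SYM": ["sym"],
    "X": ["x"],
}

# Inverse lookup: in the table, each XPOS tag is listed under one UPOS tag.
XPOS_TO_UPOS = {
    xpos: upos for upos, tags in UPOS_TO_XPOS.items() for xpos in tags
}


def xpos_matches_upos(upos: str, xpos: str) -> bool:
    """Return True if the XPOS tag is listed for the given UPOS tag."""
    return xpos in UPOS_TO_XPOS.get(upos, [])
```

Since ADJ is not used, it has no entry, so `xpos_matches_upos("ADJ", ...)` is always false in this sketch.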

Nominal Features

  • Old Guaraní nouns are not marked for gender. Number is optionally marked by the lexical root etá '(be) many'.

kunumĩ 'boy / boys'
kunumĩ-etá 'boys'

  • Person indexes and pronouns are given below:
Person Possessor indexes Argument indexes Portmanteau indexes Switch-reference indexes Independent pronouns
1.SG ʃe= a- oro wi- iʃé
2.SG ne= ere- e- ené
3 o- o- aʔe (demonstrative)
1.PL.IN jané= ja- jeré- jané
1.PL.EX oré= oro- opo oro- oré
2.PL pe= pe- peje´ pe ʔẽ

Possessor indexes, as the name suggests, only index possessors. They are not marked with PronType, but they are marked as Poss=Yes. Argument indexes are used with verbal predicates, as are the portmanteau indexes (see below). Switch-reference indexes are used in dependent clauses whose subjects are coreferential with the subjects of the main clauses. Person indexes distinguish Number (Singular or Plural). They also distinguish Clusivity in the 1st person plural.

  • Nouns can take the following Cases: Tra, Loc, Per, Dat.
Case Ending Example
Translative -(r)amo t-uβ-amo 'as his father'
Locative -pe t-atá-pe 'in the fire'
Perlative -βo kaʔa-βo 'through the forest'
Dative -βe / -βo (with pronouns) iʃé-βo, iʃéβe 'to me'
  • The relational markers Rel, which indicate contiguity or non-contiguity between a head and its dependent, take respectively the following features: Rel=Cont and Rel=NCont. A third type of relational, Rel=HUM, indicates that an obligatorily possessed noun is given without reference to a possessor, at the same time marking the possessor as human. The reflexive/coreferential morpheme o-, which is often referred to as 'relational3', is associated with the feature-value Reflex=Yes. Another relational, Rel=Hum, indicates that the possessor is human. These are shown below:
Rel Form(s) Example
Cont ∅ ~ r ʃe-∅-sɨ 'I have a mother'
NCont i ~ s ~ t i-sɨ-∅ 'his/her/its/their mother'
Abs t ~ m t-oʔo 'his/her/their (human) flesh'
Corf o o-sɨ-∅ 'his/her/its/their own mother'
  • Nouns may also be reduplicated in both ways, denoting plurality, collectivity, superlativity, and other semantic nuances. Numerals may also be reduplicated in order to indicate distribution.
  • Nouns are also marked for tense.
  • Tupinambá is an omnipredicative language, so its lexical roots are existential predicates. In order to function as arguments, they require the referential marker (a ~ ∅), which is marked as Case=Ref even though its function is nothing like that of nominal cases.

Verbal Features

  • The lexical root in the gerund (VerbForm=Ger) is marked as VERB even when combining with a relational.
  • Verbs are marked for aspect: Compl (completive), Iter (iterative), Suc (successive).
  • Verbs are also marked for mood: Perm (permissive) and Imp (imperative).
  • Lexical roots may be reduplicated in two different ways: monosyllabic reduplication (Red=Mo) and disyllabic reduplication (Red=Di). They modify the aspect of the verb in different ways: disyllabic reduplication indicates the repetition or duration of an action; monosyllabic reduplication indicates iteration of the action.

Nominalizers

Tupinambá has many nominalizers with different functions. All but

Nominalizer function Example
(e)mi- passive deverbalizer t-emi-juka 'what is killed'.
-βaʔe relativizer o-juká-βaʔe 'the one who kills'.
-pɨr passive deverbalizer i-juká-pɨr 'the one who must be killed'
-sar agentive nominalizer juká-sar-a 'the killer'.
-saβ circumstantial nominalizer juká-saβ-a 'occasion/place/mode/instrument of killing'
-βor habitual agent juká-βor 'one who often kills'
-(a)βo gerund juká-βo 'killing'
-i ~ -w nominalized with fronted focalized adverbials juká-w

Syntax

  • Since this is a head-marking language, core arguments are indexed on the predicate, in the order SOV, as in the examples below:
aɲan
a-ɲan
1.SG-run
'I ran/run'
osepjak 
o-s-epjak
3.SG(S)-3(O)-see
'He/she/it/they see(s) her/him/it/them'
  • Nominal phrases (NPs) semantically related to the core arguments can appear in any order relative to the predicate (where the core arguments are indexed). This is exemplified below with the sentence 'John sees Mary':

John_i Mary_j o_i-s_j-epjak

Mary_j John_i o_i-s_j-epjak

o_i-s_j-epjak John_i Mary_j

o_i-s_j-epjak Mary_j John_i

Mary_j o_i-s_j-epjak John_i

John_i o_i-s_j-epjak Mary_j
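
The six orders listed above are exactly the permutations of three elements: the two NPs and the predicate. The short Python sketch below enumerates them. The function name `word_orders` is hypothetical, and the referential indices i and j are written after an underscore purely as a plain-text convention.

```python
from itertools import permutations


def word_orders(subject_np: str, object_np: str, predicate: str) -> list[str]:
    """Return every linear order of the two NPs and the predicate."""
    return [
        " ".join(order)
        for order in permutations((subject_np, object_np, predicate))
    ]


# Indices _i and _j link each NP to the person index it corresponds to.
orders = word_orders("John_i", "Mary_j", "o_i-s_j-epjak")
for sentence in orders:
    print(sentence)
```

The predicate form stays the same in every order, because the core arguments are indexed on it rather than identified by position.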

  • The dependency linking the core arguments with their semantically related NPs is obl (oblique).

  • The object of transitive verbs is always indexed by Rel=NCont. Tupinambá has transitive verbs only when the object is third person. In the other cases, a predicative possession is used. When the object is third person, the feature Person takes values combining both arguments (S and O): Person=13, Person=23, and Person=33.
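
Because the object is always third person in this construction, the combined Person value follows directly from the person of the subject. Here is a minimal sketch in Python, assuming persons are given as the integers 1, 2 and 3. The helper `combined_person` is hypothetical and is not part of the treebank tooling.

```python
def combined_person(subject_person: int) -> str:
    """Build the Person feature of a transitive verb with a 3rd person object."""
    if subject_person not in (1, 2, 3):
        raise ValueError(f"unknown person: {subject_person}")
    # Subject (S) digit first, then the third-person object (O).
    return f"Person={subject_person}3"


for person in (1, 2, 3):
    print(combined_person(person))
```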

  • Other combinations of person indexes occur in the case of 1 → 2. Two portmanteau indexes are available: oro- 1 → 2.SG, and opo- 1 → 2.PL:

orojuká 
oro-juká
1.2SG(O)-kill
'I/we kill(ed) you'
opojuká 
opo-juká
1.2PL(O)-kill
'I/we kill(ed) you'
  • What has traditionally been called circumstantial mood or indicative II in some Tupí-Guaraní languages refers to a nominalization accompanied by the fronting of an adverbial expression: adverbs, adverbial expressions, postpositional phrases (oblique topicalization). The nominalized form in this case is marked by the feature-value OblTop=Yes.

Arguments (or potentially referring expressions)

Since all lexical roots in Tupinambá are predicates, their use as potentially referring expressions or arguments requires additional morphology. Compare the two examples below:

nerub
ne=r-ub
2.SG=Cont-father
'You have a father'
neruβa
ne=r-ub-a
2.SG=Cont-father-Ref
'Your father'

Treebanks

There is 1 Tupinambá UD treebank:


Instruction: Treebank-specific pages are generated automatically from the README file in the treebank repository and from the data in the latest release. Link to the respective *-index.html page in the treebanks folder, using the language code and the treebank code in the file name.