Sentence Tiers

DarthCadeus edited this page Aug 6, 2018 · 8 revisions

Summary

Sentence Tiering is a way to classify sentences based on their complexity.

Description of Each Tier

Tier 1

Tier 1.1

The most basic sentence, following a strict subject+predicate+object or subject+predicate structure. Here the subject, predicate, and object must each be a single word rather than a phrase. The possible forms for each are listed below (from Wikipedia):

Subject forms

  • Noun or pronoun
  • A gerund
  • An expletive

Predicate forms

  • verb

Object forms

same as Subject forms
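
The Tier 1.1 shape can be sketched as a check over a POS-tagged sentence. This is only an illustration: the Penn Treebank-style tags and the name `is_tier_1_1` are assumptions, since this page does not say which tag set or function names the project uses.

```python
# Assumed Penn Treebank-style tags (hypothetical; the wiki names no tag set).
NOMINAL_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP", "VBG", "EX"}  # noun, pronoun, gerund, expletive
VERB_TAGS = {"VB", "VBD", "VBP", "VBZ"}


def is_tier_1_1(tagged):
    """Return True if the sentence is subject+predicate or
    subject+predicate+object, with every slot filled by a single word.

    `tagged` is a list of (word, tag) pairs.
    """
    tags = [tag for _, tag in tagged if tag != "."]  # ignore final punctuation
    if len(tags) not in (2, 3):
        return False
    if tags[0] not in NOMINAL_TAGS or tags[1] not in VERB_TAGS:
        return False
    # The object, when present, takes the same forms as the subject.
    return len(tags) == 2 or tags[2] in NOMINAL_TAGS
```

With this sketch, "Dogs chase cats." and "She sleeps" pass, while "The dog barks" does not, because its subject is a phrase.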

Tier 1.2

Expands on Tier 1.1 sentences by introducing some very basic phrase structure. Only the forms that change are listed below.

Subject forms

  • Noun or pronoun (phrase)
  • A gerund (phrase)

The phrase consists of the following elements, all of which must come before the noun:

Phrase structure

  • determiner
  • attributive adjectives

Tier 1.3

Adds more components to the phrase structure introduced in Tier 1.2. The additions are shown below:

Phrase structure

  • noun adjuncts
  • attributive adjectives (with adverbs)
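
The phrase structure of Tiers 1.2 and 1.3 could be sketched as follows. The tag names and the function `parse_noun_phrase` are hypothetical, and the ordering assumed here (determiner, then adjectives with optional adverbs, then noun adjuncts, then the head) is one reading of the lists above, not the project's confirmed implementation.

```python
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
HEAD_TAGS = NOUN_TAGS | {"PRP", "VBG"}  # noun, pronoun, or gerund head


def parse_noun_phrase(tagged):
    """Split a tagged phrase into determiner, adjectives, adjuncts and head.

    Returns None if the phrase does not fit the assumed Tier 1.3 shape.
    """
    words = list(tagged)
    if not words or words[-1][1] not in HEAD_TAGS:
        return None
    head = words.pop()[0]
    result = {"determiner": None, "adjectives": [], "adjuncts": [], "head": head}
    # Noun adjuncts sit directly before the head noun.
    while words and words[-1][1] in NOUN_TAGS:
        result["adjuncts"].insert(0, words.pop()[0])
    i = 0
    if words and words[0][1] in ("DT", "PRP$"):
        result["determiner"] = words[0][0]
        i = 1
    adverbs = []
    while i < len(words):
        word, tag = words[i]
        if tag == "RB":
            adverbs.append(word)  # adverb modifying the next adjective
        elif tag == "JJ":
            result["adjectives"].append((adverbs, word))
            adverbs = []
        else:
            return None
        i += 1
    # A trailing adverb with no adjective does not fit the shape.
    return result if not adverbs else None
```

For "the very old stone wall", the sketch yields the determiner "the", the adjective "old" modified by "very", the adjunct "stone", and the head "wall".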

Tier 1.4

Allows to-infinitives to become subjects and objects. The change is specified below:

Subject forms

  • To-infinitive

Tier 1.5

Allows to-infinitives to be phrases, with the allowed forms specified below:

to-infinitive phrase structure

  1. the to-infinitive itself
  2. adverbial modifiers

Tier 2

Tier 2 is a refinement of Tier 1 that handles some previously unaddressed problems. It might not even qualify as a tier, as it does not add any new components, but it does further classify the existing components.

Tier 2.1

Handles linking verbs and subject complements. For example, in the sentence "The car is red", "red" is the subject complement. This resolves classification issue 1 (see Known Issues): there, "very delicious" is the subject complement of "the cake". Currently, we have no method for identifying "be"-equivalent verbs.
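
A minimal sketch of subject complement detection, restricted to forms of "be" because, as noted above, other linking verbs are not yet identified. The function name `find_subject_complement` and the tags are assumptions for illustration.

```python
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}


def find_subject_complement(tagged):
    """Return the words after the first form of "be" if they form an
    adjective phrase (optional adverbs followed by an adjective), else None.
    """
    for i, (word, _) in enumerate(tagged):
        if word.lower() in BE_FORMS:
            rest = [(w, t) for w, t in tagged[i + 1:] if t != "."]
            is_adjective_phrase = (
                bool(rest)
                and rest[-1][1] == "JJ"
                and all(t == "RB" for _, t in rest[:-1])
            )
            return [w for w, _ in rest] if is_adjective_phrase else None
    return None
```

For "The cake was very delicious" this returns `["very", "delicious"]`, so the adjective phrase is recorded as a complement rather than as an object.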

Tier 2.2

Confirms the subject using subject-verb agreement. It currently ignores the subjunctive mood, and it does nothing with the result beyond keeping it for reference.
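
One way to sketch the agreement check is to compare the number implied by the subject's tag with the number implied by the verb's tag. The tags and the function `agrees` are assumptions. Like the tier itself, the sketch ignores the subjunctive, and it returns None when the tags carry no number information.

```python
THIRD_SINGULAR_PRONOUNS = {"he", "she", "it"}


def agrees(subject, subject_tag, verb_tag):
    """Return True or False for present-tense agreement, or None when the
    tags carry no number information (for example past tense, VBD)."""
    if verb_tag not in ("VBZ", "VBP"):
        return None
    if subject_tag in ("NN", "NNP", "VBG"):
        singular = True
    elif subject_tag in ("NNS", "NNPS"):
        singular = False
    elif subject_tag == "PRP":
        # "I" and "you" take the non-third-person form (VBP), like plurals.
        singular = subject.lower() in THIRD_SINGULAR_PRONOUNS
    else:
        return None
    # VBZ is the third-person singular present form.
    return singular == (verb_tag == "VBZ")
```

Returning None rather than False keeps "no evidence" separate from "disagreement", which fits a tier that only stores the result for reference.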

Tier 2.2.1

This was not given the designation Tier 2.3 because it only handles possible errors in the POS tagging, which is not entirely accurate. It looks for missing parts of speech and, if any are missing, tries to reassign POS values based on common word tags to get a full result.

Tier 2.3

An extension of Tier 2.2 that checks pronoun subjects against the nominative case.

Tier 3

Tier 3 sentences expand on Tier 2 sentences with better and more specific classification. Up to t.2.3, we have ignored sentence mood and treated all sentences alike. This means that we cannot handle questions, imperatives, or sentences in the passive voice. This tier, t.3, aims to settle this problem.

Tier 3.1

This tier handles the passive voice. According to Wikipedia, the English passive is marked by a form of "be" or "get" followed by the past participle of the verb, and this is how we identify it. As we already have a suitable method for identifying "be" verbs and marking them in obj_cmp, we can reuse it to identify the passive.
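
The rule above can be sketched as a scan for a form of "be" or "get" followed by a past participle. The `VBN` tag, the word list, and the function `is_passive` are assumptions for illustration. Allowing adverbs between the auxiliary and the participle is an extra assumption beyond the rule as stated.

```python
BE_GET_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "get", "gets", "got", "gotten", "getting",
}


def is_passive(tagged):
    """Return True if a form of "be" or "get" is followed by a past
    participle (VBN), optionally with adverbs (RB) in between."""
    for i, (word, _) in enumerate(tagged):
        if word.lower() not in BE_GET_FORMS:
            continue
        j = i + 1
        while j < len(tagged) and tagged[j][1] == "RB":
            j += 1  # skip adverbs such as "quickly"
        if j < len(tagged) and tagged[j][1] == "VBN":
            return True
    return False
```

"The ball was thrown" is flagged as passive, while "She has eaten" is not, because "has" is not a form of "be" or "get".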

Tier 3.2

This tier deviates from the standard goal of Tier 3 and probably belongs in Tier 1, but since Tier 1 is already sealed, it has to go in Tier 3. It identifies the particles that follow verbs and adds them to the characteristics of the predicate.
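
A rough sketch of particle attachment, assuming particles carry the Penn Treebank `RP` tag and that each particle belongs to the most recent verb. Both are assumptions, as is the name `predicates_with_particles`.

```python
def predicates_with_particles(tagged):
    """Collect each verb together with the particles (RP) that follow it."""
    predicates = []
    for word, tag in tagged:
        if tag.startswith("VB"):
            predicates.append({"verb": word, "particles": []})
        elif tag == "RP" and predicates:
            # Attach the particle to the most recent verb, even when an
            # object intervenes, as in "picked it up".
            predicates[-1]["particles"].append(word)
    return predicates
```

For "He picked it up", the predicate "picked" ends up with the particle "up" recorded alongside it.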

Tier 3.3

This tier adjusts subject-object positions. If there is no object to place in the subject position, it puts &any from EC.Pointer there. It also improves detection of subject-verb disagreement and adopts the new OOP approach to sentences, another sentence reform that is fully backwards compatible.

Tier 3.4

This tier handles appositions, in which two nouns combine to describe the same thing. For example, in the phrase "Tiananmen Square", "Tiananmen" and "Square" cannot be interpreted independently and should be understood as one unit. Currently, we would mark "Tiananmen" as a noun adjunct of "Square", which is incorrect.
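
One possible way to treat such names as a single unit is to merge runs of adjacent proper nouns before noun adjuncts are assigned. This is an assumed approach for illustration, not necessarily what the tier does, and `merge_proper_nouns` is a hypothetical name.

```python
def merge_proper_nouns(tagged):
    """Merge runs of adjacent proper nouns (NNP) into one token."""
    merged = []
    for word, tag in tagged:
        if tag == "NNP" and merged and merged[-1][1] == "NNP":
            # Extend the previous proper noun instead of starting a new token.
            merged[-1] = (merged[-1][0] + " " + word, "NNP")
        else:
            merged.append((word, tag))
    return merged
```

After merging, "Tiananmen Square" is a single `NNP` token, so neither word can be marked as a noun adjunct of the other.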

Tier 3.5

This tier identifies and handles imperative sentences. Currently we take a primitive approach in which imperative sentences are identified by their punctuation, namely the exclamation mark. This method has limitations and should be revised as soon as possible. Punctuation is of course not the sole criterion: if the sentence processes normally as a non-imperative, priority is given to the non-imperative form, since imperative sentences lack a crucial component and should not be interpretable by a normal processor. This requires a new function that calls the t.3.4 one, since we need to test the processed results. It takes priority over the process_2_1 function, which we use to handle tagging errors.
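
The decision order described above can be sketched as follows. The normal processor is passed in as a callback, and `toy_parses_as_declarative` is only a toy stand-in so the example runs on its own. Neither name comes from the project, and the real t.3.4 function is not shown on this page.

```python
def is_imperative(tokens, parses_as_declarative):
    """Treat a sentence as imperative only if it ends with "!" and the
    normal (non-imperative) processor cannot handle it."""
    if not tokens or tokens[-1] != "!":
        return False
    # Priority goes to the non-imperative reading when it succeeds.
    return not parses_as_declarative(tokens[:-1])


# Toy stand-in for the normal processor (hypothetical): it accepts any
# sentence that starts with a subject pronoun.
SUBJECT_WORDS = {"i", "you", "we", "they", "he", "she", "it"}


def toy_parses_as_declarative(tokens):
    return bool(tokens) and tokens[0].lower() in SUBJECT_WORDS
```

Under these assumptions "Close the door !" is imperative, while "We won !" is not, because the normal reading succeeds first.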

Tier 3.6

The t.3.5 function does not work on special imperatives beginning with "let". Tier 3.6 aims to solve this problem. It handles "let's" by removing the "let" and inserting "We" in its place.
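
A sketch of the "let's" rewrite on a token list. How the tokenizer splits "let's" is not stated here, so the sketch accepts both a single token and a split "let" + "'s" pair. Dropping the "'s" clitic along with "let" is an assumption, and `rewrite_lets` is a hypothetical name.

```python
def rewrite_lets(tokens):
    """Rewrite a leading "let's" as "We"; return other sentences unchanged."""
    if not tokens:
        return list(tokens)
    first = tokens[0].lower()
    if first in ("let's", "let\u2019s"):
        # "let's" kept as one token.
        return ["We"] + list(tokens[1:])
    if first == "let" and len(tokens) > 1 and tokens[1] in ("'s", "\u2019s"):
        # "let" and "'s" split into two tokens.
        return ["We"] + list(tokens[2:])
    return list(tokens)
```

The rewritten sentence, such as "We go home", has an explicit subject and can then be handed to the normal processor.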

Tier 3.7

The functions in t.3.5 and t.3.6 only remove or add words, which means we can end up with a subject pronoun that is not in the nominative case, and this interferes with our processing function. Tier 3.7 inflects these pronouns to their proper forms using an inflections table added in NLPExtension.
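
The inflection step might look like the sketch below. The table is a hypothetical stand-in for the one in NLPExtension, whose actual contents are not shown on this page, and `fix_subject_case` is an assumed name.

```python
# Hypothetical objective-to-nominative table (not the NLPExtension table).
NOMINATIVE = {"me": "I", "him": "he", "her": "she", "us": "we", "them": "they"}


def fix_subject_case(tokens):
    """Replace a sentence-initial objective pronoun with its nominative form."""
    if tokens and tokens[0].lower() in NOMINATIVE:
        fixed = NOMINATIVE[tokens[0].lower()]
        # Capitalize because the pronoun now starts the sentence.
        return [fixed.capitalize()] + list(tokens[1:])
    return list(tokens)
```

For example, if an earlier step leaves "him go" behind, this step turns it into "He go" before further processing.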

Known Issues

Issues with identification

  1. In the phrase "violently shaking dog", "shaking" is identified as a noun adjunct.

Issues with classification

  1. In the sentence "The cake was very delicious", is there an object? If so, can adjectives be objects as well? If not, what is "very delicious"? RESOLVED (see Tier 2.1)