
Match hooks

melisa-qordoba edited this page Sep 23, 2020 · 1 revision


Match hooks are powerful and somewhat confusing. replaCy provides a starter kit of hooks, but since they are just Python functions, you can supply your own. For the full list of built-in hooks, see default_match_hooks.py. One example is preceded_by_pos, which is copied here in full. Notice the signature of the function: it does not return a boolean itself, but a predicate that takes (doc, start, end) and returns a boolean. If this interests you, see the next subsection, "Hooks Return Predicates".

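from typing import Callable

from spacy.tokens import Doc

# A predicate receives the Doc and the match's token offsets and returns a bool.
SpacyMatchPredicate = Callable[[Doc, int, int], bool]


def preceded_by_pos(pos) -> SpacyMatchPredicate:
    # Note: doc[start - 1] wraps around to the last token when start == 0.
    if isinstance(pos, list):
        pos_list = pos

        def _preceded_by_pos(doc, start, end):
            # True if the token before the match has any of the listed POS tags
            bools = [doc[start - 1].pos_ == p for p in pos_list]
            return any(bools)

        return _preceded_by_pos
    elif isinstance(pos, str):
        # Single tag: compare the preceding token's POS directly
        return lambda doc, start, end: doc[start - 1].pos_ == pos
    else:
        raise ValueError(
            "args of preceded_by_pos should be a string or list of strings"
        )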
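
Since hooks are just Python functions that return a predicate, writing your own follows the same shape. The sketch below is illustrative only: followed_by_pos_sketch is a hypothetical hook name, not part of replaCy, and the "doc" is a plain list of stand-in tokens rather than a real spaCy Doc, so that the example runs without a language model. It relies on the fact that spaCy match offsets are end-exclusive, so doc[end] is the first token after the match.

```python
from types import SimpleNamespace
from typing import Callable, List, Sequence, Union

# Stand-in for a spaCy Doc (assumption): any sequence of objects with `pos_`.
FakeDoc = Sequence[SimpleNamespace]
Predicate = Callable[[FakeDoc, int, int], bool]


def followed_by_pos_sketch(pos: Union[str, List[str]]) -> Predicate:
    """Hypothetical hook: is the token right after the match one of `pos`?"""
    if isinstance(pos, str):
        allowed = {pos}
    elif isinstance(pos, list):
        allowed = set(pos)
    else:
        raise ValueError("pos should be a string or list of strings")

    def _predicate(doc, start, end):
        # `end` is exclusive, so doc[end] is the first token after the match.
        if end >= len(doc):
            return False  # the match ends the document; nothing follows it
        return doc[end].pos_ in allowed

    return _predicate


def make_doc(tagged):
    """Build a list of stand-in tokens from (text, pos) pairs."""
    return [SimpleNamespace(text=t, pos_=p) for t, p in tagged]


doc = make_doc([("I", "PRON"), ("require", "VERB"), ("stimulants", "NOUN")])
hook = followed_by_pos_sketch(["NOUN", "PROPN"])
result = hook(doc, 1, 2)  # the match is the single token "require"
```

Calling the outer function once with the arguments from the match dict and then calling the returned predicate once per candidate match keeps the per-match work small.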

This lets us put a hook in our match_dict.json that effectively says "only accept this spaCy match if the token preceding it has the POS tag pos", where pos is either a string, like "NOUN", or a list, such as ["NOUN", "PROPN"]. Here is the most complicated replaCy match I have written, which demonstrates the use of several hooks:

{
  "require": {
    "patterns": [
      {
        "LEMMA": "require",
        "POS": "VERB",
        "DEP": {
          "NOT_IN": ["amod"]
        },
        "TEMPLATE_ID": 1
      }
    ],
    "suggestions": [
      [
        {
          "TEXT": "need",
          "FROM_TEMPLATE_ID": 1
        }
      ]
    ],
    "match_hook": [
      {
        "name": "succeeded_by_phrase",
        "args": "that",
        "match_if_predicate_is": false
      },
      {
        "name": "succeeded_by_phrase",
        "args": "of",
        "match_if_predicate_is": false
      },
      {
        "name": "preceded_by_dep",
        "args": "auxpass",
        "match_if_predicate_is": false
      },
      {
        "name": "relative_x_is_y",
        "args": ["children", "dep", "csubj"],
        "match_if_predicate_is": false
      }
    ],
    "test": {
      "positive": [
        "Those require more consideration.",
        "Your condition is serious and requires surgery.",
        "I require stimulants to function."
      ],
      "negative": [
        "My pride requires of me that I tell you to piss off.",
        "Is there any required reading?",
        "I am required to tell you that I am a registered Mex offender - I make horrible nachos.",
        "Deciphering the code requires an expert.",
        "Making small models requires manual skill."
      ]
    },
    "comment": "The pattern includes DEP NOT_IN amod because of expressions like 'required reading' and the relative_x_is_y hook is because this doesn't work for clausal subjects"
  }
}
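
One way to read the match_hook list above is that a match survives only if every hook's predicate evaluates to its match_if_predicate_is value. The following sketch shows that reading; it is an assumption about the semantics, not replaCy's actual implementation. The names succeeded_by_word_sketch, HOOKS, and passes_hooks are hypothetical, and the documents are lists of stand-in tokens rather than spaCy Docs.

```python
from types import SimpleNamespace


def succeeded_by_word_sketch(word):
    """Hypothetical hook: is the token right after the match equal to `word`?"""

    def _predicate(doc, start, end):
        return end < len(doc) and doc[end].text.lower() == word

    return _predicate


# Hypothetical registry mapping hook names (as written in JSON) to functions.
HOOKS = {"succeeded_by_word_sketch": succeeded_by_word_sketch}


def passes_hooks(doc, start, end, hook_specs, registry=HOOKS):
    """Keep a match only if every predicate equals its required value."""
    for spec in hook_specs:
        predicate = registry[spec["name"]](spec["args"])
        if predicate(doc, start, end) != spec["match_if_predicate_is"]:
            return False
    return True


# JSON `false` becomes Python `False` once the match dict is loaded.
specs = [
    {"name": "succeeded_by_word_sketch", "args": "that", "match_if_predicate_is": False},
    {"name": "succeeded_by_word_sketch", "args": "of", "match_if_predicate_is": False},
]


def make_doc(words):
    """Build a list of stand-in tokens from plain words."""
    return [SimpleNamespace(text=w) for w in words]


# "require" at index 1 is followed by "stimulants": both hooks pass.
kept = passes_hooks(make_doc(["I", "require", "stimulants"]), 1, 2, specs)
# "requires" at index 2 is followed by "of": the second hook rejects it.
dropped = passes_hooks(make_doc(["My", "pride", "requires", "of", "me"]), 2, 3, specs)
```

Under this reading, each hook with match_if_predicate_is set to false acts as an exclusion rule, which agrees with the negative test sentences in the example.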
