Documentation for type weight
jnothman committed May 30, 2018
1 parent 8bbf471 commit ca68019
Showing 3 changed files with 74 additions and 0 deletions.
60 changes: 60 additions & 0 deletions doc/approximate.rst
@@ -66,3 +66,63 @@ Caveats:
* There is (currently) no equivalent implementation for clustering metrics.

.. _LoReHLT: https://www.nist.gov/sites/default/files/documents/itl/iad/mig/LoReHLT16EvalPlan_v1-01.pdf

.. _approx_type:

Approximate type matching
-------------------------

Rather than requiring entity types to match exactly, type pairs can be scored
using arbitrary weights. Specify these to :ref:`command_evaluate` with
``--type-weights``, which accepts a tab-delimited file with three columns:

* gold type
* system type
* weight

For types not in this weight file, an exact match between gold type and system
type scores 1; any other pairing scores 0.
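
The lookup described above can be sketched in Python. This is an illustrative
sketch only, not neleval's implementation: ``load_type_weights`` and
``type_score`` are hypothetical names, and the sketch assumes that weights are
looked up per (gold type, system type) pair and are directional.

```python
import csv
import io


def load_type_weights(lines):
    """Parse tab-delimited (gold type, system type, weight) rows into a dict."""
    weights = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        gold, system, weight = row
        weights[(gold, system)] = float(weight)
    return weights


def type_score(weights, gold, system):
    """Return the listed weight, falling back to exact matching (assumed)."""
    if (gold, system) in weights:
        return weights[(gold, system)]
    return 1.0 if gold == system else 0.0


# Hypothetical in-memory weight file with a single row
weights = load_type_weights(io.StringIO("type1\ttype2\t0.123\n"))
print(type_score(weights, "type1", "type2"))  # listed pair
print(type_score(weights, "type1", "type1"))  # unlisted exact match
print(type_score(weights, "type2", "type1"))  # unlisted mismatch
```

Under these assumptions the three calls print 0.123, 1.0 and 0.0 respectively.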

The following example scores 0.123 where the gold type is ``type1`` and the
system type is ``type2``.

.. command-output:: bash -c " \
    neleval evaluate --by-doc \
    -m strong_typed_mention_match \
    --type-weights <(echo -e 'type1\ttype2\t0.123') \
    --gold <( \
    echo -e 'doc1\t10\t20\tkbid\t1.0\ttype1'; \
    echo -e 'doc2\t10\t20\tkbid\t1.0\ttype1'; \
    echo -e 'doc3\t10\t20\tkbid\t1.0\ttype2'; \
    echo -e 'doc4\t10\t20\tkbid\t1.0\ttype1'; \
    echo -e 'doc4\t30\t40\tkbid\t1.0\ttype1'; \
    ) <( \
    echo -e 'doc1\t10\t20\tkbid\t1.0\ttype2'; \
    echo -e 'doc2\t10\t20\tkbid\t1.0\ttype1'; \
    echo -e 'doc3\t10\t20\tkbid\t1.0\ttype1'; \
    echo -e 'doc4\t10\t20\tkbid\t1.0\ttype2'; \
    echo -e 'doc4\t30\t40\tkbid\t1.0\ttype2'; \
    ) \
    "
    :nostderr:

Type weighting currently applies only to measures with the ``sets``
:ref:`aggregator <measure_aggregator>`.

Type match weighting with a hierarchy
.....................................

:ref:`command_weights_for_hierarchy` converts a hierarchy of types into the
above ``--type-weights`` format. It uses a scheme with a decay parameter
:math:`0 < d < 1`, such that a system mention is awarded:

* 0 if its type is neither identical to nor an ancestor of the gold type
* :math:`d ^ {\mathrm{depth}(\mathrm{goldtype})-\mathrm{depth}(\mathrm{systype})}` if its type is identical to or an ancestor of the gold type, so that an identical type scores :math:`d ^ 0 = 1`

Thus:

* :math:`d` if its type is a parent of the gold type
* :math:`d ^ 2` if its type is a grandparent of the gold type

etc.
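
The decay scheme can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the code behind
:ref:`command_weights_for_hierarchy`: ``hierarchy_weights`` is a hypothetical
helper, the hierarchy is assumed to be a mapping from each parent type to a
list of its children, and identical types are assumed to receive weight 1.

```python
def hierarchy_weights(children, decay):
    """Return (gold type, system type, weight) rows for a type hierarchy.

    ``children`` maps each parent type to a list of its child types.
    Each gold type is paired with itself and with every ancestor.
    """
    parent = {c: p for p, kids in children.items() for c in kids}
    types = set(children) | set(parent)
    rows = []
    for gold in sorted(types):
        system, steps = gold, 0
        while True:
            # steps == depth(gold type) - depth(system type)
            rows.append((gold, system, decay ** steps))
            if system not in parent:
                break  # reached the top of the hierarchy
            system = parent[system]
            steps += 1
    return rows


# Hypothetical hierarchy, used only for illustration
hierarchy = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"], "B1": ["B1i"]}
weights = {(g, s): w for g, s, w in hierarchy_weights(hierarchy, 0.5)}
print(weights[("B1i", "B1")])         # parent of the gold type: d
print(weights[("B1i", "B")])          # grandparent of the gold type: d ** 2
print(weights.get(("B1i", "A"), 0))   # not an ancestor: no row, so 0
```

Pairs that are absent from the result correspond to the 0 case above, which is
why the output can be treated as sparse.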
13 changes: 13 additions & 0 deletions doc/commands/weights-for-hierarchy.rst
@@ -5,8 +5,21 @@

Translate a hierarchy of types into a sparse matrix of type-pair weights

See :ref:`approx_type`.

Usage summary
.............

.. command-output:: neleval weights-for-hierarchy --help

Converting JSON type hierarchy to weights
.........................................

.. command-output:: bash -c "\
    neleval weights-for-hierarchy --decay 0.5 <( \
    echo '{\"root\": [\"A\", \"B\"], \"A\": [\"A1\", \"A2\"], \"B\": [\"B1\"], \"B1\": [\"B1i\"]}' \
    ) \
    "

These weights can be applied to evaluation with :ref:`command_evaluate`'s
``--type-weights`` option.
1 change: 1 addition & 0 deletions doc/conf.py
@@ -43,6 +43,7 @@
'sphinxcontrib.programoutput',
'sphinx.ext.ifconfig',
'sphinx.ext.viewcode',
'sphinx.ext.mathjax',
'sphinx_issues',
]
