Merge pull request cgpotts#54 from cgpotts/20200322-minor-edits

Minor edits to rel_ext notebooks
amitagondi · Mar 23, 2020 · 3bdb224 · 3bdb224
2 parents 4ef26f7 + 3db0d48
commit 3bdb224
Show file tree

Hide file tree

Showing 2 changed files with 3 additions and 3 deletions.
diff --git a/rel_ext_01_task.ipynb b/rel_ext_01_task.ipynb
@@ -105,7 +105,7 @@
     "\n",
     "*  Make sure your environment includes all the requirements for [the cs224u repository](https://github.com/cgpotts/cs224u).\n",
     "\n",
-    "* If you haven't already, download [the course data](http://web.stanford.edu/class/cs224u/data/data.zip), unpack it, and place it in the directory containing the course repository – the same directory as this notebook. (If you want to put it somewhere else, change `rel_ext_data_home` below.)"
+    "* If you haven't already, download [the course data](http://web.stanford.edu/class/cs224u/data/data.tgz), unpack it, and place it in the directory containing the course repository – the same directory as this notebook. (If you want to put it somewhere else, change `rel_ext_data_home` below.)"
    ]
   },
   {

diff --git a/rel_ext_02_experiments.ipynb b/rel_ext_02_experiments.ipynb
@@ -289,7 +289,7 @@
    "source": [
     "### Experiments\n",
     "\n",
-    "Now we need some functions to train models, make predictions, and evaluate the results. We'll start with `train_models()`. This function takes as arguments a dictionary of data splits, a list of featurizers, the name of the split on which to train, and a model factory, which is a function which initializes an `sklearn` classifier. It returns a dictionary holding the featurizers, the vectorizer that was used to generate the training matrix, and a dictionary holding the trained models, one per relation."
+    "Now we need some functions to train models, make predictions, and evaluate the results. We'll start with `train_models()`. This function takes as arguments a dictionary of data splits, a list of featurizers, the name of the split on which to train (by default, 'train'), and a model factory, which is a function which initializes an `sklearn` classifier (by default, a logistic regression classifier). It returns a dictionary holding the featurizers, the vectorizer that was used to generate the training matrix, and a dictionary holding the trained models, one per relation."
    ]
   },
   {
@@ -650,7 +650,7 @@
     "\n",
     "Another way to gain insight into our trained models is to use them to discover new relation instances that don't currently appear in the KB. In fact, this is the whole point of building a relation extraction system: to extend an existing KB (or build a new one) using knowledge extracted from natural language text at scale. Can the models we've trained do this effectively?\n",
     "\n",
-    "Because the goal is to discover new relation instances which are _true_ but _absent from the KB_, we can't evaluate this capability automatically. But we can generate candidate KB triples and manually evaluate them for correctness.\n",
+    "Because the goal is to discover new relation instances which are *true* but *absent from the KB*, we can't evaluate this capability automatically. But we can generate candidate KB triples and manually evaluate them for correctness.\n",
     "\n",
     "To do this, we'll start from corpus examples containing pairs of entities which do not belong to any relation in the KB (earlier, we described these as \"negative examples\"). We'll then apply our trained models to each pair of entities, and sort the results by probability assigned by the model, in order to find the most likely new instances for each relation."
    ]
-Original file line number
+Diff line change
@@ Expand Up / @@ -105,7 +105,7 @@ @@
         "\n",
         "*  Make sure your environment includes all the requirements for [the cs224u repository](https://github.com/cgpotts/cs224u).\n",
         "\n",
-        "* If you haven't already, download [the course data](http://web.stanford.edu/class/cs224u/data/data.zip), unpack it, and place it in the directory containing the course repository – the same directory as this notebook. (If you want to put it somewhere else, change `rel_ext_data_home` below.)"
+        "* If you haven't already, download [the course data](http://web.stanford.edu/class/cs224u/data/data.tgz), unpack it, and place it in the directory containing the course repository – the same directory as this notebook. (If you want to put it somewhere else, change `rel_ext_data_home` below.)"
        ]
       },
       {
@@ Expand Down @@