From 398e88cfe344f69574e08a2419e8346511c96561 Mon Sep 17 00:00:00 2001
From: Insop Song
Date: Sun, 25 Jul 2021 12:40:43 -0700
Subject: [PATCH] Revert previous commit, winnow is correct as "narrow down"

---
 vsm_01_distributional.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vsm_01_distributional.ipynb b/vsm_01_distributional.ipynb
index df569e9..aa0812b 100644
--- a/vsm_01_distributional.ipynb
+++ b/vsm_01_distributional.ipynb
@@ -180,7 +180,7 @@
 "1. Scan through your corpus building a dictionary $d$ mapping word-pairs to co-occurrence values. Every time a pair of words $w$ and $w'$ occurs in the same context (as you defined it in 1), increment $d[(w, w')]$ by whatever value is determined by your weighting scheme. You'd increment by $1$ with the weighting scheme that simply counts co-occurrences.\n",
 "\n",
 "1. Using the count dictionary $d$ that you collected in 3, establish your full vocabulary $V$, an ordered list of words types. \n",
-" 1. For large collections of documents, $|V|$ will typically be huge. You will probably want to window the vocabulary at this point. \n",
+" 1. For large collections of documents, $|V|$ will typically be huge. You will probably want to winnow (narrow down) the vocabulary at this point. \n",
 " 1. You might do this by filtering to a specific subset, or just imposing a minimum count threshold. \n",
 " 1. You might impose a minimum count threshold even if $|V|$ is small — for words with very low counts, you simply don't have enough evidence to support good representations.\n",
 " 1. For words outside the vocabulary you choose, you could ignore them entirely or accumulate all their values into a designated _UNK_ vector.\n",
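
For readers of the restored cell: the steps it describes (count co-occurrences into a dictionary $d$, winnow the vocabulary with a minimum count threshold, and accumulate out-of-vocabulary values into a designated _UNK_ vector) can be sketched roughly as below. This is not code from the notebook; the names build_count_matrix, window_size, min_count, and the $UNK token are illustrative assumptions.

from collections import defaultdict

def build_count_matrix(corpus, window_size=2, min_count=2, unk="$UNK"):
    """Count co-occurrences within a fixed window, winnow the vocabulary
    with a minimum count threshold, and fold out-of-vocabulary words into
    a designated UNK row/column. `corpus` is an iterable of token lists."""
    d = defaultdict(float)      # (w, w') -> co-occurrence value
    totals = defaultdict(int)   # w -> raw frequency, used for winnowing
    for tokens in corpus:
        for i, w in enumerate(tokens):
            totals[w] += 1
            for w2 in tokens[i + 1 : i + 1 + window_size]:
                d[(w, w2)] += 1.0   # simple counting scheme: increment by 1
                d[(w2, w)] += 1.0
    # Winnow (narrow down) the vocabulary with a minimum count threshold.
    vocab = sorted(w for w, c in totals.items() if c >= min_count) + [unk]
    idx = {w: i for i, w in enumerate(vocab)}
    def lookup(w):
        return idx.get(w, idx[unk])   # out-of-vocabulary mass goes to UNK
    X = [[0.0] * len(vocab) for _ in vocab]
    for (w, w2), value in d.items():
        X[lookup(w)][lookup(w2)] += value
    return vocab, X

# Toy usage with a pre-tokenized corpus:
toy_corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
vocab, X = build_count_matrix(toy_corpus, window_size=2, min_count=2)
print(vocab)   # ['cat', 'sat', 'the', '$UNK']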