Update ACL link
insop committed Sep 6, 2020
1 parent 72d0933 commit bc79ad7
Showing 3 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions evaluation_methods.ipynb
@@ -373,7 +373,7 @@
"\n",
"1. As disussed briefly in [the NLI models notebook](nli_02_models.ipynb#Other-findings), [Leonid Keselman](https://leonidk.com/) observed [in his 2016 NLU course project](https://leonidk.com/stanford/cs224u.html) that one can do much better than chance on SNLI by processing only the hypothesis, ignoring the premise entirely. The exact interpretation of this is complex (we explore this a bit [in our NLI unit](nli_02_models.ipynb#Hypothesis-only-baselines) and [in our NLI bake-off](nli_wordentail.ipynb)), but it's certainly relevant for understanding how much a system has actually learned about reasoning from a premise to a conclusion.\n",
" \n",
"1. [Schwartz et al. (2017)](https://aclanthology.coli.uni-saarland.de/papers/W17-0907/w17-0907) develop a system for choosing between a coherent and incoherent ending for a story. Their best system achieves 75% accuracy by processing the story and the ending, but they achieve 72% using only stylistic features of the ending, ignoring the preceding story entirely. This puts the 75% – and the extent to which the system understands story completion – in a new light."
"1. [Schwartz et al. (2017)](https://www.aclweb.org/anthology/W17-0907) develop a system for choosing between a coherent and incoherent ending for a story. Their best system achieves 75% accuracy by processing the story and the ending, but they achieve 72% using only stylistic features of the ending, ignoring the preceding story entirely. This puts the 75% – and the extent to which the system understands story completion – in a new light."
]
},
{
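As an aside on the hypothesis-only point in the hunk above: the baseline is easy to reproduce in outline. Here is a minimal sketch, assuming SNLI examples are already loaded as `(premise, hypothesis, label)` triples; `train_examples` and `dev_examples` are hypothetical placeholders for whatever loader you use:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Premise-blind classifier: bag of uni/bigrams over the hypothesis only.
hypothesis_only = Pipeline([
    ("vec", CountVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Train and evaluate on the hypothesis alone, ignoring the premise entirely.
# `train_examples` and `dev_examples` are assumed to be lists of
# (premise, hypothesis, label) triples:
hypothesis_only.fit(
    [hyp for _, hyp, _ in train_examples],
    [label for _, _, label in train_examples])
print(hypothesis_only.score(
    [hyp for _, hyp, _ in dev_examples],
    [label for _, _, label in dev_examples]))
```

Keselman's observation is that a premise-blind model of this kind lands well above the roughly 33% chance rate for SNLI's three-way label set.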
@@ -914,7 +914,7 @@
"\n",
"Most deep learning models have their parameters initialized randomly, perhaps according to some heuristics related to the number of parameters ([Glorot and Bengio 2010](http://proceedings.mlr.press/v9/glorot10a.html)) or their internal structure ([Saxe et al. 2014](https://arxiv.org/abs/1312.6120)). This is meaningful largely because of the non-convex optimization problems that these models define, but it can impact simpler models that have multiple optimal solutions that still differ at test time. \n",
"\n",
"There is growing awareness that these random choices have serious consequences. For instance, [Reimers and Gurevych (2017)](https://aclanthology.coli.uni-saarland.de/papers/D17-1035/d17-1035) report that different initializations for neural sequence models can lead to statistically significant results, and they show that a number of recent systems are indistinguishable in terms of raw performance once this source of variation is taken into account.\n",
"There is growing awareness that these random choices have serious consequences. For instance, [Reimers and Gurevych (2017)](https://www.aclweb.org/anthology/D17-1035) report that different initializations for neural sequence models can lead to statistically significant results, and they show that a number of recent systems are indistinguishable in terms of raw performance once this source of variation is taken into account.\n",
"\n",
"This shouldn't surprise practitioners, who have long struggled with the question of what to do when a system experiences a catastrophic failure as a result of unlucky initialization. (I think the answer is to report this failure rate.)\n",
"\n",
2 changes: 1 addition & 1 deletion nli_01_task_and_data.ipynb
@@ -159,7 +159,7 @@
"* All the premises are captions from the [Flickr30K corpus](http://shannon.cs.illinois.edu/DenotationGraph/).\n",
"\n",
"\n",
"* Some of the sentences rather depressingly reflect stereotypes ([Rudinger et al. 2017](https://aclanthology.coli.uni-saarland.de/papers/W17-1609/w17-1609)).\n",
"* Some of the sentences rather depressingly reflect stereotypes ([Rudinger et al. 2017](https://www.aclweb.org/anthology/W17-1609)).\n",
"\n",
"\n",
"* 550,152 train examples; 10K dev; 10K test\n",
2 changes: 1 addition & 1 deletion vsm_03_retrofitting.ipynb
@@ -668,7 +668,7 @@
"\n",
"* If you think of the input VSM as a \"warm start\" for graph embedding algorithms, then you're essentially retrofitting. This connection opens up a number of new opportunities to go beyond the similarity-based semantics that underlies Faruqui et al.'s model. See [Lengerich et al. 2017](https://arxiv.org/pdf/1708.00112.pdf), section 3.2, for more on these connections.\n",
"\n",
"* [Mrkšić et al. 2016](https://aclanthology.coli.uni-saarland.de/papers/N16-1018/n16-1018) address the limitation of Faruqui et al's model that it assumes connected nodes in the graph are similar. In a graph with complex, varied edge semantics, this is likely to be false. They address the case of antonymy in particular.\n",
"* [Mrkšić et al. 2016](https://www.aclweb.org/anthology/N16-1018) address the limitation of Faruqui et al's model that it assumes connected nodes in the graph are similar. In a graph with complex, varied edge semantics, this is likely to be false. They address the case of antonymy in particular.\n",
"\n",
"* [Lengerich et al. 2017](https://arxiv.org/pdf/1708.00112.pdf) present a __functional retrofitting__ framework in which the edge meanings are explicitly modeled, and they evaluate instantiations of the framework with linear and neural edge penalty functions. (The Faruqui et al. model emerges as a specific instantiation of this framework.)"
]
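To make the retrofitting discussion concrete, here is a minimal sketch of the similarity-based update at the heart of Faruqui et al.'s model, with uniform alpha and beta weights; the Jacobi-style schedule and the toy lexicon are simplifications rather than the authors' exact implementation:

```python
import numpy as np

def retrofit(Q_hat, edges, alpha=1.0, beta=1.0, n_iters=10):
    """Pull each vector toward its original value (weight alpha) and
    toward its lexicon neighbors (weight beta).

    Q_hat : dict mapping word -> original vector (np.ndarray)
    edges : dict mapping word -> list of neighbor words from the lexicon
    """
    Q = {w: v.copy() for w, v in Q_hat.items()}
    for _ in range(n_iters):
        for w, neighbors in edges.items():
            neighbors = [u for u in neighbors if u in Q]
            if w not in Q or not neighbors:
                continue
            # Closed-form coordinate update for the retrofitting objective:
            num = alpha * Q_hat[w] + beta * sum(Q[u] for u in neighbors)
            Q[w] = num / (alpha + beta * len(neighbors))
    return Q

# Toy example: two lexicon neighbors are pulled toward each other.
Q_hat = {"hot": np.array([1.0, 0.0]), "warm": np.array([0.0, 1.0])}
Q = retrofit(Q_hat, {"hot": ["warm"], "warm": ["hot"]})
```

Note that this update only makes sense when edges encode similarity – precisely the limitation that Mrkšić et al. address for antonymy, and that functional retrofitting generalizes.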
