Ready - improve relevance description

pender · pender · commit bd3f6930e8a1 · 2017-01-03T22:10:49.000-08:00
diff --git a/README.md b/README.md
@@ -44,7 +44,7 @@ Try playing around with the arguments to `chatbot.py` to obtain better samples:
 
 - **temperature**: At each step, the model ascribes a certain probability to each character. Temperature can adjust the probability distribution. 1.0 is neutral (and the default), lower values increase high probability values and decrease lower probability values to make the choices more conservative, and higher values will do the reverse. Values outside of the range of 0.5-1.5 are unlikely to give coherent results.
 
-- **relevance**: Relevance is disabled by default. When enabled, two models are run in parallel: the primary model and the mask model. The mask model is scaled by the relevance value, and then the probabilities of the primary model are multiplied by the complement of the mask model before sampling. The state of the mask model is reset upon each newline character. The net effect is that the model is encouraged to choose a line of dialogue that is most relevant to the prior line of dialogue, even if a more generic response (e.g. "I don't know anything about that") may be more absolutely probable. Lower relevance values put more pressure on the model to produce relevant responses, at the cost of the coherence of the responses. Going much below 1.5 compromises the quality of the responses; 2-3 is the recommended range. However, relevance is disabled by default as it halves the speed of sampling and I'm not confident that it improved the outputs.
+- **relevance**: Two models are run in parallel: the primary model and the mask model. The mask model is scaled by the relevance value, and then the probabilities of the primary model are multiplied by the complement of the mask model before sampling. The state of the mask model is reset upon each newline character. The net effect is that the model is encouraged to choose a line of dialogue that is most relevant to the prior line of dialogue, even if a more generic response (e.g. "I don't know anything about that") may be more absolutely probable. Lower relevance values put more pressure on the model to produce relevant responses, at the cost of the coherence of the responses. Going much below 1.5 compromises the quality of the responses; 2-3 is the recommended range. Setting it to a negative value disables relevance, and this is the default, because I'm not confident that it qualitatively improves the outputs and it halves the speed of sampling.
 
 ### Get training data
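
Reading the **temperature** and **relevance** descriptions in the hunk above, the arithmetic they describe might look roughly like the sketch below. This is not the code in `chatbot.py`; it is a pure-Python illustration under stated assumptions. The function names `apply_temperature` and `apply_relevance` are hypothetical. "Scaled by the relevance value" is assumed to mean dividing the mask probabilities by `relevance` (which matches the statement that lower values apply more pressure). The complement is clipped at zero, and a relevance of zero is treated as disabled alongside negative values; both of those are choices made here, not taken from the README.

```python
def apply_temperature(probs, temperature=1.0):
    """Rescale a distribution: 1.0 is neutral, lower values sharpen it,
    higher values flatten it."""
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


def apply_relevance(primary_probs, mask_probs, relevance=-1.0):
    """Multiply the primary probabilities by the complement of the
    relevance-scaled mask probabilities, then renormalize.

    Assumption: "scaled by relevance" means mask / relevance.
    Assumption: relevance <= 0 disables the step (the README only says
    negative values disable it).
    """
    if relevance <= 0:
        return list(primary_probs)
    combined = [
        p * max(0.0, 1.0 - m / relevance)  # clip so no weight goes negative
        for p, m in zip(primary_probs, mask_probs)
    ]
    total = sum(combined)
    if total == 0.0:
        # Every character was fully masked; fall back to the primary model.
        return list(primary_probs)
    return [c / total for c in combined]


# Toy three-character vocabulary. The mask model strongly favors the first
# character, so that choice is treated as "generic" and is suppressed.
primary = [0.6, 0.3, 0.1]
mask = [0.8, 0.1, 0.1]

relevant = apply_relevance(primary, mask, relevance=2.0)
disabled = apply_relevance(primary, mask, relevance=-1.0)
conservative = apply_temperature(primary, temperature=0.5)
```

With these toy numbers, `relevant` gives the first character less weight than `primary` does, `disabled` is identical to `primary`, and `conservative` concentrates more weight on the first character. In this reading, a relevance below 1.0 can zero out characters entirely, which would be consistent with the README's warning about going much below 1.5.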