Commit 219ed7b

yueguoguo authored and srowen committed
[DOCS] Fixed NDCG formula issues
When j is 0, log(j+1) is 0, which leads to a division by zero.

Closes apache#22090 from yueguoguo/patch-1.

Authored-by: Zhang Le <yueguoguo@users.noreply.github.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
1 parent 60af250 commit 219ed7b

1 file changed, +3 -3 lines changed

docs/mllib-evaluation-metrics.md

Lines changed: 3 additions & 3 deletions
@@ -462,13 +462,13 @@ $$rel_D(r) = \begin{cases}1 & \text{if $r \in D$}, \\ 0 & \text{otherwise}.\end{
 <td>Normalized Discounted Cumulative Gain</td>
 <td>
 $NDCG(k)=\frac{1}{M} \sum_{i=0}^{M-1} {\frac{1}{IDCG(D_i, k)}\sum_{j=0}^{n-1}
-\frac{rel_{D_i}(R_i(j))}{\text{ln}(j+1)}} \\
+\frac{rel_{D_i}(R_i(j))}{\text{ln}(j+2)}} \\
 \text{Where} \\
 \hspace{5 mm} n = \text{min}\left(\text{max}\left(|R_i|,|D_i|\right),k\right) \\
-\hspace{5 mm} IDCG(D, k) = \sum_{j=0}^{\text{min}(\left|D\right|, k) - 1} \frac{1}{\text{ln}(j+1)}$
+\hspace{5 mm} IDCG(D, k) = \sum_{j=0}^{\text{min}(\left|D\right|, k) - 1} \frac{1}{\text{ln}(j+2)}$
 </td>
 <td>
-<a href="https://en.wikipedia.org/wiki/Information_retrieval#Discounted_cumulative_gain">NDCG at k</a> is a
+<a href="https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG">NDCG at k</a> is a
 measure of how many of the first k recommended documents are in the set of true relevant documents averaged
 across all users. In contrast to precision at k, this metric takes into account the order of the recommendations
 (documents are assumed to be in order of decreasing relevance).
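
To make the effect of the change concrete, here is a minimal Python sketch of the corrected formula with binary relevance and zero-based positions. It is an illustration of the documented formula only, not the Spark implementation. The names `discount` and `ndcg_at_k` are hypothetical, and two behaviours are assumptions that the formula does not specify: positions beyond the end of a recommendation list count as not relevant, and a user with no relevant documents contributes a score of 0.

```python
import math


def discount(j, shift):
    """Return 1 / ln(j + shift) for the zero-based position j."""
    return 1.0 / math.log(j + shift)


def ndcg_at_k(predicted, relevant, k):
    """Average NDCG at k over users, using the corrected ln(j + 2) discount.

    predicted: one ranked list of document ids per user (R_i), non-empty overall
    relevant:  one collection of truly relevant ids per user (D_i)
    """
    total = 0.0
    for ranked, truth in zip(predicted, relevant):
        truth = set(truth)
        n = min(max(len(ranked), len(truth)), k)
        # Assumption: positions past the end of `ranked` are not relevant.
        dcg = sum(
            discount(j, 2)
            for j in range(min(n, len(ranked)))
            if ranked[j] in truth
        )
        idcg = sum(discount(j, 2) for j in range(min(len(truth), k)))
        # Assumption: a user with no relevant documents scores 0.
        total += dcg / idcg if idcg > 0 else 0.0
    return total / len(predicted)


if __name__ == "__main__":
    # The old discount fails at j = 0 because ln(1) is 0.
    try:
        discount(0, 1)
    except ZeroDivisionError as err:
        print("ln(j+1) at j=0:", err)

    print(ndcg_at_k([["a", "b"]], [["a", "b"]], k=2))
    print(ndcg_at_k([["x", "a"]], [["a"]], k=2))
```

With the old `ln(j+1)` denominator the first position always divides by zero, so no ranking could be scored. With `ln(j+2)` a perfect ranking scores 1, and a relevant document placed second instead of first scores `ln(2)/ln(3)`.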