Closed
Labels
good first issue (Good for newcomers), outreachy (Issues targeted at Outreachy applicants)
Description
Why it matters
The monitoring dashboard needs reliable curve metrics. Kaun.Metrics.auc_roc is currently a placeholder that always raises. Without a real implementation you cannot plot ROC curves or compare models.
How to see the gap
Open kaun/lib/kaun/metrics.ml around the auc_roc definition. The compute function simply calls failwith. The unit tests in kaun/test/test_metrics.ml also skip any AUC checks.
Your task
- Implement trapezoidal integration for the AUC-ROC metric. Accept the existing `num_thresholds` and `curve` arguments so callers can request either the scalar area or the raw curve points.
- Store intermediate true-positive and false-positive counts in the metric state so updates can be streamed batch by batch (a rough sketch of such a state follows this list).
- Add tests in `kaun/test/test_metrics.ml` that feed known score/label pairs and compare the result against pre-computed values (you can compute the numbers in Python once and paste them as constants); see the test sketch under "Done when".
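The exact shape of the streaming state is up to you; as a starting point, here is a minimal, hypothetical sketch of a record holding per-threshold TP/FP counters plus an `update` function that folds in one batch at a time. None of the names (`state`, `init`, `update`) come from the existing `Kaun.Metrics` API, and labels are assumed to be 0./1. floats.

```ocaml
(* Hypothetical sketch only; the real Kaun.Metrics interface may differ. *)
type state = {
  thresholds : float array; (* evenly spaced cut-offs, num_thresholds of them *)
  tp : int array;           (* true positives counted at each threshold *)
  fp : int array;           (* false positives counted at each threshold *)
  mutable pos : int;        (* total positive labels seen so far *)
  mutable neg : int;        (* total negative labels seen so far *)
}

let init num_thresholds =
  let denom = float_of_int (max 1 (num_thresholds - 1)) in
  { thresholds = Array.init num_thresholds (fun i -> float_of_int i /. denom);
    tp = Array.make num_thresholds 0;
    fp = Array.make num_thresholds 0;
    pos = 0;
    neg = 0 }

(* Fold one batch of (score, label) pairs into the running counts,
   so the metric can be updated batch by batch without keeping raw scores. *)
let update st ~scores ~labels =
  Array.iteri
    (fun i score ->
      let is_pos = labels.(i) > 0.5 in
      if is_pos then st.pos <- st.pos + 1 else st.neg <- st.neg + 1;
      Array.iteri
        (fun j t ->
          if score >= t then
            if is_pos then st.tp.(j) <- st.tp.(j) + 1
            else st.fp.(j) <- st.fp.(j) + 1)
        st.thresholds)
    scores
```

Keeping only counts in the state means memory stays constant no matter how many batches are streamed in.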
Tips
- Start by sorting predictions and accumulating TP/FP counts; the integration step is a simple sum of trapezoids.
- Keep numerical stability in mind: add a tiny epsilon where you divide to avoid `NaN` in edge cases.
- If `curve = true`, return both the AUC scalar and the list of `(fpr, tpr)` points so future visualisations can use them (both ideas appear in the sketch after these tips).
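Putting the tips together, the finalisation step might look like the following sketch. It continues the hypothetical `state` record above, guards the rate divisions with an epsilon, sorts the points, and sums trapezoids; `compute` and its signature are assumptions, not the current API.

```ocaml
(* Hypothetical finalisation step for the [state] sketched above. *)
let compute ?(curve = false) st =
  let eps = 1e-7 in
  let n = Array.length st.thresholds in
  (* One (fpr, tpr) point per threshold; eps keeps empty classes from producing NaN. *)
  let points =
    Array.init n (fun j ->
        let fpr = float_of_int st.fp.(j) /. (float_of_int st.neg +. eps) in
        let tpr = float_of_int st.tp.(j) /. (float_of_int st.pos +. eps) in
        (fpr, tpr))
  in
  (* Sort by (fpr, tpr) so the curve runs from (0, 0) toward (1, 1)
     and every trapezoid has a non-negative width. *)
  Array.sort compare points;
  let auc = ref 0. in
  for j = 1 to n - 1 do
    let x0, y0 = points.(j - 1) and x1, y1 = points.(j) in
    auc := !auc +. (x1 -. x0) *. (y0 +. y1) /. 2.
  done;
  (!auc, if curve then Array.to_list points else [])
```

Duplicate points from the threshold grid are harmless: they form zero-width trapezoids and contribute nothing to the area.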
Done when
- `auc_roc` returns a scalar area for the default path and optional curve data when requested.
- The new tests pass and cover both balanced and imbalanced label sets.
- `dune runtest kaun` succeeds after your changes.
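How the assertion is wired up depends on the harness that `kaun/test/test_metrics.ml` already uses, so the following is only a plain-OCaml illustration of the "known pairs vs. pre-computed constant" idea. The 0.75 target is the exact AUC for these four samples (e.g. from `sklearn.metrics.roc_auc_score`), and `init`/`update`/`compute` are the hypothetical helpers sketched above.

```ocaml
(* Hypothetical test; adapt to the harness used in kaun/test/test_metrics.ml. *)
let test_auc_roc_small () =
  let scores = [| 0.1; 0.4; 0.35; 0.8 |]
  and labels = [| 0.; 0.; 1.; 1. |] in
  let st = init 200 in
  update st ~scores ~labels;
  let auc, _ = compute st in
  (* Exact AUC for this set is 0.75; the threshold grid only approximates it. *)
  assert (Float.abs (auc -. 0.75) < 1e-3)
```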