
change logloss to auc
Anson Chu committed Apr 1, 2019
1 parent 002849a commit 311c60b
Showing 2 changed files with 12 additions and 12 deletions.
14 changes: 7 additions & 7 deletions example_model.py
@@ -6,7 +6,7 @@

 import pandas as pd
 import numpy as np
-from sklearn import metrics, preprocessing, linear_model
+from sklearn import metrics, linear_model


def main():
@@ -78,11 +78,11 @@ def main():
     print("- elizabeth using bernie:",
           sum(correct) / float(validation.shape[0]))

-    # Numerai measures models on logloss instead of accuracy. The lower the logloss the better.
-    # Numerai only pays models with logloss < 0.693 on the live portion of the tournament data.
-    # Our validation logloss isn't very good.
-    print("- validation logloss:",
-          metrics.log_loss(validation['target_bernie'], probabilities))
+    # Numerai measures models on AUC. The higher the AUC the better.
+    # Numerai only pays models whose AUC beats the benchmark on the live portion of the tournament data.
+    # Our validation AUC isn't very good.
+    print("- validation AUC:",
+          metrics.roc_auc_score(validation['target_bernie'], probabilities))

     # To submit predictions from your model to Numerai, predict on the entire tournament data.
     x_prediction = tournament[features]
@@ -121,7 +121,7 @@ def main():
 3. Use all the targets
 As we saw above, a model trained on one target like target_bernie might be good at predicting another target
-like target_elizabeth. Blending models built on each target could also improve your logloss and consistency.
+like target_elizabeth. Blending models built on each target could also improve your AUC.
"""

if __name__ == '__main__':
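In sklearn terms, the commit above swaps `metrics.log_loss` for `metrics.roc_auc_score`. A minimal sketch of the two metrics on toy labels and probabilities (the values here are illustrative, not tournament data):

```python
import numpy as np
from sklearn import metrics

# Toy binary labels and predicted probabilities of the positive class.
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])

# Logloss: lower is better; always predicting 0.5 scores ln(2) ~ 0.693.
print("logloss:", metrics.log_loss(y_true, y_prob))

# AUC: higher is better; a random ranking scores 0.5.
print("AUC:", metrics.roc_auc_score(y_true, y_prob))  # → 0.75
```

Note that AUC scores the *ranking* of predictions, so any monotonic rescaling of the probabilities leaves it unchanged, while logloss is sensitive to calibration.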
10 changes: 5 additions & 5 deletions example_model.r
@@ -37,10 +37,10 @@ cor(validation$target_bernie, validation$target_elizabeth)
 #you can see that target_elizabeth is accurate using the bernie model as well
 sum(round(probabilities)==validation$target_elizabeth)/nrow(validation)

-#Numerai measures models on logloss instead of accuracy. The lower the logloss the better.
-#Numerai only pays models with logloss < 0.693 on the live portion of the tournament data.
-#Our validation logloss isn't very good.
-logLoss(validation$target_bernie, probabilities)
+#Numerai measures models on AUC. The higher the AUC the better.
+#Numerai only pays models whose AUC beats the benchmark on the live portion of the tournament data.
+#Our validation AUC isn't very good.
+auc(validation$target_bernie, probabilities)

 #to submit predictions from your model to Numerai, predict on the entire tournament data
 tournament$probability_bernie<-predict.gbm(model, tournament, n.trees=10, type="response")
@@ -72,4 +72,4 @@ write.csv(submission, file="bernie_submission2.csv", row.names=F)

 #3. Use all the targets
 #As we saw above, a model trained on one target like target_bernie might be good at predicting another target
-#like target_elizabeth. Blending models built on each target could also improve your logloss and consistency.
+#like target_elizabeth. Blending models built on each target could also improve your AUC.
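The "use all the targets" tip can be sketched as simple probability averaging. This is a hypothetical stand-in, not the repository's code: the synthetic `X`, `y_bernie`, and `y_elizabeth` play the role of the features and two correlated tournament targets, and the blend is just the mean of the two per-target models' probabilities:

```python
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for the features and two correlated targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y_bernie = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)
y_elizabeth = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Train one model per target.
model_bernie = LogisticRegression().fit(X, y_bernie)
model_elizabeth = LogisticRegression().fit(X, y_elizabeth)

# Blend: average the predicted probabilities from both per-target models.
blend = (model_bernie.predict_proba(X)[:, 1] +
         model_elizabeth.predict_proba(X)[:, 1]) / 2

print("blended AUC on target_bernie:",
      metrics.roc_auc_score(y_bernie, blend))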
