Predicting probabilities
By default, VW predictions are in the range [-50, +50] (theoretically any real number, but VW truncates them to [-50, +50]).

To convert them to {-1, +1}, use `--binary`: positive predictions are mapped to +1, negative ones to -1.

To convert them to [0, +1], use `--link=logistic`. This applies the logistic function 1/(1 + exp(-x)). If you also use `--loss_function=logistic`, you can interpret the numbers as probabilities of the +1 class. So to write the probability predictions to a file, use `--loss_function=logistic --link=logistic -p probs.txt`.

(For completeness, to convert the predictions to [-1, +1], use `--link=glf1`. This applies the formula 2/(1 + exp(-x)) - 1, i.e. the generalized logistic function with limits of ±1.)
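The three mappings above can be sketched in plain Python (a hand-rolled illustration of the formulas, not VW's internals; the raw score `x` stands in for VW's untransformed prediction):

```python
import math

def binary(x):
    # --binary: map the sign of the raw prediction to {-1, +1}
    return 1 if x > 0 else -1

def logistic(x):
    # --link=logistic: 1/(1 + exp(-x)), squashes into [0, +1]
    return 1.0 / (1.0 + math.exp(-x))

def glf1(x):
    # --link=glf1: 2/(1 + exp(-x)) - 1, squashes into [-1, +1]
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

# A raw score of 0 sits exactly at the decision boundary:
print(logistic(0.0))  # 0.5
print(glf1(0.0))      # 0.0
```

Note that `glf1` is just the logistic link rescaled: glf1(x) = 2 * logistic(x) - 1.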
By default in `oaa`, VW predicts the label of the most probable class. To get probabilities for each of the 1..N classes, use `--oaa N --loss_function=logistic --probabilities -p probs.txt`. You can also get probabilities for 0-indexed classes (0..N-1) by specifying the `--indexing 0` flag. The probabilities are normalized so they sum to 1. `--link logistic` is applied automatically, so you may include or omit that flag.
When using `--probabilities` during training (or testing with gold costs available), a multiclass logistic loss is reported in addition to the standard zero-one loss. The `--probabilities` option is stored in the model, so it need not (and in fact cannot) be repeated when testing.
Example:

Command:

```
vw -d oaa.dat --probabilities --oaa=4 --loss_function=logistic -p oaa_probabilities.predict
```

oaa.dat (input):

```
1 | a b c
3 | b c d
1 | a c e
4 | b d f
2 | d e f
```

oaa_probabilities.predict (output):

```
1:0.25 2:0.25 3:0.25 4:0.25
1:0.37644 2:0.207853 3:0.207853 4:0.207853
1:0.350701 2:0.188793 3:0.271713 4:0.188793
1:0.310118 2:0.180659 3:0.328564 4:0.180659
1:0.291472 2:0.155959 3:0.248878 4:0.303691
```
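The normalization can be sketched in Python (a hand-rolled illustration, not VW's actual code): each class's raw one-against-all score goes through the logistic link, then the results are rescaled to sum to 1. Before any training update every raw score is 0, which is why the first prediction line above is uniform:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def oaa_probabilities(raw_scores):
    # Apply the logistic link per class, then normalize to sum to 1
    p = [logistic(x) for x in raw_scores]
    total = sum(p)
    return [v / total for v in p]

# Untrained model: all four per-class raw scores are 0
print(oaa_probabilities([0.0, 0.0, 0.0, 0.0]))  # [0.25, 0.25, 0.25, 0.25]
```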
For cost-sensitive `oaa` with label-dependent features, use `--csoaa_ldf=mc --loss_function=logistic --probabilities -p probs.txt`. Note that these probabilities are also normalized and will sum to 1 across all classes in an example. `--link logistic` is applied automatically, so you may include or omit that flag.
Example:

Command:

```
vw -d csoaa_ldf.dat --probabilities --csoaa_ldf=mc --loss_function=logistic -p csoaa_ldf_probabilities.predict
```

csoaa_ldf.dat (input):

```
1:1.0 | a_1 b_1 c_1
2:0.0 | a_2 b_2 c_2
3:2.0 | a_3 b_3 c_3

1:1.0 | b_1 c_1 d_1
2:0.0 | b_2 c_2 d_2

1:1.0 | a_1 b_1 c_1
3:2.0 | a_3 b_3 c_3
```

csoaa_ldf_probabilities.predict (output):

```
0.333333
0.333333
0.333333

0.382447
0.617553

0.491507
0.508493
```
`multilabel_oaa` is similar to regular `oaa`, except that you can specify multiple classes for each example. It is very important to note that `multilabel_oaa` is 0-indexed, whereas `oaa` and `csoaa_ldf` are 1-indexed. The probabilities in `multilabel_oaa` are not normalized, so the probabilities across the classes of an example need not sum to 1. `--link logistic` is applied automatically, so you may include or omit that flag.
Example:

Command:

```
vw -d multilabel_oaa.dat --probabilities --multilabel_oaa 4 --loss_function=logistic -p multilabel_oaa_probabilities.predict
```

multilabel_oaa.dat (input):

```
0,2,3 | a b c
0,3 | b c e
1,3 | a c e
2,3 | d b e
```

multilabel_oaa_probabilities.predict (output):

```
0:0.5 1:0.5 2:0.5 3:0.5
0:0.644266 1:0.355734 2:0.644266 3:0.644266
0:0.738168 1:0.261832 2:0.510197 3:0.738168
0:0.614635 1:0.385365 2:0.399657 3:0.740691
```
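The lack of normalization can be sketched the same way (again a hand-rolled illustration, not VW's actual code): each class keeps its independent logistic probability, so the per-example sum is free to differ from 1, as in the untrained first prediction line above:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def multilabel_probabilities(raw_scores):
    # One independent logistic probability per class; no rescaling
    return [logistic(x) for x in raw_scores]

# Untrained model: all four per-class raw scores are 0
probs = multilabel_probabilities([0.0, 0.0, 0.0, 0.0])
print(probs)       # [0.5, 0.5, 0.5, 0.5]
print(sum(probs))  # 2.0 -- the sum is not constrained to 1
```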