Predicting probabilities

Griffin Bassman edited this page Nov 17, 2021 · 9 revisions

Binary classification

By default, VW predictions are in the range [-50, +50] (theoretically any real number, but VW truncates it to [-50, +50]). To convert them to {-1, +1}, use --binary: positive predictions are mapped to +1 and negative predictions to -1. To convert them to [0, 1], use --link=logistic, which applies the logistic function 1/(1 + exp(-x)). If you also train with --loss_function=logistic, you can interpret the resulting numbers as probabilities of the +1 class. So, to write the probability predictions to a file, use --loss_function=logistic --link=logistic -p probs.txt.
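The mappings described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not VW's implementation: the helper names `clip_raw`, `to_binary`, and `logistic` are hypothetical, and how a raw score of exactly 0 is treated by --binary is not stated above, so mapping it to -1 here is an assumption.

```python
import math

def clip_raw(x, limit=50.0):
    """Truncate a raw score to [-limit, +limit], the range described above."""
    return max(-limit, min(limit, x))

def to_binary(x):
    """Map a raw score to {-1, +1}: positive -> +1, negative -> -1.
    Assumption: a score of exactly 0 is mapped to -1 in this sketch."""
    return 1 if x > 0 else -1

def logistic(x):
    """Logistic function 1 / (1 + exp(-x)), mapping a raw score into [0, 1]."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw scores, including one outside the truncation range.
raw_scores = [-3.0, 0.5, 120.0]
for raw in raw_scores:
    x = clip_raw(raw)
    print(raw, x, to_binary(x), round(logistic(x), 6))
```

A raw score of 0 maps to a probability of exactly 0.5, which is the natural decision boundary between the two classes.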

(For completeness, to convert the predictions to [-1, +1], use --link=glf1. This uses the formula 2/(1 + exp(-x)) - 1, i.e. a generalized logistic function with limits of -1 and +1.)
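As a quick check of the glf1 formula, the sketch below evaluates 2/(1 + exp(-x)) - 1 for a few hypothetical raw scores and compares it with tanh(x/2), to which it is algebraically equal. The function name `glf1` is chosen here only to mirror the flag; this is not VW code.

```python
import math

def glf1(x):
    """Generalized logistic function with limits -1 and +1: 2 / (1 + exp(-x)) - 1."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

# Hypothetical raw scores; the two printed columns should agree.
for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(x, round(glf1(x), 6), round(math.tanh(x / 2.0), 6))
```

The function is odd-symmetric around 0, so a raw score of 0 maps to 0 and scores of equal magnitude but opposite sign map to values of opposite sign.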

Multi-class --oaa

By default, --oaa predicts the label of the most probable class. To get probabilities for each of the 1..N classes, use --oaa N --loss_function=logistic --probabilities -p probs.txt. You can also get probabilities for 0-indexed classes (0..N-1) if you specify the --indexing 0 flag. The probabilities are normalized so that they sum to 1. --link logistic is applied automatically, so you may include or omit this flag.
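One way to picture the normalization is sketched below: apply the logistic function to each class's raw score, then divide by the sum. The text above does not spell out the exact formula, so treat this as an assumption rather than a description of VW's internals; `normalized_probabilities` and the raw scores are hypothetical.

```python
import math

def logistic(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def normalized_probabilities(raw_scores):
    """Assumed scheme: per-class logistic outputs rescaled so they sum to 1."""
    per_class = [logistic(s) for s in raw_scores]
    total = sum(per_class)
    return [p / total for p in per_class]

# With all raw scores at 0 (e.g. an untrained model), every class is equally likely.
probs = normalized_probabilities([0.0, 0.0, 0.0, 0.0])
print(probs)  # [0.25, 0.25, 0.25, 0.25]
```

Under this assumption, an untrained four-class model yields 0.25 for every class, which is consistent with the first line of the example output in this section.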

When using --probabilities during training (or testing with gold costs available), a multi-class logistic loss is reported in addition to the standard zero-one loss. The option --probabilities is stored in the model, so it does not need to be (and in fact cannot be) repeated when testing.
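To make the two losses concrete, here is a minimal sketch that assumes the multi-class logistic loss is the average negative log of the probability assigned to the true class, and that the zero-one loss is the fraction of examples whose most probable class differs from the true label. The exact formula VW reports is not given above, and the probabilities and labels below are hypothetical.

```python
import math

def multiclass_log_loss(prob_rows, labels):
    """Average of -log(p[true label]) over all examples (assumed definition)."""
    losses = [-math.log(row[label]) for row, label in zip(prob_rows, labels)]
    return sum(losses) / len(losses)

def zero_one_loss(prob_rows, labels):
    """Fraction of examples where the most probable class is not the true label."""
    errors = [max(row, key=row.get) != label for row, label in zip(prob_rows, labels)]
    return sum(errors) / len(errors)

# Hypothetical per-class probabilities (label -> probability) and true labels.
prob_rows = [{1: 0.7, 2: 0.2, 3: 0.1}, {1: 0.5, 2: 0.25, 3: 0.25}]
labels = [1, 2]

print(multiclass_log_loss(prob_rows, labels))
print(zero_one_loss(prob_rows, labels))
```

Unlike the zero-one loss, the logistic loss also penalizes correct predictions made with low confidence, which is why it is informative when the probabilities themselves matter.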

Example:

Command: vw -d oaa.dat --probabilities --oaa=4 --loss_function=logistic -p oaa_probabilities.predict

oaa.dat (input):

1 | a b c
3 | b c d
1 | a c e
4 | b d f
2 | d e f

oaa_probabilities.predict (output):

1:0.25 2:0.25 3:0.25 4:0.25
1:0.37644 2:0.207853 3:0.207853 4:0.207853
1:0.350701 2:0.188793 3:0.271713 4:0.188793
1:0.310118 2:0.180659 3:0.328564 4:0.180659
1:0.291472 2:0.155959 3:0.248878 4:0.303691
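A prediction file in this format is easy to post-process. The sketch below parses each `label:probability` line into a dictionary and reports the most probable class along with the row sum. The helper name `parse_oaa_line` is hypothetical; the lines are copied from the output above.

```python
def parse_oaa_line(line):
    """Parse 'label:prob label:prob ...' into a {label: probability} dict."""
    probs = {}
    for token in line.split():
        label, value = token.split(":")
        probs[int(label)] = float(value)
    return probs

# Lines copied from oaa_probabilities.predict above.
lines = [
    "1:0.25 2:0.25 3:0.25 4:0.25",
    "1:0.37644 2:0.207853 3:0.207853 4:0.207853",
    "1:0.350701 2:0.188793 3:0.271713 4:0.188793",
    "1:0.310118 2:0.180659 3:0.328564 4:0.180659",
    "1:0.291472 2:0.155959 3:0.248878 4:0.303691",
]

for line in lines:
    probs = parse_oaa_line(line)
    best = max(probs, key=probs.get)  # ties resolve to the first label seen
    print(best, probs[best], round(sum(probs.values()), 5))
```

Each row sums to 1 up to the rounding of the printed values, as expected for normalized probabilities.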

Multi-class --csoaa_ldf

For cost-sensitive oaa with label-dependent features, use --csoaa_ldf=mc --loss_function=logistic --probabilities -p probs.txt. Note that the probabilities here are also normalized and will sum to 1 across all classes in an example. --link logistic will automatically be applied, so feel free to include or leave out this flag.

Example:

Command: vw -d csoaa_ldf.dat --probabilities --csoaa_ldf=mc --loss_function=logistic -p csoaa_ldf_probabilities.predict

csoaa_ldf.dat (input):

1:1.0 | a_1 b_1 c_1
2:0.0 | a_2 b_2 c_2
3:2.0 | a_3 b_3 c_3

1:1.0 | b_1 c_1 d_1
2:0.0 | b_2 c_2 d_2

1:1.0 | a_1 b_1 c_1
3:2.0 | a_3 b_3 c_3

csoaa_ldf_probabilities.predict (output):

0.333333
0.333333
0.333333

0.382447
0.617553

0.491507
0.508493
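The output above has one probability per line, with a blank line between examples. A minimal sketch for reading it back is shown below; the helper `parse_ldf_predictions` is hypothetical, and it assumes that the probabilities appear in the same order as the lines of each input example.

```python
def parse_ldf_predictions(text):
    """Split blank-line-separated blocks into lists of probabilities."""
    examples, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line:
            current.append(float(line))
        elif current:  # a blank line closes the current example
            examples.append(current)
            current = []
    if current:  # the last example may not be followed by a blank line
        examples.append(current)
    return examples

# Contents copied from csoaa_ldf_probabilities.predict above.
text = """0.333333
0.333333
0.333333

0.382447
0.617553

0.491507
0.508493
"""

for probs in parse_ldf_predictions(text):
    print(len(probs), round(sum(probs), 5))
```

Each block has as many entries as the corresponding input example has lines (3, 2, and 2 here), and each block sums to 1 up to rounding.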

Multi-class --multilabel_oaa

multilabel_oaa is similar to regular oaa, but you can specify multiple classes for each example. It is very important to note that multilabel_oaa is 0-indexed, whereas oaa and csoaa_ldf are 1-indexed. The probabilities in multilabel_oaa are not normalized, so the probabilities across the classes of an example need not sum to 1. --link logistic will automatically be applied, so feel free to include or leave out this flag.

Example:

Command: vw -d multilabel_oaa.dat --probabilities --multilabel_oaa 4 --loss_function=logistic -p multilabel_oaa_probabilities.predict

multilabel_oaa.dat (input):

0,2,3 | a b c
0,3 | b c e
1,3 | a c e
2,3 | d b e

multilabel_oaa_probabilities.predict (output):

0:0.5 1:0.5 2:0.5 3:0.5
 
0:0.644266 1:0.355734 2:0.644266 3:0.644266
 
0:0.738168 1:0.261832 2:0.510197 3:0.738168
 
0:0.614635 1:0.385365 2:0.399657 3:0.740691
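Because these probabilities are independent per class, one common way to turn them into a label set is to threshold each class separately. The sketch below does this with a cutoff of 0.5; both the cutoff and the helper name `labels_above` are assumptions for illustration, not VW behavior. The lines are copied from the output above.

```python
def labels_above(line, threshold=0.5):
    """Return (probabilities, labels whose probability exceeds the threshold)."""
    probs = {}
    for token in line.split():
        label, value = token.split(":")
        probs[int(label)] = float(value)
    chosen = sorted(label for label, p in probs.items() if p > threshold)
    return probs, chosen

# Lines copied from multilabel_oaa_probabilities.predict above.
lines = [
    "0:0.5 1:0.5 2:0.5 3:0.5",
    "0:0.644266 1:0.355734 2:0.644266 3:0.644266",
    "0:0.738168 1:0.261832 2:0.510197 3:0.738168",
    "0:0.614635 1:0.385365 2:0.399657 3:0.740691",
]

for line in lines:
    probs, chosen = labels_above(line)
    print(chosen, round(sum(probs.values()), 6))
```

The printed sums differ from 1, which reflects the lack of normalization described in this section.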