API reference - Updated trainer docs for AveragedPerceptron #3310
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
### Input and Output Columns | ||
The input label column data must be <xref:System.Boolean>. This trainer outputs the following columns: | ||
|
||
| Output Column Name | Column Type | Description| | ||
| -- | -- | -- | | ||
| `Score` | <xref:System.Single> | The unbounded score that was calculated by the trainer to determine the prediction.| | ||
| `PredictedLabel` | <xref:System.Boolean> | The label predicted by the trainer. `false` maps to negative score and `true` maps to positive score.| | ||
| `Probability` | <xref:System.Single> | The probability of the score in range [0, 1].| |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -23,26 +23,46 @@ | |||||
namespace Microsoft.ML.Trainers | ||||||
{ | ||||||
/// <summary> | ||||||
/// The <see cref="IEstimator{TTransformer}"/> for the averaged perceptron trainer. | ||||||
/// The <see cref="IEstimator{TTransformer}"/> to predict a target using a linear binary classification model trained with the averaged perceptron. | ||||||
/// </summary> | ||||||
/// <remarks> | ||||||
/// <format type="text/markdown"><![CDATA[ | ||||||
estimator? #WontFix |
||||||
/// or [AveragedPerceptron(Options)](xref:Microsoft.ML.StandardTrainersCatalog.AveragedPerceptron(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.AveragedPerceptronTrainer.Options)). | ||||||
/// | ||||||
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)] | ||||||
/// | ||||||
/// ### Trainer Characteristics | ||||||
/// | | | | ||||||
/// | -- | -- | | ||||||
/// | Machine learning task | Binary classification | | ||||||
/// | Is normalization required? | Yes | | ||||||
/// | Is caching required? | No | | ||||||
/// | Required NuGet in addition to Microsoft.ML | None | | ||||||
/// | ||||||
/// ### Training Algorithm Details | ||||||
/// The perceptron is a classification algorithm that makes its predictions by finding a separating hyperplane. | ||||||
/// For instance, with feature values f0, f1,..., f_D-1, the prediction is given by determining what side of the hyperplane the point falls into. | ||||||
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm. | ||||||
/// For instance, with feature values $f_0, f_1,..., f_{D-1}$, the prediction is given by determining which side of the hyperplane the point falls on. | ||||||
/// That is the same as the sign of the features' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm. | ||||||
|
||||||
/// | ||||||
/// The perceptron is an online algorithm, which means it processes the instances in the training set one at a time. | ||||||
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features (sigma[0, D-1] (w_i * f_i)) is computed. | ||||||
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features is computed. | ||||||
/// If this value has the same sign as the label of the current example, the weights remain the same. If they have opposite signs, | ||||||
/// the weights vector is updated by either adding or subtracting (if the label is positive or negative, respectively) the feature vector of the current example, | ||||||
/// multiplied by a factor 0 < a <= 1, called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate, | ||||||
/// multiplied by a factor 0 < a <= 1, called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate, | ||||||
/// and by the gradient of some loss function (in the specific case described above, the loss is hinge-loss, whose gradient is 1 when it is non-zero). | ||||||
/// | ||||||
/// In Averaged Perceptron (aka voted-perceptron), for each iteration, i.e. pass through the training data, a weight vector is calculated as explained above. | ||||||
/// The final prediction is then calculated by averaging the weighted sum from each weight vector and looking at the sign of the result. | ||||||
/// | ||||||
/// For more information see <a href="https://en.wikipedia.org/wiki/Perceptron">Wikipedia entry for Perceptron</a> | ||||||
/// or <a href="https://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.8200">Large Margin Classification Using the Perceptron Algorithm</a> | ||||||
/// For more information see [Wikipedia entry for Perceptron](https://en.wikipedia.org/wiki/Perceptron) | ||||||
/// or [Large Margin Classification Using the Perceptron Algorithm](https://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.8200). | ||||||
/// ]]> | ||||||
/// </format> | ||||||
/// </remarks> | ||||||
/// <seealso cref="StandardTrainersCatalog.AveragedPerceptron(BinaryClassificationCatalog.BinaryClassificationTrainers, string, string, IClassificationLoss, float, bool, float, int)" /> | ||||||
/// <seealso cref="StandardTrainersCatalog.AveragedPerceptron(BinaryClassificationCatalog.BinaryClassificationTrainers, AveragedPerceptronTrainer.Options)"/> | ||||||
/// <seealso cref="Options"/> | ||||||
public sealed class AveragedPerceptronTrainer : AveragedLinearTrainer<BinaryPredictionTransformer<LinearBinaryModelParameters>, LinearBinaryModelParameters> | ||||||
{ | ||||||
internal const string LoadNameValue = "AveragedPerceptron"; | ||||||
|
@@ -53,7 +73,8 @@ public sealed class AveragedPerceptronTrainer : AveragedLinearTrainer<BinaryPred | |||||
private readonly Options _args; | ||||||
|
||||||
/// <summary> | ||||||
/// Options for the <see cref="AveragedPerceptronTrainer"/>. | ||||||
/// Options for the <see cref="AveragedPerceptronTrainer"/> as used in | ||||||
/// [AveragedPerceptron(Options)](xref:Microsoft.ML.StandardTrainersCatalog.AveragedPerceptron(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.AveragedPerceptronTrainer.Options)). | ||||||
/// </summary> | ||||||
public sealed class Options : AveragedLinearOptions | ||||||
{ | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -383,11 +383,11 @@ public static SdcaNonCalibratedMulticlassTrainer SdcaNonCalibrated(this Multicla | |
} | ||
|
||
/// <summary> | ||
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/>. | ||
/// Create an <see cref="AveragedPerceptronTrainer"/>, which predicts a target using a linear binary classification model trained over boolean label data. | ||
shmoradims marked this conversation as resolved.
over a label of boolean data. #Resolved |
||
/// </summary> | ||
/// <param name="catalog">The binary classification catalog trainer object.</param> | ||
/// <param name="labelColumnName">The name of the label column.</param> | ||
/// <param name="featureColumnName">The name of the feature column.</param> | ||
/// <param name="labelColumnName">The name of the label column. The column data must be <see cref="System.Boolean"/>.</param> | ||
/// <param name="featureColumnName">The name of the feature column. The column data must be a known-sized vector of <see cref="System.Single"/>.</param> | ||
/// <param name="lossFunction">The <a href="tmpurl_loss">loss</a> function minimized in the training process. If <see langword="null"/>, <see cref="HingeLoss"/> is used, which leads to a max-margin averaged perceptron trainer.</param> | ||
/// <param name="learningRate">The initial <a href="tmpurl_lr">learning rate</a> used by SGD.</param> | ||
/// <param name="decreaseLearningRate"> | ||
|
@@ -420,7 +420,7 @@ public static AveragedPerceptronTrainer AveragedPerceptron( | |
} | ||
|
||
/// <summary> | ||
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/> and advanced options. | ||
/// Create an <see cref="AveragedPerceptronTrainer"/> with advanced options, which predicts a target using a linear binary classification model trained over boolean label data. | ||
over a label of boolean data. #Resolved |
||
/// </summary> | ||
/// <param name="catalog">The binary classification catalog trainer object.</param> | ||
/// <param name="options">Trainer options.</param> | ||
|
Looks like the information below is not related to `Input`? Maybe you want to add the two types of inputs for binary classifiers: a single float column and multiple float columns (FFM).

The input information is the first line above the table, about the label. Because it doesn't have a fixed name (unlike the output columns), it's just written in text.
FFM is a special binary classifier. For that we can add the multiple float input.
In reply to: 275443192 [](ancestors = 275443192)
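
For reference, the update rule described under "Training Algorithm Details" in the diff above can be written compactly. This is only a sketch, not the trainer's actual implementation: the label encoding $y \in \{-1, +1\}$ (for `false`/`true`), the score symbol $s$, the pass index $t$, and the pass count $T$ are notation assumed here, and treating a zero score as a mistake is also an assumption (the remarks only mention opposite signs).

```latex
% Score and prediction for one example with features f_0, ..., f_{D-1}
s = \sum_{i=0}^{D-1} w_i f_i, \qquad \hat{y} = \operatorname{sign}(s)

% Update applied only when the example is misclassified (assumed: y s <= 0),
% with learning rate 0 < a <= 1
w_i \leftarrow w_i + a \, y \, f_i \quad \text{if } y \, s \le 0

% Averaged prediction over the weight vectors w^{(1)}, ..., w^{(T)},
% one per pass through the training data
\hat{y}_{\mathrm{avg}} = \operatorname{sign}\!\left( \frac{1}{T} \sum_{t=1}^{T} \sum_{i=0}^{D-1} w_i^{(t)} f_i \right)
```

As a one-step illustration under these assumptions: with $a = 1$, initial weights $w = (0, 0)$, features $f = (1, 2)$ and label $y = +1$, the score is $s = 0$, so the update fires and the weights become $w = (1, 2)$.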