API reference - Updated trainer docs for AveragedPerceptron #3310
Codecov Report
@@ Coverage Diff @@
## master #3310 +/- ##
==========================================
- Coverage 72.64% 72.64% -0.01%
==========================================
Files 807 807
Lines 145190 145191 +1
Branches 16223 16223
==========================================
Hits 105480 105480
Misses 35293 35293
- Partials 4417 4418 +1
|
@@ -383,11 +383,11 @@ public static class StandardTrainersCatalog | |||
} | |||
|
|||
/// <summary> | |||
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/>. | |||
/// Create an <see cref="AveragedPerceptronTrainer"/>, which predicts a target using a linear binary classification model trained over boolean label data. |
boolean label data
over a label of boolean data. #Resolved
@@ -420,7 +420,7 @@ public static class StandardTrainersCatalog | |||
} | |||
|
|||
/// <summary> | |||
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/> and advanced options. | |||
/// Create an <see cref="AveragedPerceptronTrainer"/> with advanced options, which predicts a target using a linear binary classification model trained over boolean label data. |
over boolean label data
over a label of boolean data. #Resolved
no need to XML encode on md. YAY! :) #Resolved Refers to: src/Microsoft.ML.StandardTrainers/Standard/Online/AveragedPerceptron.cs:53 in a9788b3.
/// | ||
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)] | ||
/// | ||
/// ### Trainer FAQ |
Trainer
shall we call it estimator? #WontFix
/// </summary> | ||
/// <remarks> | ||
/// <format type="text/markdown">< |
trainer
estimator? #WontFix
LGTM, aside from a couple of minor comments.
/// | ||
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)] | ||
/// | ||
/// ### Trainer FAQ |
Agree that we should call it estimator.
And how about Features (instead of FAQ), so that would be "Estimator Features"? #WontFix
changed. In reply to: 482660867 Refers to: src/Microsoft.ML.StandardTrainers/Standard/Online/AveragedPerceptron.cs:53 in a9788b3.
/// | Machine learning task | Binary classification | | ||
/// | Is normalization required? | Yes | | ||
/// | Is caching required? | No | | ||
/// | Additional required NuGet | None | |
| Additional required NuGet | None
This line is the one I am debating on the other PR.
If we want to keep it, shall we at least have it in the format:
| NuGet | Microsoft.ML |
Someone landing on this page from a web search would have no idea what "additional" refers to.
@@ -0,0 +1,8 @@ | |||
### Input and Output Columns |
It looks like the information below is not related to Input. Maybe you want to add the two types of inputs for binary classifiers: a single float column and multiple float columns (FFM).
the input information is the first line above the table about label. Because it doesn't have a fixed name (unlike output columns) it's just written in text.
FFM is a special binary classifier. For that we can add the multiple float input.
In reply to: 275443192
/// | Is caching required? | No | | ||
/// | Additional required NuGet | None | | ||
/// | ||
/// ### Training Algorithm Details | ||
/// The perceptron is a classification algorithm that makes its predictions by finding a separating hyperplane. | ||
/// For instance, with feature values f0, f1,..., f_D-1, the prediction is given by determining what side of the hyperplane the point falls into. | ||
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm. |
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm. | |
/// That is the same as the sign of sigma[0, D-1] \sum_{i = 1}^n (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm and n is the length of feature vector. | |
#Resolved
/// For instance, with feature values f0, f1,..., f_D-1, the prediction is given by determining what side of the hyperplane the point falls into. | ||
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm. | ||
/// For instance, with feature values $f_0, f_1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into. | ||
/// That is the same as the sign of the features' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm. |
/// That is the same as the sign of the features' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm. | |
/// That is the same as the sign of the features' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_i$ is the i-th feature's coefficient computed by the algorithm. |
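For readers of this thread, the prediction rule in the doc comment (the sign of the weighted sum $\sum_{i = 0}^{D-1} (w_i * f_i)$) can be sketched in a few lines of plain Python. This is only an illustration of the formula, not ML.NET's implementation: the function names, the sample weights, and the choice to map a score of exactly zero to `False` are all assumptions made here, and no bias term is included because the doc comment does not mention one.

```python
from typing import Sequence


def weighted_sum(weights: Sequence[float], features: Sequence[float]) -> float:
    """Return sum_{i=0}^{D-1} w_i * f_i for a D-dimensional feature vector."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length D")
    return sum(w * f for w, f in zip(weights, features))


def predict(weights: Sequence[float], features: Sequence[float]) -> bool:
    """Predict the boolean label from the side of the hyperplane the point is on."""
    # Assumption: a score of exactly zero maps to False; the doc comment
    # under review does not say how ties are handled.
    return weighted_sum(weights, features) > 0.0


# Hypothetical weights and feature values, chosen only for illustration.
w = [0.5, -1.0, 2.0]
f = [2.0, 1.0, 0.25]
score = weighted_sum(w, f)  # 1.0 - 1.0 + 0.5
label = predict(w, f)       # positive score, so the predicted label is True
```

The sketch indexes weights and features from 0 to D-1, matching the notation in the proposed doc comment rather than the 1-to-n indexing in the earlier suggestion.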
This PR applies the template discussed in #3218 to AveragedPerceptron. It serves as a reference PR for updating the rest of the trainers.
The following pages are a best-effort (roughly 90%) recreation of what these changes will look like, although some changes will only be visible once this PR is checked in and the preview site is updated.
Extension methods:
AveragedPerceptronTrainer
AveragedPerceptronTrainer.Options