API reference - Updated trainer docs for AveragedPerceptron #3310

Merged
merged 6 commits into from
Apr 15, 2019

Conversation

@shmoradims shmoradims commented Apr 12, 2019

This PR applies the template discussed in #3218 to AveragedPerceptron. It serves as a reference PR for updating the rest of the trainers.

The following pages are a best-effort (90%) recreation of what these changes will look like, although some changes will only be visible once this PR is checked in and the preview site is updated.

Extension methods:

AveragedPerceptronTrainer

AveragedPerceptronTrainer.Options

codecov bot commented Apr 12, 2019

Codecov Report

Merging #3310 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3310      +/-   ##
==========================================
- Coverage   72.64%   72.64%   -0.01%     
==========================================
  Files         807      807              
  Lines      145190   145191       +1     
  Branches    16223    16223              
==========================================
  Hits       105480   105480              
  Misses      35293    35293              
- Partials     4417     4418       +1
| Flag | Coverage Δ |
| --- | --- |
| #Debug | 72.64% <ø> (-0.01%) ⬇️ |
| #production | 68.17% <ø> (-0.01%) ⬇️ |
| #test | 88.97% <ø> (ø) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| ...dardTrainers/Standard/Online/AveragedPerceptron.cs | 89.7% <ø> (ø) ⬆️ |
| ...oft.ML.StandardTrainers/StandardTrainersCatalog.cs | 92.34% <ø> (ø) ⬆️ |
| ...StandardTrainers/Standard/LinearModelParameters.cs | 60.05% <0%> (-0.27%) ⬇️ |
| src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs | 94.18% <0%> (ø) ⬆️ |
| .../Microsoft.ML.Data/Transforms/ExtensionsCatalog.cs | 100% <0%> (ø) ⬆️ |
| src/Microsoft.ML.Data/Transforms/ColumnCopying.cs | 85.43% <0%> (ø) ⬆️ |
| test/Microsoft.ML.Functional.Tests/ModelFiles.cs | 96.07% <0%> (ø) ⬆️ |
| src/Microsoft.ML.DataView/IDataView.cs | 100% <0%> (ø) ⬆️ |
| src/Microsoft.ML.Transforms/NormalizerCatalog.cs | 84.78% <0%> (ø) ⬆️ |
| src/Microsoft.ML.Core/Data/Repository.cs | 80.41% <0%> (+0.06%) ⬆️ |

@@ -383,11 +383,11 @@ public static class StandardTrainersCatalog
}

/// <summary>
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/>.
/// Create an <see cref="AveragedPerceptronTrainer"/>, which predicts a target using a linear binary classification model trained over boolean label data.
Member

@sfilipi sfilipi Apr 12, 2019

boolean label data

over a label of boolean data. #Resolved

Author

Natalie voted for no change.

@@ -420,7 +420,7 @@ public static class StandardTrainersCatalog
}

/// <summary>
/// Predict a target using a linear binary classification model trained with <see cref="AveragedPerceptronTrainer"/> and advanced options.
/// Create an <see cref="AveragedPerceptronTrainer"/> with advanced options, which predicts a target using a linear binary classification model trained over boolean label data.
Member

@sfilipi sfilipi Apr 12, 2019

over boolean label data

over a label of boolean data. #Resolved

Author

Natalie voted for no change.

Member

sfilipi commented Apr 12, 2019

/// multiplied by a factor 0 &lt; a &lt;= 1, called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate,

no need to XML encode on md. YAY! :) #Resolved


Refers to: src/Microsoft.ML.StandardTrainers/Standard/Online/AveragedPerceptron.cs:53 in a9788b3.

///
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)]
///
/// ### Trainer FAQ
Member

@sfilipi sfilipi Apr 12, 2019

Trainer

shall we call it estimator? #WontFix

/// </summary>
/// <remarks>
/// <format type="text/markdown"><![CDATA[
/// To create this trainer, use [AveragedPerceptron](xref:Microsoft.ML.StandardTrainersCatalog.AveragedPerceptron(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,System.String,System.String,Microsoft.ML.Trainers.IClassificationLoss,System.Single,System.Boolean,System.Single,System.Int32)
Member

@sfilipi sfilipi Apr 12, 2019

trainer

estimator? #WontFix

Contributor

@natke natke left a comment

LGTM, aside from a couple of minor comments.

///
/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)]
///
/// ### Trainer FAQ
Contributor

@natke natke Apr 12, 2019

Agree that we should call it estimator.

And how about Features (instead of FAQ), so that would be "Estimator Features"? #WontFix

@shmoradims
Author

/// multiplied by a factor 0 &lt; a &lt;= 1, called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate,

changed.


Refers to: src/Microsoft.ML.StandardTrainers/Standard/Online/AveragedPerceptron.cs:53 in a9788b3.
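
For readers unfamiliar with the rule the quoted doc line describes, here is a minimal Python sketch of a mistake-driven averaged perceptron. It is an illustration only, not the ML.NET implementation: the function name `train_averaged_perceptron`, the absence of a bias term, and the simple running average are all assumptions, and the quoted sentence is truncated, so the loss-based generalization it begins to describe is not modelled.

```python
from typing import List, Sequence, Tuple


def train_averaged_perceptron(
    examples: Sequence[Tuple[Sequence[float], bool]],
    learning_rate: float = 1.0,
    iterations: int = 10,
) -> List[float]:
    """Return the average of the weight vectors seen during training.

    Hypothetical helper for illustration; no bias term is learned.
    """
    if not examples:
        raise ValueError("at least one training example is required")
    if not 0.0 < learning_rate <= 1.0:
        raise ValueError("learning_rate must satisfy 0 < a <= 1")

    dim = len(examples[0][0])
    weights = [0.0] * dim
    weight_sum = [0.0] * dim
    steps = 0

    for _ in range(iterations):
        for features, label in examples:
            target = 1.0 if label else -1.0
            score = sum(w * f for w, f in zip(weights, features))
            if target * score <= 0.0:
                # Wrong side of (or exactly on) the hyperplane: add the
                # feature vector scaled by the learning rate and the target sign.
                weights = [
                    w + learning_rate * target * f
                    for w, f in zip(weights, features)
                ]
            # Accumulate after every example so the result is an average.
            weight_sum = [s + w for s, w in zip(weight_sum, weights)]
            steps += 1

    return [s / steps for s in weight_sum]
```

Averaging the weights over all steps, rather than returning the final vector, is what distinguishes the averaged variant from the plain perceptron in this sketch.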

/// | Machine learning task | Binary classification |
/// | Is normalization required? | Yes |
/// | Is caching required? | No |
/// | Additional required NuGet | None |
Member

@sfilipi sfilipi Apr 12, 2019

| Additional required NuGet | None

this line is the one I am debating on the other PR.

If we want to keep it, shall we at least have it in the format:

| NuGet | Microsoft.ML |

Someone landing on this page from a web search would have no idea what "additional" refers to.

Member

@sfilipi sfilipi left a comment

:shipit:

@@ -0,0 +1,8 @@
### Input and Output Columns
Member

Looks like the information below is not related to the input? Maybe you want to add the two types of inputs for binary classifiers: a single float column and multiple float columns (FFM).

Author

the input information is the first line above the table about label. Because it doesn't have a fixed name (unlike output columns) it's just written in text.

FFM is a special binary classifier. For that we can add the multiple float input.



/// | Is caching required? | No |
/// | Additional required NuGet | None |
///
/// ### Training Algorithm Details
/// The perceptron is a classification algorithm that makes its predictions by finding a separating hyperplane.
/// For instance, with feature values f0, f1,..., f_D-1, the prediction is given by determining what side of the hyperplane the point falls into.
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm.
Member

@wschin wschin Apr 15, 2019

Suggested change
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm.
/// That is the same as the sign of sigma[0, D-1] \sum_{i = 1}^n (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm and n is the length of feature vector.
#Resolved

Author

fixed



/// For instance, with feature values f0, f1,..., f_D-1, the prediction is given by determining what side of the hyperplane the point falls into.
/// That is the same as the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1,..., w_D-1 are the weights computed by the algorithm.
/// For instance, with feature values $f0, f1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into.
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm.
Member

Suggested change
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm.
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_i$ is the i-th feature's coefficient computed by the algorithm.
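
As a concrete reading of the formula under review, the prediction is the sign of $\sum_{i = 0}^{D-1} (w_i * f_i)$. The Python sketch below illustrates that; the function name `predict` and the choice to treat a sum of exactly zero as the negative class are assumptions, not something the doc comment specifies.

```python
from typing import Sequence


def predict(weights: Sequence[float], features: Sequence[float]) -> bool:
    """Return True when the point lies on the positive side of the hyperplane."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length D")
    # Weighted sum of the features: sum over i in [0, D-1] of w_i * f_i.
    score = sum(w * f for w, f in zip(weights, features))
    # Assumption: a score of exactly zero is mapped to the negative class.
    return score > 0.0
```

For example, with weights `[0.5, -1.0]` the point `[2.0, 0.5]` has a weighted sum of 0.5 and is classified as positive, while `[0.0, 1.0]` has a sum of -1.0 and is classified as negative.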

@shmoradims shmoradims merged commit 681c60e into dotnet:master Apr 15, 2019
shmoradims pushed a commit to shmoradims/machinelearning that referenced this pull request Apr 16, 2019

* Updated trainer docs for AveragedPerceptron

* Addressed PR comments

* Updated xref

* Added more IO details.

* Updated nuget statement

* Fixed formula with latex syntax
@ghost ghost locked as resolved and limited conversation to collaborators Mar 22, 2022