Updated xml docs for tree-based trainers. #2970

shmoradims · 2019-03-15T00:05:08Z

Updated XML documentation for tree-based trainers (FastTree, FastForest, GAM, etc). Related to #2522.

Samples to come in a separate PR.

zeahmed · 2019-03-15T00:20:01Z

src/Microsoft.ML.FastTree/doc.xml

@@ -24,7 +24,7 @@
          The output of the ensemble produced by MART on a given instance is the sum of the tree outputs.
        </para>
        <list type='bullet'>
-          <item><description>In case of a binary classification problem, the output is converted to a probability by using some form of calibration.</description></item>


case [](start = 32, length = 4)

Was the word "case" deleted by mistake? #Resolved

good catch

In reply to: 265813438 [](ancestors = 265813438)

zeahmed · 2019-03-15T00:20:53Z

src/Microsoft.ML.FastTree/doc.xml

-        This learner is a generalization of Poisson, compound Poisson, and gamma regression.
-      </summary>
+    <!--  
+    The following text describes the FastForest algorithm details.


FastForest [](start = 37, length = 10)

GAMs??? #Resolved

right, copy-paste error

In reply to: 265813553 [](ancestors = 265813553)

zeahmed · 2019-03-15T00:27:07Z

src/Microsoft.ML.FastTree/FastTreeArguments.cs

@@ -594,7 +643,7 @@ public abstract class BoostedTreeOptions : TreeOptions
        public bool BestStepRankingRegressionTrees = false;

        /// <summary>
-        /// Should we use line search for a step size.
+        /// Determines whether to we use line search for a step size.


to we use [](start = 31, length = 9)

to we use -> to use #Resolved

fixed

In reply to: 265814516 [](ancestors = 265814516)

zeahmed · 2019-03-15T00:29:44Z

src/Microsoft.ML.FastTree/FastTreeArguments.cs

@@ -611,11 +660,17 @@ public abstract class BoostedTreeOptions : TreeOptions
        [Argument(ArgumentType.LastOccurenceWins, HelpText = "Minimum line search step size", ShortName = "minstep")]
        public Double MinimumStepSize;

+        /// <summary>
+        /// The type of optimizer algorithm for setting <see cref="OptimizationAlgorithm"/>.


The type of optimizer algorithm for setting . [](start = 12, length = 80)

I think just "Types of optimization algorithms." would make more sense here. #Resolved

codecov · 2019-03-15T00:38:36Z

Codecov Report

Merging #2970 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2970      +/-   ##
==========================================
+ Coverage   72.29%   72.29%   +<.01%     
==========================================
  Files         796      796              
  Lines      142349   142349              
  Branches    16051    16051              
==========================================
+ Hits       102905   102909       +4     
+ Misses      35063    35061       -2     
+ Partials     4381     4379       -2

Flag	Coverage Δ
#Debug	`72.29% <100%> (ø)`	⬆️
#production	`68.01% <100%> (ø)`	⬆️
#test	`88.48% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
src/Microsoft.ML.FastTree/GamClassification.cs	`89% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeArguments.cs	`85.38% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeRegression.cs	`54.5% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs	`94.18% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeRanking.cs	`48.19% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/FastTreeTweedie.cs	`56.29% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamTrainer.cs	`90.38% <ø> (ø)`	⬆️
...rc/Microsoft.ML.FastTree/FastTreeClassification.cs	`78.19% <ø> (ø)`	⬆️
...rc/Microsoft.ML.FastTree/RandomForestRegression.cs	`59.9% <ø> (ø)`	⬆️
src/Microsoft.ML.FastTree/GamRegression.cs	`89.09% <ø> (ø)`	⬆️
... and 5 more

zeahmed

Overall it LGTM. I have left a few comments. Hope you will address those before merging.

singlis · 2019-03-15T00:48:53Z

/// The base class for all unsupervised learner inputs that support a weight column.

Is it worth changing all "learner" instances to "trainer" in this file?
#Resolved

Refers to: src/Microsoft.ML.Data/Training/TrainerInputBase.cs:85 in 0966dca. [](commit_id = 0966dca, deletion_comment = False)

singlis · 2019-03-15T00:52:14Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/FastTree.cs

+namespace Microsoft.ML.Samples.Dynamic.Trainers.Regression
+{
+    public static class FastTree
+    {


{ [](start = 4, length = 1)

So do we need Binary and Ranking examples for FastTree? #Resolved

All the samples will come in my next PR. I added this one as a template for discussion. Please see my email about in-memory samples.

In reply to: 265818226 [](ancestors = 265818226)

singlis · 2019-03-15T01:01:59Z

    /// <param name="featureToInputMap">A map from the feature shape functions (as described by the binUpperBounds and BinEffects)

Use see crefs for binUpperBounds and binEffects. I know this is internal, but that way the docs would stay bound to the variable names if they ever change. #Resolved

Refers to: src/Microsoft.ML.FastTree/GamClassification.cs:189 in 0966dca. [](commit_id = 0966dca, deletion_comment = False)


shmoradims · 2019-03-15T13:12:04Z

    /// <param name="featureToInputMap">A map from the feature shape functions (as described by the binUpperBounds and BinEffects)

done

In reply to: 473121003 [](ancestors = 473121003)

Refers to: src/Microsoft.ML.FastTree/GamClassification.cs:189 in 0966dca. [](commit_id = 0966dca, deletion_comment = False)

shmoradims · 2019-03-15T13:15:12Z

/// The base class for all unsupervised learner inputs that support a weight column.

done

In reply to: 473118846 [](ancestors = 473118846)

Refers to: src/Microsoft.ML.Data/Training/TrainerInputBase.cs:85 in 0966dca. [](commit_id = 0966dca, deletion_comment = False)

wschin · 2019-03-15T15:50:45Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/FastTree.cs

+            var pipeline = mlContext.BinaryClassification.Trainers.FastTree();
+
+            // Train the model.
+            var model = pipeline.Fit(data);


This is a minimal version without prediction. I personally like to see in-memory prediction, which is what will happen immediately in production.

Why is in-memory prediction important?
(1) Users have no idea about the IDataView produced by the trained model. If we don't tell them how to extract data into C# data structures, they will have to look for tutorials on IDataView, ITransformer, and the IDataView-to-C# bridge.
(2) Prediction formats vary between models and are ML.NET-specific, so it's also hard to figure out which one should be used.
(3) Prediction is how the trained model will be used. One might think that because scikit-learn doesn't do so, we shouldn't either. My suggestion is that we should. Here is my reason: scikit-learn produces NumPy data structures, and everyone knows how to manipulate them (by Googling for NumPy), but IDataView is not at that stage yet.
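
For illustration, here is a minimal sketch of the in-memory prediction flow described above. It is not the sample from this PR. It assumes the ML.NET 1.x API surface (`LoadFromEnumerable`, `CreateEnumerable`) and the Microsoft.ML.FastTree package. The `DataPoint` and `Prediction` classes, the 3-feature synthetic data, and the class name `FastTreeInMemorySketch` are hypothetical.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class FastTreeInMemorySketch
{
    // Hypothetical input type: a label plus a fixed-size feature vector.
    private class DataPoint
    {
        public float Label { get; set; }

        [VectorType(3)]
        public float[] Features { get; set; }
    }

    // Hypothetical output type: regression trainers write their result
    // to a column named "Score"; "Label" is carried over from the input.
    private class Prediction
    {
        public float Label { get; set; }
        public float Score { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic in-memory data (an assumption for this sketch):
        // the label is the sum of the three features.
        var random = new Random(0);
        var points = Enumerable.Range(0, 200).Select(_ =>
        {
            var features = Enumerable.Range(0, 3)
                .Select(__ => (float)random.NextDouble())
                .ToArray();
            return new DataPoint { Label = features.Sum(), Features = features };
        }).ToList();

        // Wrap the C# collection in an IDataView.
        IDataView data = mlContext.Data.LoadFromEnumerable(points);

        // Train with the default "Label" and "Features" column names.
        var pipeline = mlContext.Regression.Trainers.FastTree();
        var model = pipeline.Fit(data);

        // Score the data, then bridge the resulting IDataView back
        // into ordinary C# objects.
        IDataView scored = model.Transform(data);
        var predictions = mlContext.Data
            .CreateEnumerable<Prediction>(scored, reuseRowObject: false)
            .Take(5);

        foreach (var p in predictions)
            Console.WriteLine($"Label: {p.Label:F2}, Score: {p.Score:F2}");
    }
}
```

The step that speaks to point (1) is the `CreateEnumerable<Prediction>` call, which turns the scored IDataView into plain C# objects that can be inspected without further IDataView knowledge.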

wschin · 2019-03-15T15:59:48Z

docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/FastTree.cs

+            }
+        }
+
+        private class DataPoint


This is really good! An example with 50 features!

Updated xml docs for tree-based trainers.

0966dca

shmoradims requested review from sfilipi, rogancarr and zeahmed and removed request for sfilipi March 15, 2019 00:05

zeahmed reviewed Mar 15, 2019


zeahmed approved these changes Mar 15, 2019


singlis reviewed Mar 15, 2019


singlis approved these changes Mar 15, 2019


Addressed PR comments.

2153899

wschin reviewed Mar 15, 2019


shmoradims merged commit b6f94bc into dotnet:master Mar 15, 2019

shauheen added this to the 0319 milestone Mar 15, 2019

shmoradims deleted the tree_docs4 branch March 15, 2019 18:13

shmoradims mentioned this pull request Mar 18, 2019

Docs and samples for the API reference site (P0 & P1 Trainers) #2522

Closed

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
