Description
Currently, the API reference documentation for each trainer is split across two pages: 1) the creation method, and 2) the trainer estimator class. There is also a third page for the trainer options. In this issue, I want to reach a consensus on the content that goes on each page. The current proposal is as follows:
Page-1 - Creation extension methods
These methods act as the constructor for the trainer estimator class. There are two overloads per trainer, and they are listed as extension methods on an MLContext trainer catalog, e.g. BinaryClassificationCatalog.BinaryClassificationTrainers.
Both overloads also show up on the same page for the extension class; we call this page-1 (e.g. LightGbm; please note that this page includes all LightGbm overloads, including multiclass, ranking, etc., and not just the binary classification versions).
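For reference, the two overloads documented on page-1 look roughly like this from the caller's side. This is a minimal sketch using AveragedPerceptron as the trainer; it assumes the Microsoft.ML NuGet package is referenced, and the column names and hyperparameter values are illustrative only.

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

public static class CreationOverloadsSketch
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Overload 1: the most common settings are passed as direct parameters.
        var simpleTrainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfIterations: 10);

        // Overload 2: all settings are passed through the trainer's Options class
        // (the class documented on page-3).
        var options = new AveragedPerceptronTrainer.Options
        {
            LabelColumnName = "Label",
            FeatureColumnName = "Features",
            LearningRate = 0.1f,        // illustrative value
            NumberOfIterations = 10     // illustrative value
        };
        var advancedTrainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron(options);
    }
}
```

Both calls return the same estimator type (the class documented on page-2), which is why the proposal keeps the algorithm details there rather than repeating them per overload.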
Summary
A one-line summary of what the trainer does, followed by "cref=the estimator class, i.e. page-2".
Training algorithm details are not here; they are included in page-2 so that the other overloads of this API share the same content.
Remarks
[Gleb: add optional description for current overload]
Parameters
Each parameter of the overload is described.
Example
One example is provided per overload.
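As a rough illustration of the page-1 layout, the XML doc comment on a creation method could be shaped as follows. This is only a sketch: ExampleTrainer, ExampleCatalog, and ExampleCatalogExtensions are hypothetical stand-ins (not ML.NET types), and the wording of each element is a placeholder.

```csharp
// Hypothetical stand-ins so that the cref targets below resolve; not ML.NET types.
public sealed class ExampleTrainer { }
public sealed class ExampleCatalog { }

public static class ExampleCatalogExtensions
{
    /// <summary>
    /// Create <see cref="ExampleTrainer"/>, which predicts a target using a linear binary classification model.
    /// </summary>
    /// <remarks>Optional note that applies only to this overload.</remarks>
    /// <param name="catalog">The trainer catalog being extended.</param>
    /// <param name="labelColumnName">The name of the label column.</param>
    /// <param name="featureColumnName">The name of the feature column.</param>
    /// <example>The sample for this overload goes here.</example>
    public static ExampleTrainer Example(
        this ExampleCatalog catalog,
        string labelColumnName = "Label",
        string featureColumnName = "Features")
        => new ExampleTrainer();
}
```

The summary stays at one line plus the cref to the estimator class, and nothing about the training algorithm appears here.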
Page-2 - Trainer Estimator Class
This is the page for the trainer estimator class, e.g. LightGbmBinaryTrainer.
Summary
A one-line summary of what the trainer does, with "cref=IEstimator(TTransformer)". [Gleb: add info on when it is good to use it, i.e. answer the WHY question.] [Gleb: add link to options in summary]
Training algorithm details are not in the summary.
Remarks
Note about creation: "To create this trainer, see [cref to both overload methods from page-1]."
Easy properties:
- Machine learning task: (redundant?)
- Expected label type: bool, etc
- Output columns: "Score", "PredictedLabel", etc with description of what each does
- Is normalization required? Yes/No
- Is caching required? Yes/No
- Is convertible to Onnx format? Yes/No
- Additional NuGet: "Link to NuGet", or None if everything required is already included in Microsoft.ML
Complex properties:
- Trainer Category (requires some taxonomy to be created)
- When to use this trainer? [what goes here?]
- Supported number of features?
- Supported number of examples?
Training algorithm details with all the reference links.
Example
Repeat example from overload-1 of page-1
Repeat example from overload-2 of page-1
See also
"cref to both overload methods from page-1"
[Gleb: link to the catalog with those learners?]
[Gleb: links to other similar learners?]
[Gleb: links to options?]
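One possible shape for the page-2 doc comment, covering the summary, the creation note, the easy properties, and the see-also links, is sketched below. All Example* names are hypothetical stand-ins, the property values are placeholders rather than facts about any real trainer, and the sketch assumes the Microsoft.ML package is referenced so that the IEstimator cref resolves.

```csharp
using Microsoft.ML;

// Hypothetical stand-in for a trainer catalog; not an ML.NET type.
public sealed class ExampleCatalog { }

/// <summary>
/// The <see cref="IEstimator{TTransformer}"/> for training a linear binary classification model.
/// </summary>
/// <remarks>
/// To create this trainer, use
/// <see cref="ExampleCatalogExtensions.Example(ExampleCatalog, string, string)"/> or
/// <see cref="ExampleCatalogExtensions.Example(ExampleCatalog, ExampleTrainer.Options)"/>.
/// <list type="table">
///   <listheader><term>Property</term><description>Value</description></listheader>
///   <item><term>Machine learning task</term><description>Binary classification</description></item>
///   <item><term>Expected label type</term><description>bool</description></item>
///   <item><term>Output columns</term><description>Score, PredictedLabel</description></item>
///   <item><term>Is normalization required?</term><description>Yes</description></item>
///   <item><term>Is caching required?</term><description>No</description></item>
///   <item><term>Is convertible to Onnx format?</term><description>Yes</description></item>
///   <item><term>Additional NuGet</term><description>None</description></item>
/// </list>
/// Training algorithm details and reference links follow here.
/// </remarks>
/// <seealso cref="ExampleCatalogExtensions.Example(ExampleCatalog, string, string)"/>
/// <seealso cref="ExampleCatalogExtensions.Example(ExampleCatalog, ExampleTrainer.Options)"/>
/// <seealso cref="ExampleTrainer.Options"/>
public sealed class ExampleTrainer
{
    /// <summary>Placeholder options class (documented on page-3).</summary>
    public sealed class Options { }
}

// Hypothetical creation methods (documented on page-1).
public static class ExampleCatalogExtensions
{
    public static ExampleTrainer Example(
        this ExampleCatalog catalog,
        string labelColumnName = "Label",
        string featureColumnName = "Features") => new ExampleTrainer();

    public static ExampleTrainer Example(
        this ExampleCatalog catalog,
        ExampleTrainer.Options options) => new ExampleTrainer();
}
```

A table keeps the easy properties uniform across trainers; the complex properties would need the taxonomy discussed above before they can be templated the same way.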
Page-3 - Trainer Options Class
This is the page for the trainer options class that is used in one of the overloads on page-1, e.g. AveragedPerceptronTrainer.Options.
Page-1 already links to this page from the type of the option parameter.
Summary
Options for "cref=page-2" as used in method "cref=page-1/overload-with-option".
Remarks
None
Example
None (page-1 already includes an example for using the options)
Parameters
Each option is described.
See also
[Gleb: the factory methods, the estimator]
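A page-3 doc comment following this outline might look like the sketch below. As before, the Example* types are hypothetical stand-ins, and the single option shown is only there to illustrate a documented field.

```csharp
// Hypothetical stand-in for a trainer catalog; not an ML.NET type.
public sealed class ExampleCatalog { }

public sealed class ExampleTrainer
{
    /// <summary>
    /// Options for the <see cref="ExampleTrainer"/> as used in
    /// <see cref="ExampleCatalogExtensions.Example(ExampleCatalog, ExampleTrainer.Options)"/>.
    /// </summary>
    /// <seealso cref="ExampleTrainer"/>
    /// <seealso cref="ExampleCatalogExtensions.Example(ExampleCatalog, ExampleTrainer.Options)"/>
    public sealed class Options
    {
        /// <summary>The number of passes over the training data.</summary>
        public int NumberOfIterations = 1;
    }
}

// Hypothetical creation method that takes the options (documented on page-1).
public static class ExampleCatalogExtensions
{
    public static ExampleTrainer Example(
        this ExampleCatalog catalog,
        ExampleTrainer.Options options) => new ExampleTrainer();
}
```

No remarks or example element is included, matching the proposal that page-1 already carries the example for the options overload.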