
Proposal: Experiment API #6118


Merged

Conversation

LittleLittleCloud
Contributor

@LittleLittleCloud LittleLittleCloud commented Mar 7, 2022

We are excited to review your PR.

So we can do the best job, please check:

  • There's a descriptive title that will make sense to other developers some time from now.
  • There are associated issues. All PRs should have issue(s) associated, unless it's a trivial, self-evident change such as fixing a typo. You can use the format Fixes #nnnn in your description to have GitHub automatically close the issue(s) when your PR is merged.
  • Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.
  • You have included any necessary tests in the same PR.

This proposal provides an easy way to create and train an AutoML experiment using a sweepable pipeline.

#5993

A quick notebook example for the Experiment API:
AutoMLE2EWithTable.ipynb.txt
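
A minimal sketch of the flow this proposal enables (names mirror the snippet under review below; everything here is illustrative, not a finalized API):

// Build a sweepable pipeline (hyperparameters expressed as search spaces),
// then create an experiment from it with a training budget, data, split,
// metric, tuner, and monitor. pipeline, tuner, and monitor are assumed
// to be constructed elsewhere.
var experiment = pipeline.CreateExperiment(
    trainTime: 100, trainDataset: "train.csv", split: "cv",
    folds: 10, metric: "AUC", tuner: tuner, monitor: monitor);

// Hypothetical: run the sweep and get the best trial back.
var result = experiment.Run();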

@codecov

codecov bot commented Mar 8, 2022

Codecov Report

Merging #6118 (f14a34f) into main (a79c620) will increase coverage by 0.40%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #6118      +/-   ##
==========================================
+ Coverage   68.22%   68.62%   +0.40%     
==========================================
  Files        1090     1142      +52     
  Lines      241442   246283    +4841     
  Branches    25149    25830     +681     
==========================================
+ Hits       164719   169009    +4290     
- Misses      70156    70638     +482     
- Partials     6567     6636      +69     
Flag        Coverage Δ
Debug       68.62% <ø> (+0.40%) ⬆️
production  63.02% <ø> (+0.27%) ⬆️
test        89.19% <ø> (+0.45%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/Microsoft.ML.OnnxTransformer/OnnxCatalog.cs 90.90% <0.00%> (-9.10%) ⬇️
test/Microsoft.ML.TestFramework/GlobalBase.cs 30.00% <0.00%> (-2.36%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 25.38% <0.00%> (-1.53%) ⬇️
...a.Analysis.Interactive/DataFrameKernelExtension.cs 95.10% <0.00%> (-0.87%) ⬇️
...Microsoft.ML.Data/Transforms/MetadataDispatcher.cs 62.32% <0.00%> (-0.84%) ⬇️
src/Microsoft.ML.Core/Data/IHostEnvironment.cs 96.82% <0.00%> (-0.74%) ⬇️
src/Microsoft.ML.Data/MLContext.cs 90.32% <0.00%> (-0.16%) ⬇️
...rc/Microsoft.ML.Data/Scorers/RowToRowScorerBase.cs 85.94% <0.00%> (-0.01%) ⬇️
src/Microsoft.ML.SearchSpace/Option/OptionBase.cs 100.00% <0.00%> (ø)
src/Microsoft.ML.SearchSpace/Tuner/RandomTuner.cs 0.00% <0.00%> (ø)
... and 171 more

Contributor

@luisquintanilla luisquintanilla left a comment


Looks great. Added some comments after an initial pass.

// Experiment API (proposed usage; pipeline, tuner, and monitor are defined elsewhere).
var experiment = pipeline.CreateExperiment(
    trainTime: 100,
    trainDataset: "train.csv",
    split: "cv",
    folds: 10,
    metric: "AUC",
    tuner: tuner,
    monitor: monitor);
Contributor

It'd be great to have an overload that takes in a class ExperimentOptions where you can define these parameters:

var experimentOptions = new ExperimentOptions
{
    TrainTime = 100,
    TrainDataset = "train.csv",
    Split = "cv",
    Folds = 10,
    Metric = "AUC",
    Tuner = tuner,
    Monitor = monitor
};

var experiment = pipeline.CreateExperiment(experimentOptions);

For the split parameter, it'd be good to have an enum so you have the option of using it like this:

Split = Split.CV

Same thing for the metric: it'd be great to leverage the existing metric classes. For example, for binary classification:

Metric = BinaryClassificationMetrics.AreaUnderRocCurve
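
For illustration, a minimal sketch of how the suggested enum could plug into the options overload (SplitType here is a hypothetical name from this discussion, not an existing ML.NET type):

// Hypothetical strongly-typed split strategy, replacing the magic string "cv".
public enum SplitType
{
    CrossValidation,
    TrainTest
}

public class ExperimentOptions
{
    public int TrainTime { get; set; }        // training budget
    public string TrainDataset { get; set; }  // path to the training data
    public SplitType Split { get; set; }      // validation strategy
    public int Folds { get; set; }            // only used with CrossValidation
    public string Metric { get; set; }        // optimization metric
}

// Usage: new ExperimentOptions { Split = SplitType.CrossValidation, Folds = 10, ... }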

Contributor Author

I would hold off on using the existing metric classes. Using a string will actually be easier because:

  • ML.NET has different metric classes for different scenarios, which makes them hard to code against unless we end up with a different experiment-creation API for each scenario as well.
  • Using a string allows us to add metrics that aren't supported by ML.NET (like forecasting, which doesn't have an evaluation metric).

The only downside is that there's little restriction on the input, but that can be addressed by documentation or by having a metric enum instead (sketched below).
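
As a rough illustration of that last point, a single flat enum could span scenarios while still restricting input (ExperimentMetric is a hypothetical name, not part of ML.NET):

// Hypothetical scenario-agnostic metric enum: one type covers all tasks,
// so experiment creation doesn't need a separate API per metric class.
public enum ExperimentMetric
{
    // Binary classification
    AreaUnderRocCurve,
    Accuracy,
    // Regression
    RSquared,
    MeanAbsoluteError,
    // Metrics ML.NET itself doesn't evaluate yet, e.g. for forecasting
    SymmetricMeanAbsolutePercentageError
}

// Usage: Metric = ExperimentMetric.AreaUnderRocCurve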

@michaelgsharp
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@michaelgsharp michaelgsharp merged commit da2df59 into dotnet:main Jul 13, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Aug 13, 2022