Improve RegressionExperiment using AutoMLExperiment #6338

Merged

LittleLittleCloud (Contributor) commented Sep 26, 2022

We are excited to review your PR.

So we can do the best job, please check:

  • There's a descriptive title that will make sense to other developers some time from now.

  • There are associated issues. All PRs should have issue(s) associated, unless the change is trivial and self-evident, such as fixing a typo. You can use the format Fixes #nnnn in your description to cause GitHub to automatically close the issue(s) when your PR is merged.

  • Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.

  • You have included any necessary tests in the same PR.

  • AutoML.Net improvements tracking down list #6145

  • Update or deprecate AutoML 1.x APIs #6532

@LittleLittleCloud changed the title from "use AutoMLExperiment in RegressionExperiment" to "Improve RegressionExperiment using AutoMLExperiment" Sep 26, 2022
codecov bot commented Sep 26, 2022

Codecov Report

Merging #6338 (20fce65) into main (50e5068) will increase coverage by 0.08%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6338      +/-   ##
==========================================
+ Coverage   68.56%   68.64%   +0.08%     
==========================================
  Files        1171     1171              
  Lines      247736   247964     +228     
  Branches    25733    25738       +5     
==========================================
+ Hits       169858   170214     +356     
+ Misses      71115    71008     -107     
+ Partials     6763     6742      -21     
Flag Coverage Δ
Debug 68.64% <100.00%> (+0.08%) ⬆️
production 63.04% <ø> (+0.06%) ⬆️
test 89.11% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
test/Microsoft.ML.AutoML.Tests/AutoFitTests.cs 82.48% <100.00%> (+2.91%) ⬆️
src/Microsoft.ML.Core/Data/ProgressReporter.cs 70.95% <0.00%> (-6.99%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 25.38% <0.00%> (-1.53%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 89.61% <0.00%> (-0.16%) ⬇️
...Microsoft.Data.Analysis.Tests/DataFrame.IOTests.cs 98.80% <0.00%> (+0.22%) ⬆️
src/Microsoft.ML.Data/Utils/LossFunctions.cs 67.35% <0.00%> (+0.51%) ⬆️
...oft.ML.StandardTrainers/Standard/SdcaMulticlass.cs 92.49% <0.00%> (+1.02%) ⬆️
src/Microsoft.ML.Sweeper/AsyncSweeper.cs 72.78% <0.00%> (+1.36%) ⬆️
src/Microsoft.Data.Analysis/DataFrame.IO.cs 81.15% <0.00%> (+1.92%) ⬆️
src/Microsoft.ML.Data/TrainCatalog.cs 82.70% <0.00%> (+2.53%) ⬆️
... and 5 more

[InlineData("en-US")]
[InlineData("ar-SA")]
[InlineData("pl-PL")]
public void AutoFitRegressionTest(string culture)
Contributor Author

We no longer use the SMAC sweeper in regression, so we can just remove this test.

@@ -159,7 +159,7 @@ public override ExperimentResult<MulticlassClassificationMetrics> Execute(IDataV
 // split cross validation result according to sample key as well.
 if (rowCount < crossValRowCountThreshold)
 {
-const int numCrossValFolds = 10;
+int numCrossValFolds = 10;
Member

Why did you remove the const?

Contributor Author

There's no specific reason for doing that. I just felt there was no need to mark numCrossValFolds as const.

Member

Why do you have a variable in the first place, then? Why not just call the method and pass 10?

Contributor Author

I didn't notice that. Fixed and resolved.

// Else, run experiment using train-validate split.
const int crossValRowCountThreshold = 15000;
var rowCount = DatasetDimensionsUtil.CountRows(trainData, crossValRowCountThreshold);
// TODO
Member

TODO

Will this be done in another PR?

Contributor Author

Yep
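
For readers following the thread, the row-count check in the snippet under review can be sketched roughly as below. This is a minimal, self-contained illustration and not the ML.NET implementation: `CountRowsCapped` and `ChooseFolds` are hypothetical helpers, and the idea that the count stops once it reaches the threshold is an assumption inferred from the threshold being passed as the second argument to `DatasetDimensionsUtil.CountRows`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValidationStrategySketch
{
    // Hypothetical stand-in for a capped row count: it stops enumerating
    // as soon as maxRows rows have been seen (assumed behavior).
    public static int CountRowsCapped<T>(IEnumerable<T> rows, int maxRows)
    {
        int count = 0;
        foreach (var _ in rows)
        {
            if (++count >= maxRows)
            {
                break;
            }
        }
        return count;
    }

    // Returns the number of cross-validation folds for small datasets,
    // or null to signal that a train-validate split should be used instead.
    public static int? ChooseFolds(int rowCount, int threshold)
        => rowCount < threshold ? 10 : (int?)null;

    public static void Main()
    {
        const int crossValRowCountThreshold = 15000; // value taken from the snippet under review

        foreach (var size in new[] { 500, 1_000_000 })
        {
            int rowCount = CountRowsCapped(Enumerable.Range(0, size), crossValRowCountThreshold);
            int? folds = ChooseFolds(rowCount, crossValRowCountThreshold);
            Console.WriteLine(folds.HasValue
                ? $"{size} rows: {folds.Value}-fold cross-validation"
                : $"{size} rows: train-validate split");
        }
    }
}
```

Capping the count means a very large dataset is never fully enumerated just to learn that it is above the threshold.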

tarekgh (Member) left a comment

Modulo my minor questions, LGTM.

@@ -159,8 +159,7 @@ public override ExperimentResult<MulticlassClassificationMetrics> Execute(IDataV
 // split cross validation result according to sample key as well.
 if (rowCount < crossValRowCountThreshold)
 {
-const int numCrossValFolds = 10;
-_experiment.SetDataset(trainData, numCrossValFolds);
+_experiment.SetDataset(trainData, 10);
Member

nit: you may add the parameter name before 10 for code readability :-)
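
The nit above refers to C# named arguments. A minimal sketch of the idea follows, using a hypothetical `FakeExperiment` class in place of the real experiment type. The parameter name `fold` is an assumption made for illustration; the actual parameter name of `SetDataset` should be checked against the ML.NET source.

```csharp
using System;

// Hypothetical stand-in for the experiment type; not the ML.NET API.
public sealed class FakeExperiment
{
    public int Folds { get; private set; }

    // Assumed signature: the second parameter is the number of cross-validation folds.
    public FakeExperiment SetDataset(string datasetName, int fold)
    {
        Folds = fold;
        return this;
    }
}

public static class NamedArgumentSketch
{
    public static void Main()
    {
        var experiment = new FakeExperiment();

        // Positional: the reader has to look up what the literal 10 means.
        experiment.SetDataset("trainData", 10);

        // Named: the call site documents itself, with identical behavior.
        experiment.SetDataset("trainData", fold: 10);

        Console.WriteLine(experiment.Folds); // prints 10
    }
}
```

Naming the argument keeps the literal inline, as the earlier review comment requested, while preserving the readability that the removed local variable used to provide.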

@LittleLittleCloud LittleLittleCloud merged commit 3965078 into dotnet:main Sep 29, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 30, 2022
2 participants