Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed default value of RowGroupColumnName from null to GroupId #5290

Merged

Conversation

michaelgsharp
Copy link
Member

This is the final pr to fix issue #4749, changes the default value of RowGroupColumnName from null to GroupId.

@michaelgsharp michaelgsharp requested review from justinormont and a team July 7, 2020 16:55
@michaelgsharp michaelgsharp self-assigned this Jul 7, 2020
@michaelgsharp
Copy link
Member Author

@justinormont This is the final pr needed to close the changing default issue. Per our discussion, I have updated the test that was missing the GroupId column to include that column.

@codecov
Copy link

codecov bot commented Jul 7, 2020

Codecov Report

Merging #5290 into master will decrease coverage by 3.98%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5290      +/-   ##
==========================================
- Coverage   73.49%   69.51%   -3.99%     
==========================================
  Files        1014      779     -235     
  Lines      188680   145816   -42864     
  Branches    20330    18576    -1754     
==========================================
- Hits       138677   101369   -37308     
+ Misses      44493    39276    -5217     
+ Partials     5510     5171     -339     
Flag Coverage Δ
#Debug 69.51% <100.00%> (-3.99%) ⬇️
#production 69.51% <100.00%> (+0.21%) ⬆️
#test ?
Impacted Files Coverage Δ
src/Microsoft.ML.FastTree/FastTreeArguments.cs 85.49% <100.00%> (+0.11%) ⬆️
...rc/Microsoft.ML.LightGbm/LightGbmRankingTrainer.cs 88.37% <100.00%> (+9.17%) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 43.45% <100.00%> (+3.45%) ⬆️
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0.00%> (-20.52%) ⬇️
src/Microsoft.ML.FastTree/RegressionTree.cs 75.51% <0.00%> (-8.17%) ⬇️
src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs 78.92% <0.00%> (-6.07%) ⬇️
src/Microsoft.ML.AutoML/API/AutoCatalog.cs 69.35% <0.00%> (-4.84%) ⬇️
src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs 90.62% <0.00%> (-3.00%) ⬇️
src/Microsoft.ML.Maml/MAML.cs 23.78% <0.00%> (-2.43%) ⬇️
...crosoft.ML.TimeSeries/RootCauseLocalizationType.cs 48.31% <0.00%> (-1.12%) ⬇️
... and 283 more

@codecov
Copy link

codecov bot commented Jul 7, 2020

Codecov Report

Merging #5290 into master will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5290      +/-   ##
==========================================
+ Coverage   73.49%   73.75%   +0.25%     
==========================================
  Files        1014     1022       +8     
  Lines      188680   190340    +1660     
  Branches    20330    20471     +141     
==========================================
+ Hits       138677   140384    +1707     
+ Misses      44493    44440      -53     
- Partials     5510     5516       +6     
Flag Coverage Δ
#Debug 73.75% <100.00%> (+0.25%) ⬆️
#production 69.51% <100.00%> (+0.21%) ⬆️
#test 87.62% <100.00%> (+0.18%) ⬆️
Impacted Files Coverage Δ
src/Microsoft.ML.FastTree/FastTreeArguments.cs 85.49% <100.00%> (+0.11%) ⬆️
...rc/Microsoft.ML.LightGbm/LightGbmRankingTrainer.cs 88.37% <100.00%> (+9.17%) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 43.45% <100.00%> (+3.45%) ⬆️
...ts/TrainerEstimators/TreeEnsembleFeaturizerTest.cs 99.66% <100.00%> (ø)
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0.00%> (-20.52%) ⬇️
src/Microsoft.ML.FastTree/RegressionTree.cs 75.51% <0.00%> (-8.17%) ⬇️
src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs 78.92% <0.00%> (-6.07%) ⬇️
src/Microsoft.ML.AutoML/API/AutoCatalog.cs 69.35% <0.00%> (-4.84%) ⬇️
test/Microsoft.ML.CodeGenerator.Tests/UtilTest.cs 73.14% <0.00%> (-4.05%) ⬇️
src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs 90.62% <0.00%> (-3.00%) ⬇️
... and 56 more

@frank-dong-ms-zz
Copy link
Contributor

frank-dong-ms-zz commented Jul 7, 2020

Could you please share what's the impact of this change? Does it lead to faster training or accurate result or anything else? #WontFix

@michaelgsharp
Copy link
Member Author

Hey Frank, issue #4749 talks more about it, but basically we want to have good default values so that the default experience for end users is good. If you do tree training without a group, it has bad results. By making the default use a group, then the default will be better results. In fact, only 1 of our tests didn't have a group, the rest did have a group for that reason.

@codecov
Copy link

codecov bot commented Jul 7, 2020

Codecov Report

Merging #5290 into master will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5290      +/-   ##
==========================================
+ Coverage   73.49%   73.75%   +0.25%     
==========================================
  Files        1014     1022       +8     
  Lines      188680   190340    +1660     
  Branches    20330    20471     +141     
==========================================
+ Hits       138677   140384    +1707     
+ Misses      44493    44440      -53     
- Partials     5510     5516       +6     
Flag Coverage Δ
#Debug 73.75% <100.00%> (+0.25%) ⬆️
#production 69.51% <100.00%> (+0.21%) ⬆️
#test 87.62% <100.00%> (+0.18%) ⬆️
Impacted Files Coverage Δ
src/Microsoft.ML.FastTree/FastTreeArguments.cs 85.49% <100.00%> (+0.11%) ⬆️
...rc/Microsoft.ML.LightGbm/LightGbmRankingTrainer.cs 88.37% <100.00%> (+9.17%) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 43.45% <100.00%> (+3.45%) ⬆️
...ts/TrainerEstimators/TreeEnsembleFeaturizerTest.cs 99.66% <100.00%> (ø)
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0.00%> (-20.52%) ⬇️
src/Microsoft.ML.FastTree/RegressionTree.cs 75.51% <0.00%> (-8.17%) ⬇️
src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs 78.92% <0.00%> (-6.07%) ⬇️
src/Microsoft.ML.AutoML/API/AutoCatalog.cs 69.35% <0.00%> (-4.84%) ⬇️
test/Microsoft.ML.CodeGenerator.Tests/UtilTest.cs 73.14% <0.00%> (-4.05%) ⬇️
src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs 90.62% <0.00%> (-3.00%) ⬇️
... and 56 more

@codecov
Copy link

codecov bot commented Jul 7, 2020

Codecov Report

Merging #5290 into master will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5290      +/-   ##
==========================================
+ Coverage   73.49%   73.75%   +0.25%     
==========================================
  Files        1014     1022       +8     
  Lines      188680   190340    +1660     
  Branches    20330    20471     +141     
==========================================
+ Hits       138677   140388    +1711     
+ Misses      44493    44439      -54     
- Partials     5510     5513       +3     
Flag Coverage Δ
#Debug 73.75% <100.00%> (+0.25%) ⬆️
#production 69.52% <100.00%> (+0.21%) ⬆️
#test 87.62% <100.00%> (+0.18%) ⬆️
Impacted Files Coverage Δ
src/Microsoft.ML.FastTree/FastTreeArguments.cs 85.49% <100.00%> (+0.11%) ⬆️
...rc/Microsoft.ML.LightGbm/LightGbmRankingTrainer.cs 88.37% <100.00%> (+9.17%) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 43.45% <100.00%> (+3.45%) ⬆️
...ts/TrainerEstimators/TreeEnsembleFeaturizerTest.cs 99.66% <100.00%> (ø)
...c/Microsoft.ML.FastTree/Utils/ThreadTaskManager.cs 79.48% <0.00%> (-20.52%) ⬇️
src/Microsoft.ML.FastTree/RegressionTree.cs 75.51% <0.00%> (-8.17%) ⬇️
src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs 78.92% <0.00%> (-6.07%) ⬇️
src/Microsoft.ML.AutoML/API/AutoCatalog.cs 69.35% <0.00%> (-4.84%) ⬇️
test/Microsoft.ML.CodeGenerator.Tests/UtilTest.cs 73.14% <0.00%> (-4.05%) ⬇️
src/Microsoft.ML.TimeSeries/ExtensionsCatalog.cs 90.62% <0.00%> (-3.00%) ⬇️
... and 58 more

@frank-dong-ms-zz
Copy link
Contributor

Thanks, got it


In reply to: 655049589 [](ancestors = 655049589)

Copy link
Contributor

@frank-dong-ms-zz frank-dong-ms-zz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@michaelgsharp michaelgsharp merged commit b628918 into dotnet:master Jul 7, 2020
@@ -7675,7 +7675,7 @@
"Required": false,
"SortOrder": 5.0,
"IsNullable": false,
"Default": null
Copy link
Member

@antoniovs1029 antoniovs1029 Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Just curious, was the manifest automatically updated simply by adding the constructor logic you added?

@michaelgsharp michaelgsharp deleted the row-group-column-name-default branch November 4, 2020 20:35
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants