Added catch in R^2 calculation for case with few samples #5319

Merged
merged 4 commits into from
Jul 26, 2020

Conversation


@mstfbl mstfbl commented Jul 22, 2020

This PR fixes #5306 by adding a catch to the R^2 calculation performed during metric calculation. When fewer than two rows of data are used to calculate R^2, the returned value is -Infinity, whereas it should be NaN.

An example of this behavior in another ML framework is in scikit-learn, in the following lines:

https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/metrics/_regression.py#L587-L590
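
To illustrate the intended behavior, here is a minimal Python sketch of an R^2 calculation with such a catch. It is not the ML.NET code changed in this PR, nor a copy of the scikit-learn lines linked above; the function name `r2_score` and its list-based signature are hypothetical and used for illustration only. The handling of constant labels (zero total sum of squares) is an assumption of this sketch that simply mirrors IEEE 754 division.

```python
import math


def r2_score(labels, predictions):
    """Coefficient of determination (R^2) for two equal-length sequences.

    Hypothetical helper for illustration only; not the ML.NET evaluator.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    n = len(labels)
    # The catch: with fewer than two rows, the total sum of squares is
    # zero, so R^2 is undefined. Return NaN instead of dividing by zero.
    if n < 2:
        return math.nan

    mean_label = sum(labels) / n
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(labels, predictions))
    # Total sum of squares: spread of the labels around their mean.
    ss_tot = sum((y - mean_label) ** 2 for y in labels)

    if ss_tot == 0.0:
        # Assumption of this sketch: constant labels follow IEEE 754
        # semantics for 1 - ss_res / 0 (0/0 -> NaN, x/0 -> -Infinity).
        return math.nan if ss_res == 0.0 else -math.inf

    return 1.0 - ss_res / ss_tot
```

With this sketch, `r2_score([3.0], [2.5])` returns NaN rather than -Infinity, while inputs with two or more rows and non-constant labels are scored as usual.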

@mstfbl mstfbl marked this pull request as ready for review July 22, 2020 18:09
@mstfbl mstfbl requested a review from a team as a code owner July 22, 2020 18:09

codecov bot commented Jul 22, 2020

Codecov Report

Merging #5319 into master will decrease coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5319      +/-   ##
==========================================
- Coverage   73.98%   73.91%   -0.08%     
==========================================
  Files        1019     1019              
  Lines      190083   190101      +18     
  Branches    20437    20438       +1     
==========================================
- Hits       140641   140509     -132     
- Misses      43924    44060     +136     
- Partials     5518     5532      +14     
Flag Coverage Δ
#Debug 73.91% <100.00%> (-0.08%) ⬇️
#production 69.67% <100.00%> (-0.11%) ⬇️
#test 87.67% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...icrosoft.ML.Data/Evaluators/RegressionEvaluator.cs 78.92% <100.00%> (+1.40%) ⬆️
test/Microsoft.ML.Functional.Tests/Validation.cs 100.00% <100.00%> (ø)
...osoft.ML.KMeansClustering/KMeansPlusPlusTrainer.cs 83.60% <0.00%> (-7.27%) ⬇️
src/Microsoft.ML.FastTree/Training/StepSearch.cs 57.42% <0.00%> (-4.96%) ⬇️
src/Microsoft.ML.Data/Training/TrainerUtils.cs 66.86% <0.00%> (-3.82%) ⬇️
...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs 85.23% <0.00%> (-3.25%) ⬇️
...rosoft.ML.AutoML/ColumnInference/TextFileSample.cs 59.60% <0.00%> (-2.65%) ⬇️
...L.AutoML/TrainerExtensions/TrainerExtensionUtil.cs 84.71% <0.00%> (-1.66%) ⬇️
src/Microsoft.ML.Sweeper/AsyncSweeper.cs 71.42% <0.00%> (-1.37%) ⬇️
...crosoft.ML.StandardTrainers/Optimizer/Optimizer.cs 71.96% <0.00%> (-1.16%) ⬇️
... and 6 more

src/Microsoft.ML.Data/Evaluators/RegressionEvaluator.cs (outdated review thread, resolved)
@mstfbl mstfbl marked this pull request as draft July 23, 2020 16:45
@mstfbl mstfbl marked this pull request as ready for review July 25, 2020 01:52
@mstfbl mstfbl requested a review from Lynx1820 July 25, 2020 01:59
@mstfbl mstfbl requested a review from Lynx1820 July 26, 2020 01:25

@justinormont justinormont left a comment

LGTM, left a comment above about the unit test.

@mstfbl mstfbl dismissed Lynx1820’s stale review July 26, 2020 06:06

Added requested unit test.

@mstfbl mstfbl requested a review from harishsk July 26, 2020 07:49
@harishsk harishsk merged commit 2f0af7e into dotnet:master Jul 26, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Development

Successfully merging this pull request may close these issues.

Index out of range exception in execute
4 participants