CrossValidation RegressionMetrics has two more entries than number of folds #570

Closed
@jwood803

I might be wrong about this, but shouldn't the number of RegressionMetrics entries equal the number of folds specified in the CrossValidator?

System information

  • Windows 10
  • .NET Version (e.g., dotnet --info):
.NET Command Line Tools (2.1.202)

Product Information:
 Version:            2.1.202
 Commit SHA-1 hash:  281caedada

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.17134
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\2.1.202\

Microsoft .NET Core Shared Framework Host

  Version  : 2.0.9
  Build    : 1632fa1589b0eee3277a8841ce1770e554ece037

Issue

  • What did you do?
    Ran cross validation against a pipeline.
  • What happened?
    The RegressionMetrics count is two more than the number of folds specified.
  • What did you expect?
    The RegressionMetrics count to be the same as the number of folds.

Source code / logs

// Load the CSV, build the feature vector, and train a GAM regressor.
var pipeline = new LearningPipeline
{
    new TextLoader(dataset).CreateFrom<SalaryData>(useHeader: true, separator: ','),
    new ColumnConcatenator("Features", "YearsExperience"),
    new GeneralizedAdditiveModelRegressor()
};

// Two-fold cross-validation for a regression trainer.
var crossValidator = new CrossValidator()
{
    Kind = MacroUtilsTrainerKinds.SignatureRegressorTrainer,
    NumFolds = 2
};
var crossValidatorOutput = crossValidator.CrossValidate<SalaryData, SalaryPrediction>(pipeline);

// Expected: one RMS value per fold (2 lines). Observed: two more lines than that.
crossValidatorOutput.RegressionMetrics.ForEach(m => Console.WriteLine(m.Rms));

[Screenshots: 2018-07-20 14_44_23-clipboard, 2018-07-20 14_45_12-clipboard]

For comparison, sklearn's cross_val_score returns exactly one score per fold (cv=3 below gives three scores):

from sklearn.model_selection import cross_val_score

cross_val_score(lin_reg, train_set, train_labels, cv=3)
array([0.96044449, 0.97351702, 0.92777218])
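
The expectation behind this report can be sketched in plain Python, with no ML library involved. This is only an illustration under stated assumptions: the helper `k_fold_rms`, the contiguous unshuffled split, the mean-predictor "model", and the salary values are all made up for the example and are not taken from ML.NET or sklearn.

```python
from math import sqrt


def k_fold_rms(labels, num_folds):
    """Return one RMS error per fold for a mean-predictor baseline (hypothetical helper)."""
    n = len(labels)
    scores = []
    for fold in range(num_folds):
        # Contiguous, unshuffled hold-out range for this fold.
        start = fold * n // num_folds
        stop = (fold + 1) * n // num_folds
        train = [y for i, y in enumerate(labels) if not start <= i < stop]
        # "Training" is just taking the mean of the training labels.
        prediction = sum(train) / len(train)
        squared_errors = [(y - prediction) ** 2 for y in labels[start:stop]]
        scores.append(sqrt(sum(squared_errors) / len(squared_errors)))
    return scores


salary = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]  # made-up values
scores = k_fold_rms(salary, num_folds=2)
print(len(scores))  # one entry per fold
```

Each pass through the loop holds out one fold and appends exactly one score, so the result always has `num_folds` entries. Any further entries in a result list would have to come from something other than the per-fold evaluation itself.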

Labels: API (Issues pertaining the friendly API)
