Description
I might be wrong about this, but shouldn't the number of RegressionMetrics
be the same as the number of folds specified in the CrossValidator?
System information
- Windows 10
- .NET Version (e.g., dotnet --info):
.NET Command Line Tools (2.1.202)
Product Information:
Version: 2.1.202
Commit SHA-1 hash: 281caedada
Runtime Environment:
OS Name: Windows
OS Version: 10.0.17134
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\2.1.202\
Microsoft .NET Core Shared Framework Host
Version : 2.0.9
Build : 1632fa1589b0eee3277a8841ce1770e554ece037
Issue
- What did you do?
  Ran cross validation against a pipeline.
- What happened?
  The RegressionMetrics count is two more than the number of folds specified.
- What did you expect?
  The RegressionMetrics count to be the same as the number of folds.
Source code / logs
var pipeline = new LearningPipeline
{
    new TextLoader(dataset).CreateFrom<SalaryData>(useHeader: true, separator: ','),
    new ColumnConcatenator("Features", "YearsExperience"),
    new GeneralizedAdditiveModelRegressor()
};

var crossValidator = new CrossValidator()
{
    Kind = MacroUtilsTrainerKinds.SignatureRegressorTrainer,
    NumFolds = 2
};

var crossValidatorOutput = crossValidator.CrossValidate<SalaryData, SalaryPrediction>(pipeline);
crossValidatorOutput.RegressionMetrics.ForEach(m => Console.WriteLine(m.Rms));
Looking at sklearn, it seems to return the same number of results as the number of folds:
from sklearn.model_selection import cross_val_score
cross_val_score(lin_reg, train_set, train_labels, cv=3)
array([0.96044449, 0.97351702, 0.92777218])
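To make the expectation concrete, here is a minimal, library-free sketch of k-fold cross validation that yields exactly one RMS value per fold. It is only an illustration of the behavior I expected, not how ML.NET or sklearn implement it. The helper names (`k_fold_indices`, `fit_line`, `cross_validate_rms`) and the sample data are made up for this example, and a plain least-squares line stands in for the regressor.

```python
import math
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and deal them into n_folds test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validate_rms(xs, ys, n_folds):
    """Train on all folds but one, evaluate on the held-out fold, repeat.

    Returns a list with one RMS error per fold, so its length is n_folds.
    """
    scores = []
    for test_idx in k_fold_indices(len(xs), n_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        squared_errors = [
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx
        ]
        scores.append(math.sqrt(sum(squared_errors) / len(squared_errors)))
    return scores


# Hypothetical data: salary (in thousands) vs. years of experience.
years = [float(i) for i in range(1, 11)]
salary = [30.0 + 5.0 * x + (1.5 if i % 2 else -1.5) for i, x in enumerate(years)]

rms_per_fold = cross_validate_rms(years, salary, n_folds=2)
print(len(rms_per_fold))  # one entry per fold
```

With `n_folds=2` the returned list has two entries, which is the count I expected from `crossValidatorOutput.RegressionMetrics` above.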