
MulticlassClassification.CrossValidate throws "Arithmetic operation resulted in an overflow" #5211

@DFMERA

System information

  • OS version/distro:
    Windows 10 Pro
  • .NET Version (e.g., dotnet --info):
    .NET Core 2.1.0

Issue

  • What did you do?
    I ran an AutoML multiclass classification experiment. After the best model was selected, I tried to evaluate it on test data, but the following call throws an exception:
    var testMetrics = mlContext.MulticlassClassification.CrossValidate(testDataViewWithBestScore, bestRun.Estimator, numberOfFolds: 5, labelColumnName: "reservation_status");

  • What happened?
    mlContext.MulticlassClassification.CrossValidate throws a System.OverflowException: "Arithmetic operation resulted in an overflow."

  • What did you expect?
    To get the model's metrics on the test data.

Source code / logs

CODE

// Load the training data, then draw a bootstrap sample of it to use as test data
var tmpPath = GetAbsolutePath(TRAIN_DATA_FILEPATH);
IDataView trainingDataView = mlContext.Data.LoadFromTextFile(
    path: tmpPath,
    hasHeader: true,
    separatorChar: '\t',
    allowQuoting: true,
    allowSparse: false);

IDataView testDataView = mlContext.Data.BootstrapSample(trainingDataView);

// STEP 2: Run AutoML experiment
Console.WriteLine($"Running AutoML Multiclass classification experiment for {ExperimentTime} seconds...");
ExperimentResult<MulticlassClassificationMetrics> experimentResult = mlContext.Auto()
    .CreateMulticlassClassificationExperiment(ExperimentTime)
    .Execute(trainingDataView, labelColumnName: "reservation_status");

// STEP 3: Print metric from the best model
RunDetail<MulticlassClassificationMetrics> bestRun = experimentResult.BestRun;
Console.WriteLine($"Total models produced: {experimentResult.RunDetails.Count()}");
Console.WriteLine($"Best model's trainer: {bestRun.TrainerName}");
Console.WriteLine($"Metrics of best model from validation data --");
PrintMulticlassClassificationMetrics(bestRun.ValidationMetrics);

// STEP 4: Evaluate test data (the CrossValidate call below is the one that throws)
IDataView testDataViewWithBestScore = bestRun.Model.Transform(testDataView);
var testMetrics = mlContext.MulticlassClassification.CrossValidate(testDataViewWithBestScore, bestRun.Estimator, numberOfFolds: 5, labelColumnName: "reservation_status");

EXCEPTION

Unhandled Exception: System.OverflowException: Arithmetic operation resulted in an overflow.
   at Microsoft.ML.Data.VectorDataViewType.ComputeSize(ImmutableArray`1 dims)
   at Microsoft.ML.Data.VectorDataViewType..ctor(PrimitiveDataViewType itemType, Int32[] dimensions)
   at Microsoft.ML.Transforms.KeyToVectorMappingTransformer.Mapper..ctor(KeyToVectorMappingTransformer parent, DataViewSchema inputSchema)
   at Microsoft.ML.Transforms.KeyToVectorMappingTransformer.MakeRowMapper(DataViewSchema schema)
   at Microsoft.ML.Data.RowToRowTransformerBase.GetOutputSchema(DataViewSchema inputSchema)
   at Microsoft.ML.Data.TrivialEstimator`1.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.Transforms.OneHotHashEncodingTransformer..ctor(HashingEstimator hash, IEstimator`1 keyToVector, IDataView input)
   at Microsoft.ML.Transforms.OneHotHashEncodingEstimator.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.TrainCatalogBase.CrossValidateTrain(IDataView data, IEstimator`1 estimator, Int32 numFolds, String samplingKeyColumn, Nullable`1 seed)
   at Microsoft.ML.MulticlassClassificationCatalog.CrossValidate(IDataView data, IEstimator`1 estimator, Int32 numberOfFolds, String labelColumnName, String samplingKeyColumnName, Nullable`1 seed)
   at ConsoleAppML2ML.ConsoleApp.ModelBuilder.CreateExperiment() in C:\repos\Curso ML\Bootcamp-Handytec\Clasificacion_multiclase\ConsoleAppML2\ConsoleAppML2ML.ConsoleApp\ModelBuilder.cs:line 77
   at ConsoleAppML2ML.ConsoleApp.Program.Main(String[] args) in C:\repos\Curso ML\Bootcamp-Handytec\Clasificacion_multiclase\ConsoleAppML2\ConsoleAppML2ML.ConsoleApp\Program.cs:line 20
HotelBookings.tsv.zip
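
Reading the trace from the top, the exception is raised in VectorDataViewType.ComputeSize while KeyToVectorMappingTransformer builds its output schema during OneHotHashEncodingEstimator.Fit. One plausible reading, not confirmed here, is that the vector size computed from the column's dimensions no longer fits in an Int32. The sketch below is not ML.NET code: OverflowSketch and CheckedProduct are hypothetical names, and the dimension values are made up. It only shows how a checked product of dimension sizes raises the same exception type in C#.

```csharp
using System;

// Illustration only: this is NOT the ML.NET implementation.
// OverflowSketch and CheckedProduct are hypothetical names for this sketch.
public static class OverflowSketch
{
    // Multiplies the dimension sizes in a checked context, so a result
    // larger than Int32.MaxValue throws System.OverflowException instead
    // of silently wrapping around.
    public static int CheckedProduct(int[] dims)
    {
        int size = 1;
        foreach (int d in dims)
        {
            size = checked(size * d);
        }
        return size;
    }
}
```

With these assumed inputs, OverflowSketch.CheckedProduct(new[] { 4, 8 }) returns 32, while OverflowSketch.CheckedProduct(new[] { 65536, 65536 }) throws System.OverflowException, because 65536 * 65536 = 2^32 exceeds Int32.MaxValue (2^31 - 1).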

Labels

P1 (Priority of the issue for triage purpose: Needs to be fixed soon.)
