SymSGD IndexOutOfRangeException #3887

I get an `IndexOutOfRangeException` when training OVA-SymSGD on an internal dataset. Other learners, such as SDCA and OVA-AveragedPerceptron, train successfully (though LightGBM fails due to https://github.com/dotnet/machinelearning/issues/1625).

## Error
```text
Exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at Microsoft.ML.Trainers.SymbolicSgdLogisticRegressionBinaryTrainer.Native.LearnAll(InputDataManager inputDataManager, Boolean tuneLR, Single& lr, Single l2Const, Single piw, Span`1 weightVector, Single& bias, Int32 numFeatres, Int32 numPasses, Int32 numThreads, Boolean tuneNumLocIter, Int32& numLocIter, Single tolerance, Boolean needShuffle, Boolean shouldInitialize, GCHandle stateGCHandle, ChannelCallBack info)
   at Microsoft.ML.Trainers.SymbolicSgdLogisticRegressionBinaryTrainer.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount)
   at Microsoft.ML.Trainers.SymbolicSgdLogisticRegressionBinaryTrainer.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Trainers.OneVersusAllTrainer.TrainOne(IChannel ch, ITrainerEstimator`2 trainer, RoleMappedData data, Int32 cls)
   at Microsoft.ML.Trainers.OneVersusAllTrainer.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, AutoMLLogger logger)
```

## Pipeline
Below is the same pipeline with SDCA as the trainer instead of OVA-SymSGD; this version runs successfully.
```C#
var dataProcessPipeline = mlContext.Transforms.Conversion.MapValueToKey("label_col", "label_col")
                  .Append(mlContext.Transforms.Categorical.OneHotEncoding(new[] { new InputOutputColumnPair("col1", "col1"), new InputOutputColumnPair("col2", "col2"), new InputOutputColumnPair("col3", "col3"), new InputOutputColumnPair("col4", "col4"), new InputOutputColumnPair("col5", "col5") }))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col6_tf", "col6"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col7_tf", "col7"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col8_tf", "col8"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col9_tf", "col9"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col10_tf", "col10"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col11_tf", "col11"))
                  .Append(mlContext.Transforms.Text.FeaturizeText("col12_tf", "col12"))
                  .Append(mlContext.Transforms.Concatenate("Features", new[] { "col1", "col2", "col3", "col4", "col5", "col6_tf", "col7_tf", "col8_tf", "col9_tf", "col10_tf", "col11_tf", "col12_tf", "col13", "col14", "col15", "col16", "col17", "col18", "col19", "col20", "col21" }))
                  .Append(mlContext.Transforms.NormalizeMinMax("Features", "Features"))
                  .AppendCacheCheckpoint(mlContext);

// Set the training algorithm
var trainer = mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy(labelColumnName: "label_col", featureColumnName: "Features")
                  .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));
var trainingPipeline = dataProcessPipeline.Append(trainer);
```
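
For comparison, here is a rough sketch of what the failing OVA-SymSGD variant would look like if written by hand against the SDCA pipeline above. This is an assumption based on the trainer names in the stack trace: the failing run was generated by AutoML, so the exact hyperparameters it chose are not shown here and the defaults below may not reproduce the exception. The names `symSgd`, `failingTrainer`, and `failingPipeline` are illustrative only.

```C#
// Assumes the Microsoft.ML.Mkl.Components NuGet package is referenced
// (it provides SymbolicSgdLogisticRegression) and that mlContext and
// dataProcessPipeline are defined as in the SDCA snippet above.

// Binary SymSGD learner with default options (AutoML's actual options are unknown).
var symSgd = mlContext.BinaryClassification.Trainers.SymbolicSgdLogisticRegression(
    labelColumnName: "label_col", featureColumnName: "Features");

// Wrap the binary learner in one-versus-all to handle the multiclass label.
var failingTrainer = mlContext.MulticlassClassification.Trainers.OneVersusAll(symSgd, labelColumnName: "label_col")
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));

var failingPipeline = dataProcessPipeline.Append(failingTrainer);
```

Calling `Fit` on this pipeline follows the same `OneVersusAllTrainer.Fit` -> `SymbolicSgdLogisticRegressionBinaryTrainer.TrainCore` path that appears in the stack trace.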
