Description
System Information:
- OS & Version: Windows 11
- ML.NET Version: 2.0
- .NET Version: .NET 6
Describe the bug
Running the "old" AutoML experiment API fails for some trainers. Using CreateMulticlassClassificationExperiment or CreateRegressionExperiment instead of the new declarative style results in an exception, possibly because of the custom schema.
The exception is thrown from the constructor of the new SweepablePipeline:
System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.ML.AutoML.SweepablePipeline..ctor(Dictionary`2 estimators, Entity schema, String currentSchema)
   at Microsoft.ML.AutoML.SweepablePipeline.AppendEntity(Boolean allowSkip, Entity entity)
   at Microsoft.ML.AutoML.AutoCatalog.MultiClassification(String labelColumnName, String featureColumnName, String exampleWeightColumnName, Boolean useFastForest, Boolean useLgbm, Boolean useFastTree, Boolean useLbfgs, Boolean useSdca, FastTreeOption fastTreeOption, LgbmOption lgbmOption, FastForestOption fastForestOption, LbfgsOption lbfgsOption, SdcaOption sdcaOption, SearchSpace`1 fastTreeSearchSpace, SearchSpace`1 lgbmSearchSpace, SearchSpace`1 fastForestSearchSpace, SearchSpace`1 lbfgsSearchSpace, SearchSpace`1 sdcaSearchSpace)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.CreateMulticlassClassificationPipeline(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.Execute(IDataView trainData, String labelColumnName, String samplingKeyColumn, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
To Reproduce
var experimentSettings = new MulticlassExperimentSettings
{
    MaxExperimentTimeInSeconds = 30,
    OptimizingMetric = MulticlassClassificationMetric.MacroAccuracy
};

// Restrict the experiment to a single trainer.
experimentSettings.Trainers.Clear();
experimentSettings.Trainers.Add(MulticlassClassificationTrainer.LbfgsMaximumEntropy);

// ctx is the MLContext; trainDv is the training IDataView.
ctx.Auto().CreateMulticlassClassificationExperiment(experimentSettings).Execute(trainDv);
The training data (trainDv) uses a custom schema definition:
var schemaDef = SchemaDefinition.Create(typeof(ModelInput));
schemaDef["Features"].ColumnType = new VectorDataViewType(NumberDataViewType.Single, numberOfFeatures);
schemaDef.Remove(schemaDef["LabelFeaturized"]); // removed when not applicable

public class ModelInput
{
    public uint Label;
    public float LabelFeaturized;
    public float[] Features;
}
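The snippets above do not show how trainDv is built from schemaDef. The following is a minimal sketch of that step, assuming the rows are held in memory and loaded with LoadFromEnumerable. The feature count of 4 and the sample rows are made up for illustration and are not taken from the original project.

```csharp
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical feature count; the report does not state the real value.
const int numberOfFeatures = 4;

var ctx = new MLContext(seed: 0);

// Same schema customization as in the report: fix the vector size of
// "Features" and drop the unused "LabelFeaturized" column.
var schemaDef = SchemaDefinition.Create(typeof(ModelInput));
schemaDef["Features"].ColumnType = new VectorDataViewType(NumberDataViewType.Single, numberOfFeatures);
schemaDef.Remove(schemaDef["LabelFeaturized"]);

// Made-up rows, only here so the data view has something to enumerate.
var rows = new List<ModelInput>
{
    new ModelInput { Label = 1, Features = new float[] { 0.1f, 0.2f, 0.3f, 0.4f } },
    new ModelInput { Label = 2, Features = new float[] { 0.5f, 0.6f, 0.7f, 0.8f } },
    new ModelInput { Label = 3, Features = new float[] { 0.9f, 1.0f, 1.1f, 1.2f } },
};

// Assumption: the data view is created with the custom schema definition.
IDataView trainDv = ctx.Data.LoadFromEnumerable(rows, schemaDef);

public class ModelInput
{
    public uint Label;
    public float LabelFeaturized;
    public float[] Features;
}
```

A trainDv built this way is what gets passed to Execute in the reproduction steps. Whether the original project loads its data the same way is not stated in the report.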
Expected behavior
No regression is expected for CreateMulticlassClassificationExperiment and CreateRegressionExperiment.