AutoML experiments in non-declarative style not working #6446

Closed
@thoron

System Information:

  • OS & Version: Windows 11
  • ML.NET Version: 2.0
  • .NET Version: .NET 6

Describe the bug
The "old" (non-declarative) AutoML experiments do not work with every trainer. Calling CreateMulticlassClassificationExperiment or CreateRegressionExperiment, instead of using the new declarative style, throws an exception, possibly because the training data uses a custom schema.

The exception is thrown from the constructor of the new SweepablePipeline:

System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.ML.AutoML.SweepablePipeline..ctor(Dictionary`2 estimators, Entity schema, String currentSchema)
   at Microsoft.ML.AutoML.SweepablePipeline.AppendEntity(Boolean allowSkip, Entity entity)
   at Microsoft.ML.AutoML.AutoCatalog.MultiClassification(String labelColumnName, String featureColumnName, String exampleWeightColumnName, Boolean useFastForest, Boolean useLgbm, Boolean useFastTree, Boolean useLbfgs, Boolean useSdca, FastTreeOption fastTreeOption, LgbmOption lgbmOption, FastForestOption fastForestOption, LbfgsOption lbfgsOption, SdcaOption sdcaOption, SearchSpace`1 fastTreeSearchSpace, SearchSpace`1 lgbmSearchSpace, SearchSpace`1 fastForestSearchSpace, SearchSpace`1 lbfgsSearchSpace, SearchSpace`1 sdcaSearchSpace)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.CreateMulticlassClassificationPipeline(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.AutoML.MulticlassClassificationExperiment.Execute(IDataView trainData, String labelColumnName, String samplingKeyColumn, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)

To Reproduce

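using Microsoft.ML;
using Microsoft.ML.AutoML;

// ctx is an MLContext; trainDv is the training IDataView built with the custom schema shown below.
var experimentSettings = new MulticlassExperimentSettings
{
  MaxExperimentTimeInSeconds = 30,
  OptimizingMetric = MulticlassClassificationMetric.MacroAccuracy
};
// Restrict the experiment to a single trainer.
experimentSettings.Trainers.Clear();
experimentSettings.Trainers.Add(MulticlassClassificationTrainer.LbfgsMaximumEntropy);
// Throws the NullReferenceException shown above.
ctx.Auto().CreateMulticlassClassificationExperiment(experimentSettings).Execute(trainDv);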

The training data uses a custom schema:

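// SchemaDefinition, VectorDataViewType, and NumberDataViewType live in Microsoft.ML.Data.
// numberOfFeatures is the fixed length of the Features vector.
var schemaDef = SchemaDefinition.Create(typeof(ModelInput));
schemaDef["Features"].ColumnType = new VectorDataViewType(NumberDataViewType.Single, numberOfFeatures);
schemaDef.Remove(schemaDef["LabelFeaturized"]); // removed when not applicable

// Input row type used to create the schema definition above.
public class ModelInput
{
  public uint Label;
  public float LabelFeaturized;
  public float[] Features;
}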

Expected behavior
CreateMulticlassClassificationExperiment and CreateRegressionExperiment should keep working without regressions.

Metadata

Labels

AutoML.NET: Automating various steps of the machine learning process
