LightGBM trainer exception #1625

Closed
@daholste

Description

System information

  • OS version/distro: Windows 10
  • .NET Version (e.g., dotnet --info): .NET Core 2.1

Issue

  • What did you do?
    Ran MML command line: execgraph "C:\Benchmarking\automl_graph.json"

Contents of automl_graph.json:

{
  "Inputs": {
    "file_train": "D:\\SplitDatasets\\ExcitementFG2_train.csv",
    "file_test": "D:\\SplitDatasets\\ExcitementFG2_valid.csv"
  },
  "Nodes": [
    {
      "Inputs": {
        "CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
        "InputFile": "$file_train"
      },
      "Name": "Data.CustomTextLoader",
      "Outputs": {
        "Data": "$data_train"
      }
    },
    {
      "Inputs": {
        "CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
        "InputFile": "$file_test"
      },
      "Name": "Data.CustomTextLoader",
      "Outputs": {
        "Data": "$data_test"
      }
    },
    {
      "Inputs": {
        "BatchSize": 3,
        "StateArguments": {
          "Name": "AutoMlState",
          "Settings": {
            "Engine": {
              "Name": "Rocket",
              "Settings": {}
            },
            "Metric": "Accuracy",
            "TerminatorArgs": {
              "Name": "IterationLimited",
              "Settings": {
                "FinalHistoryLength": 100
              }
            },
            "TrainerKind": "SignatureBinaryClassifierTrainer"
          }
        },
        "TestingData": "$data_test",
        "TrainingData": "$data_train",
        "IgnoreColumns": ["cost"]
      },
      "Name": "Models.PipelineSweeper",
      "Outputs": {
        "Results": "$output_data",
        "State": "$xyz"
      }
    }
  ],
  "Outputs": {
    "output_data": "C:\\Benchmarking\\01-ResultsOut.csv"
  }
}
  • What happened?
    Encountered an exception in the LightGBM trainer

  • What did you expect?
    A run to completion, without an exception
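
To make the column layout in the graph easier to check, here is a minimal Python sketch that reads the CustomSchema string used by both loader nodes and reports how many columns each entry covers. The helper name parse_columns is hypothetical, and the sketch assumes every entry has the shape col=Name:Type:Start[-End], which is all that appears in this issue; it is not the loader's actual parser.

```python
import re

# Schema string copied verbatim from the two Data.CustomTextLoader nodes.
SCHEMA = "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+"


def parse_columns(schema):
    """Return {name: (type, first_index, last_index)} for each col= entry.

    Assumption: entries look like col=Name:Type:Start or col=Name:Type:Start-End.
    """
    columns = {}
    pattern = r"col=(\w+):(\w+):(\d+)(?:-(\d+))?"
    for name, kind, start, end in re.findall(pattern, schema):
        first = int(start)
        # A single index (no "-End" part) means a one-column range.
        last = int(end) if end else first
        columns[name] = (kind, first, last)
    return columns


columns = parse_columns(SCHEMA)
# Number of source columns covered by each entry.
widths = {name: last - first + 1 for name, (_, first, last) in columns.items()}

for name, (kind, first, last) in columns.items():
    print(f"{name}: type={kind} columns {first}..{last} ({widths[name]} wide)")
```

Under that reading, Label maps to column 78, Features1 to columns 0-77, and Features2 to columns 79-202. The schema defines no column named "cost", although the sweeper node lists it under IgnoreColumns; whether that is related to the exception is not established here.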

Source code / logs

--- Command line args ---
dotnet MML.dll execgraph C:\Benchmarking\automl_graph.json

--- Exception message ---

System.InvalidOperationException
  HResult=0x80131509
  Message=Categorical split features is zero length
  Source=Microsoft.ML.Core
  StackTrace:
   at Microsoft.ML.Runtime.Contracts.Check(Boolean f, String msg) in C:\MLDotNet\src\Microsoft.ML.Core\Utilities\Contracts.cs:line 497
   at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree.CheckValid(Action`2 checker) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 469
   at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree..ctor(Int32[] splitFeatures, Double[] splitGain, Double[] gainPValue, Single[] rawThresholds, Single[] defaultValueForMissing, Int32[] lteChild, Int32[] gtChild, Double[] leafValues, Int32[][] categoricalSplitFeatures, Boolean[] categoricalSplit) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 223
   at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree.Create(Int32 numLeaves, Int32[] splitFeatures, Double[] splitGain, Single[] rawThresholds, Single[] defaultValueForMissing, Int32[] lteChild, Int32[] gtChild, Double[] leafValues, Int32[][] categoricalSplitFeatures, Boolean[] categoricalSplit) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 189
   at Microsoft.ML.Runtime.LightGBM.Booster.GetModel(Int32[] categoricalFeatureBoudaries) in C:\MLDotNet\src\Microsoft.ML.LightGBM\WrappedLightGbmBooster.cs:line 241
   at Microsoft.ML.Runtime.LightGBM.LightGbmTrainerBase`3.TrainCore(IChannel ch, IProgressChannel pch, Dataset dtrain, CategoricalMetaData catMetaData, Dataset dvalid) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 378
   at Microsoft.ML.Runtime.LightGBM.LightGbmTrainerBase`3.TrainModelCore(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 126
   at Microsoft.ML.Runtime.Training.TrainerEstimatorBase`2.Train(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 92
   at Microsoft.ML.Runtime.Training.TrainerEstimatorBase`2.Microsoft.ML.Runtime.ITrainer.Train(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 158
   at Microsoft.ML.Runtime.Data.TrainUtils.TrainCore(IHostEnvironment env, IChannel ch, RoleMappedData data, ITrainer trainer, RoleMappedData validData, IComponentFactory`1 calibrator, Int32 maxCalibrationExamples, Nullable`1 cacheData, IPredictor inputPredictor) in C:\MLDotNet\src\Microsoft.ML.Data\Commands\TrainCommand.cs:line 254
   at Microsoft.ML.Runtime.Data.TrainUtils.Train(IHostEnvironment env, IChannel ch, RoleMappedData data, ITrainer trainer, IComponentFactory`1 calibrator, Int32 maxCalibrationExamples) in C:\MLDotNet\src\Microsoft.ML.Data\Commands\TrainCommand.cs:line 223
   at Microsoft.ML.Runtime.EntryPoints.LearnerEntryPointsUtils.Train[TArg,TOut](IHost host, TArg input, Func`1 createTrainer, Func`1 getLabel, Func`1 getWeight, Func`1 getGroup, Func`1 getName, Func`1 getCustom, ICalibratorTrainerFactory calibrator, Int32 maxCalibrationExamples) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\InputBase.cs:line 189
   at Microsoft.ML.Runtime.LightGBM.LightGbm.TrainBinary(IHostEnvironment env, LightGbmArguments input) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmBinaryTrainer.cs:line 189
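
The message "Categorical split features is zero length" is raised from RegressionTree.CheckValid while the tree is constructed from the categoricalSplitFeatures and categoricalSplit arrays shown in the trace. The Python sketch below illustrates one plausible reading of that check: every node flagged as a categorical split must carry a non-empty feature list. This is a guess based only on the message text and the constructor parameter names, not a copy of the ML.NET code; the function name and the sample tree are hypothetical.

```python
def find_empty_categorical_splits(categorical_split, categorical_split_features):
    """Return indices of nodes flagged categorical whose feature list is missing or empty.

    Assumed invariant (inferred, not taken from the ML.NET source): if
    categorical_split[i] is True, categorical_split_features[i] must be non-empty.
    """
    bad = []
    for node, is_categorical in enumerate(categorical_split):
        if not is_categorical:
            continue  # numeric splits carry no categorical feature list
        features = categorical_split_features[node]
        if features is None or len(features) == 0:
            bad.append(node)
    return bad


# Hypothetical three-node tree: node 1 is a well-formed categorical split,
# node 2 is flagged categorical but has no features attached.
flags = [False, True, True]
features = [None, [4, 7], []]

offending = find_empty_categorical_splits(flags, features)
print("nodes violating the assumed invariant:", offending)
```

If the real check works along these lines, the exception would mean that Booster.GetModel produced at least one node of this second kind when converting the trained LightGBM model.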

Metadata

Labels

  • P0: Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.
  • bug: Something isn't working