Description
While fixing #2530 in PR #2650, I discovered that when a LightGbmRankingTrainer is created with MLContext.Ranking.Trainers.LightGbm(Options options), the GroupId column is never set, even though it is specified in the Options.
Digging a bit deeper, I found that LightGbmTrainerBase.cs has this constructor:
private protected LightGbmTrainerBase(IHostEnvironment env, string name, Options options, SchemaShape.Column label)
: base(Contracts.CheckRef(env, nameof(env)).Register(name), TrainerUtils.MakeR4VecFeature(options.FeatureColumn), label, TrainerUtils.MakeR4ScalarWeightColumn(options.WeightColumn))
This does not use options.GroupIdColumn to create a SchemaShape.Column for GroupId to pass to the base constructor TrainerEstimatorBaseWithGroupId. Nor is there a method in TrainerUtils to create such a SchemaShape.Column object from options.GroupIdColumn for Key types such as GroupId.
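To make the missing step concrete, here is a minimal, self-contained sketch of what forwarding the group column could look like. It does not use the real ML.NET types: ColumnShape, SketchOptions, SketchTrainerUtils, SketchTrainerBase and SketchRankingTrainer are stand-ins invented for illustration, and MakeKeyScalarGroupIdColumn is a hypothetical helper named by analogy with MakeR4ScalarWeightColumn, not an existing TrainerUtils method. Only the exception type and message are taken from the stack trace in this issue.

```csharp
using System;

// Stand-in for SchemaShape.Column: only the fields this sketch needs.
public sealed class ColumnShape
{
    public string Name { get; }
    public bool IsKey { get; }

    public ColumnShape(string name, bool isKey)
    {
        Name = name;
        IsKey = isKey;
    }
}

// Stand-in for the trainer options; only GroupIdColumn matters here.
public sealed class SketchOptions
{
    public string GroupIdColumn { get; set; }
}

public static class SketchTrainerUtils
{
    // Hypothetical helper: builds a key-typed scalar column description
    // from the column name, or returns null when no name was given.
    public static ColumnShape MakeKeyScalarGroupIdColumn(string groupIdColumn)
    {
        return groupIdColumn == null ? null : new ColumnShape(groupIdColumn, true);
    }
}

// Stand-in for a base class that stores the group column it is given.
public abstract class SketchTrainerBase
{
    protected ColumnShape GroupId { get; }

    protected SketchTrainerBase(ColumnShape groupId)
    {
        GroupId = groupId;
    }
}

public sealed class SketchRankingTrainer : SketchTrainerBase
{
    // Forwarding options.GroupIdColumn to the base constructor is the
    // step this issue reports as missing in LightGbmTrainerBase.
    public SketchRankingTrainer(SketchOptions options)
        : base(SketchTrainerUtils.MakeKeyScalarGroupIdColumn(options.GroupIdColumn))
    {
    }

    // Mirrors the failing check: without a group column, training cannot start.
    public string CheckDataValid()
    {
        if (GroupId == null)
            throw new ArgumentOutOfRangeException("data", "Need a group column.");
        return GroupId.Name;
    }
}

public static class Program
{
    public static void Main()
    {
        var options = new SketchOptions { GroupIdColumn = "GroupId" };
        var trainer = new SketchRankingTrainer(options);
        Console.WriteLine(trainer.CheckDataValid());
    }
}
```

If the derived constructor passed null to the base instead of calling the helper, CheckDataValid would throw the same kind of exception as in the repro, which is the behaviour described above.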
Repro:
Run the sample in Microsoft.ML.Samples.Dynamic.LightGbmRankingWithOptions.Example(). It fails with:
Unhandled Exception: System.ArgumentOutOfRangeException: Need a group column.
Parameter name: data
at Microsoft.ML.Contracts.CheckParam(IExceptionContext ctx, Boolean f, String paramName, String msg) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.Core\Utilities\Contracts.cs:line 543
at Microsoft.ML.LightGBM.LightGbmRankingTrainer.CheckDataValid(IChannel ch, RoleMappedData data) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.LightGBM\LightGbmRankingTrainer.cs:line 127
at Microsoft.ML.LightGBM.LightGbmTrainerBase`3.LoadTrainingData(IChannel ch, RoleMappedData trainData, CategoricalMetaData& catMetaData) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 313
at Microsoft.ML.LightGBM.LightGbmTrainerBase`3.TrainModelCore(TrainContext context) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 113
at Microsoft.ML.Training.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 148
at Microsoft.ML.Training.TrainerEstimatorBase`2.Fit(IDataView input) in C:\najeeb-kazmi\machinelearning\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 75
at Microsoft.ML.Samples.Dynamic.LightGbmRankingWithOptions.Example() in C:\najeeb-kazmi\machinelearning\docs\samples\Microsoft.ML.Samples\Dynamic\Trainers\Ranking\LightGBMRankingWithOptions.cs:line 31
at Microsoft.ML.Samples.Program.Main(String[] args) in C:\najeeb-kazmi\machinelearning\docs\samples\Microsoft.ML.Samples\Program.cs:line 9