See:
    /// <summary>
    /// When the number of categories of one feature is smaller than or equal to <see cref="MaximumCategoricalSplitPointCount"/>,
    /// one-vs-other split algorithm will be used.
    /// </summary>
    [Argument(ArgumentType.AtMostOnce, HelpText = "Max number of categorical thresholds.", ShortName = "maxcat")]
    [TlcModule.Range(Inf = 0, Max = int.MaxValue)]
    [TlcModule.SweepableDiscreteParam("MaxCatThreshold", new object[] { 8, 16, 32, 64 })]
    public int MaximumCategoricalSplitPointCount = 32;
and
    private protected static Dictionary<string, string> NameMapping = new Dictionary<string, string>()
    {
        {nameof(MinimumExampleCountPerLeaf), "min_data_per_leaf"},
        {nameof(NumberOfLeaves), "num_leaves"},
        {nameof(MaximumBinCountPerFeature), "max_bin" },
        {nameof(MinimumExampleCountPerGroup), "min_data_per_group" },
        {nameof(MaximumCategoricalSplitPointCount), "max_cat_threshold" },
        {nameof(CategoricalSmoothing), "cat_smooth" },
        {nameof(L2CategoricalRegularization), "cat_l2" },
        {nameof(HandleMissingValue), "use_missing" }
    };
and
https://lightgbm.readthedocs.io/en/latest/Parameters.html#max_cat_threshold :
max_cat_threshold, default = 32, type = int, constraints: max_cat_threshold > 0
- used for the categorical features
- limit the max threshold points in categorical features
The doc comment on MaximumCategoricalSplitPointCount got mixed up with the description of a different parameter:
https://lightgbm.readthedocs.io/en/latest/Parameters.html#max_cat_to_onehot
when number of categories of one feature smaller than or equal to max_cat_to_onehot, one-vs-other split algorithm will be used
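A possible correction could look like the sketch below. It is only an illustration, not the actual ML.NET source: the class `CategoricalOptionsSketch`, the field `MaximumCategoryCountForOneVsOther`, and the helper `UsesOneVsOtherSplit` are hypothetical names, and the default of 4 is assumed from the LightGBM documentation for max_cat_to_onehot. The point is that the summary for `MaximumCategoricalSplitPointCount` should describe max_cat_threshold, while the one-vs-other rule belongs to a separate setting.

```csharp
using System;

// Hypothetical stand-in for the trainer options; not the actual ML.NET source.
public static class CategoricalOptionsSketch
{
    /// <summary>
    /// Limits the maximum number of threshold points considered for categorical features.
    /// Corresponds to LightGBM's max_cat_threshold.
    /// </summary>
    public static int MaximumCategoricalSplitPointCount = 32;

    /// <summary>
    /// Hypothetical option corresponding to LightGBM's max_cat_to_onehot
    /// (default of 4 assumed from the LightGBM documentation).
    /// </summary>
    public static int MaximumCategoryCountForOneVsOther = 4;

    // Mirrors the max_cat_to_onehot rule quoted above: one-vs-other is used
    // when the category count is smaller than or equal to the limit.
    public static bool UsesOneVsOtherSplit(int categoryCount) =>
        categoryCount <= MaximumCategoryCountForOneVsOther;
}
```

With these assumed defaults, a feature with 4 categories would use the one-vs-other split and a feature with 5 would not, independently of `MaximumCategoricalSplitPointCount`.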
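Source locations of the two quoted ML.NET snippets:
machinelearning/src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs, lines 189 to 196 in 8b1b14f (the MaximumCategoricalSplitPointCount field)
machinelearning/src/Microsoft.ML.LightGbm/LightGbmTrainerBase.cs, lines 49 to 59 in 8b1b14f (the NameMapping dictionary)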