LightGBM option MaximumCategoricalSplitPointCount has wrong doc #3687

@rauhs

See the doc comment on MaximumCategoricalSplitPointCount:

/// <summary>
/// When the number of categories of one feature is smaller than or equal to <see cref="MaximumCategoricalSplitPointCount"/>,
/// one-vs-other split algorithm will be used.
/// </summary>
[Argument(ArgumentType.AtMostOnce, HelpText = "Max number of categorical thresholds.", ShortName = "maxcat")]
[TlcModule.Range(Inf = 0, Max = int.MaxValue)]
[TlcModule.SweepableDiscreteParam("MaxCatThreshold", new object[] { 8, 16, 32, 64 })]
public int MaximumCategoricalSplitPointCount = 32;

and the name mapping, which ties this option to LightGBM's max_cat_threshold:

private protected static Dictionary<string, string> NameMapping = new Dictionary<string, string>()
{
    {nameof(MinimumExampleCountPerLeaf), "min_data_per_leaf"},
    {nameof(NumberOfLeaves), "num_leaves"},
    {nameof(MaximumBinCountPerFeature), "max_bin" },
    {nameof(MinimumExampleCountPerGroup), "min_data_per_group" },
    {nameof(MaximumCategoricalSplitPointCount), "max_cat_threshold" },
    {nameof(CategoricalSmoothing), "cat_smooth" },
    {nameof(L2CategoricalRegularization), "cat_l2" },
    {nameof(HandleMissingValue), "use_missing" }
};

and the LightGBM documentation for max_cat_threshold
(https://lightgbm.readthedocs.io/en/latest/Parameters.html#max_cat_threshold):

max_cat_threshold, default = 32, type = int, constraints: max_cat_threshold > 0
- used for the categorical features
- limit the max threshold points in categorical features

The summary text got mixed up with the description of a different parameter, max_cat_to_onehot:

https://lightgbm.readthedocs.io/en/latest/Parameters.html#max_cat_to_onehot

when number of categories of one feature smaller than or equal to max_cat_to_onehot, one-vs-other split algorithm will be used
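
One possible wording for a corrected summary is sketched below. The class name is hypothetical, and the ML.NET attributes from the original declaration are left out so the snippet compiles on its own. The description only paraphrases the max_cat_threshold entry quoted above and is not an official wording.

```csharp
// Hypothetical container class, used only so the snippet is self-contained.
public sealed class CategoricalOptionsSketch
{
    /// <summary>
    /// Maximum number of threshold (split) points considered for a categorical feature.
    /// Mapped to the LightGBM parameter <c>max_cat_threshold</c>.
    /// </summary>
    public int MaximumCategoricalSplitPointCount = 32;
}
```

The default of 32 matches both the current field initializer and the LightGBM default for max_cat_threshold, so only the summary text would need to change.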

Labels: documentation (Related to documentation of ML.NET)
