[ML] Add information about samples per node to the tree #850

Closed
@valeriy42

Description

I propose extending the node definition with the number of training samples that passed through the node. This information provides one of the simpler feature importance indicators (see xgboost cover). It also helps to test for and prevent degenerate trees such as those described in #849.
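
To make the proposal concrete, here is a minimal sketch in Python of what a per-node sample count could look like and how it might be used. Everything in it is an assumption for illustration: the `Node` layout, the field name `number_samples`, and the helpers `count_samples`, `cover_importance` and `unreached_nodes` are hypothetical and do not reflect the actual schema or node logic. The importance shown is a plain sum of raw counts per split feature, only loosely in the spirit of the xgboost cover measure mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    # Hypothetical node layout, not the real schema.
    split_feature: Optional[int] = None  # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0
    number_samples: int = 0  # training rows that passed through this node


def count_samples(root: Node, rows: List[List[float]]) -> None:
    """Route each training row from the root to a leaf, incrementing counts."""
    for row in rows:
        node = root
        while node is not None:
            node.number_samples += 1
            if node.split_feature is None:
                break
            go_left = row[node.split_feature] < node.threshold
            node = node.left if go_left else node.right


def cover_importance(root: Node) -> Dict[int, int]:
    """Sum number_samples over all split nodes, grouped by split feature."""
    importance: Dict[int, int] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None or node.split_feature is None:
            continue
        feature = node.split_feature
        importance[feature] = importance.get(feature, 0) + node.number_samples
        stack.extend([node.left, node.right])
    return importance


def unreached_nodes(root: Node) -> int:
    """Count nodes that no training row reached, one simple degeneracy check."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if node.number_samples == 0:
            count += 1
        stack.extend([node.left, node.right])
    return count
```

With counts stored on the nodes, a unit test could assert that `unreached_nodes(tree) == 0` after training, which is one way the degenerate cases from #849 might be caught.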

Sub-Tasks

  • Update schema definition
  • Update Node logic
  • Update inference model code
  • Update persist/restore schema version
  • Add enhancement note
