
Spike and slab feature sampling priors (feature weighted sampling) #2542

Closed

Description

Hi @guolinke ,

Having been thinking about feature_bagging, and cross-referencing with permutation importances and other feature-selection tools for decision-tree ensemble models, I have come up with an idea for a possible LightGBM feature.

There could be an option to perform feature bagging according to assigned inclusion probabilities for each feature, on each boosting iteration or each random-forest tree.

It would work as a kind of proxy to the spike-and-slab technique. I do not know whether it would improve on the fully random option, since it would also produce trees that use a different number of features in each iteration, but it would include the a priori best variables more often.

The main idea is to run LightGBM, compute permutation importances, set spike-and-slab inclusion probabilities based on those permutation importances, and then run LightGBM again with those feature-sampling prior probabilities.
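To make the proposal concrete, here is a minimal sketch of the sampling step. It assumes the permutation importances have already been computed; `inclusion_probs` and `sample_features` are hypothetical helper names, and the uniform `floor` (the "spike" keeping every feature some chance of inclusion) is my own assumption, not an existing LightGBM parameter.

```python
import random

def inclusion_probs(importances, floor=0.05):
    # Hypothetical helper: map permutation importances to per-feature
    # inclusion probabilities, blended with a uniform floor so that
    # every feature keeps some chance of being sampled.
    total = sum(max(v, 0.0) for v in importances)
    if total == 0.0:
        return [1.0 / len(importances)] * len(importances)
    scaled = [max(v, 0.0) / total for v in importances]
    return [floor + (1.0 - floor) * p for p in scaled]

def sample_features(probs, rng):
    # One independent Bernoulli draw per feature, repeated on every
    # boosting iteration, so trees can use different feature subsets.
    chosen = [i for i, p in enumerate(probs) if rng.random() < p]
    # Guard against an empty draw: fall back to the top feature.
    return chosen or [max(range(len(probs)), key=probs.__getitem__)]

rng = random.Random(42)
probs = inclusion_probs([0.6, 0.3, 0.0, 0.1])  # example importances
subset = sample_features(probs, rng)
```

Note that the blended probabilities intentionally do not sum to one: they are per-feature Bernoulli parameters, so the number of features per tree varies from iteration to iteration, as described above.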

Maybe I am asking for an already existing feature such as feature_weight, but I have not managed to find it; excuse me if that is the case.

Regards.

