
Spike and slab feature sampling priors (feature weighted sampling) #2542

Closed

Description

Hi @guolinke ,

Having been thinking about feature_bagging, and cross-referencing with permutation importances and other feature-selection tools for decision-tree ensemble models, I have come up with an idea for a possible LightGBM feature.

There could be an option to perform feature bagging according to assigned inclusion probabilities for each feature, on each boosting iteration or each random-forest tree.

It would work as a kind of proxy to the spike-and-slab technique. I do not know whether it would improve on the fully random option, since it would also produce trees that use a different number of features in each iteration, but it would include the a priori best variables more often.

The main idea is to run LightGBM, compute permutation importances, set spike-and-slab inclusion probabilities based on those permutation importances, and then run LightGBM again with those feature-sampling prior probabilities.
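To make the proposal concrete, here is a minimal sketch of the sampling step. It assumes the permutation importances have already been computed; `inclusion_probs` and `sample_features` are hypothetical helper names, and the uniform `floor` (the "spike" keeping every feature some chance of inclusion) is my own assumption, not an existing LightGBM parameter.

```python
import random

def inclusion_probs(importances, floor=0.05):
    # Hypothetical helper: map permutation importances to per-feature
    # inclusion probabilities, blended with a uniform floor so that
    # every feature keeps some chance of being sampled.
    total = sum(max(v, 0.0) for v in importances)
    if total == 0.0:
        return [1.0 / len(importances)] * len(importances)
    scaled = [max(v, 0.0) / total for v in importances]
    return [floor + (1.0 - floor) * p for p in scaled]

def sample_features(probs, rng):
    # One independent Bernoulli draw per feature, repeated on every
    # boosting iteration, so trees can use different feature subsets.
    chosen = [i for i, p in enumerate(probs) if rng.random() < p]
    # Guard against an empty draw: fall back to the top feature.
    return chosen or [max(range(len(probs)), key=probs.__getitem__)]

rng = random.Random(42)
probs = inclusion_probs([0.6, 0.3, 0.0, 0.1])  # example importances
subset = sample_features(probs, rng)
```

Note that the blended probabilities intentionally do not sum to one: they are per-feature Bernoulli parameters, so the number of features per tree varies from iteration to iteration, as described above.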

Maybe I am asking for an already existing feature such as feature_weight, but I have not managed to find it; excuse me if that is the case.

Regards.

