Log warnings for number of bins of categorical features #4448

Merged · 4 commits · Mar 27, 2022

@shiyu1994
Collaborator

shiyu1994 commented Jul 6, 2021

This is to fix #3735, by

  1. Logging a warning when the number of bins of a categorical feature exceeds the configured maximum number of bins.
  2. Considering only feature groups that contain no categorical features when checking the bin numbers of feature groups in the GPU versions (both GPU and CUDA). That check exists for better performance, to avoid bin numbers 17 and 65.
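
To illustrate the second point, here is a minimal Python sketch of the filtering rule only. It is not the LightGBM C++ code: `FeatureGroup`, `has_categorical`, and `groups_with_awkward_bins` are hypothetical names made up for this example, and the sketch only flags groups, it does not show what is done with them afterwards.

```python
from dataclasses import dataclass

# Bin counts that the GPU check tries to avoid, as stated in the description.
AWKWARD_BIN_COUNTS = frozenset({17, 65})


@dataclass
class FeatureGroup:
    """Hypothetical stand-in for a feature group; not a LightGBM type."""
    num_bins: int
    has_categorical: bool


def groups_with_awkward_bins(groups):
    """Return indices of groups that have 17 or 65 bins.

    Groups containing any categorical feature are skipped, because their
    bin count is driven by the number of categories.
    """
    return [
        index
        for index, group in enumerate(groups)
        if not group.has_categorical and group.num_bins in AWKWARD_BIN_COUNTS
    ]


groups = [
    FeatureGroup(num_bins=17, has_categorical=False),  # flagged
    FeatureGroup(num_bins=65, has_categorical=True),   # skipped: categorical
    FeatureGroup(num_bins=64, has_categorical=False),  # fine
    FeatureGroup(num_bins=65, has_categorical=False),  # flagged
]
flagged = groups_with_awkward_bins(groups)
print(flagged)
```

Only the purely numerical groups at indices 0 and 3 are flagged; the categorical group with 65 bins is left alone.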

@shiyu1994
Collaborator Author

shiyu1994 commented Jul 6, 2021

After thinking it through, I found it would be better to produce only one warning for all categorical features that exceed the configured maximum bin number, because categorical features very often contain more categories than max_bin. Producing one warning per categorical feature would clutter the log output.
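
The idea of one aggregated warning can be sketched in Python as follows. This is an illustration under assumptions, not the code in this PR: the function name `warn_categorical_bins`, the logger name, and the message wording are all hypothetical.

```python
import logging

logger = logging.getLogger("max_bin_sketch")  # hypothetical logger name


def warn_categorical_bins(num_bins_by_feature, max_bin):
    """Emit at most one warning covering every categorical feature over max_bin.

    num_bins_by_feature maps a categorical feature name to its bin count.
    Returns the warning text, or None when no feature exceeds max_bin.
    """
    over_limit = sorted(
        name for name, num_bins in num_bins_by_feature.items() if num_bins > max_bin
    )
    if not over_limit:
        return None
    message = (
        f"{len(over_limit)} categorical feature(s) have more bins than "
        f"max_bin={max_bin}: {', '.join(over_limit)}"
    )
    logger.warning(message)  # one log line, however many features are affected
    return message


message = warn_categorical_bins({"city": 300, "color": 5, "zip": 1000}, max_bin=255)
```

With three categorical features, two of which exceed the limit, the log still receives a single line that names both of them.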

@jameslamb
Collaborator

I support this PR! But I don't think I'm qualified to review this code.

Could you update it to the latest master? And maybe one of the new maintainers for C++ code (@tongwu-msft or @hzy46) could help provide a review.

@StrikerRUS
Collaborator

@guolinke Could you please help to review this PR?

@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023

Successfully merging this pull request may close these issues.

Ignoring set max_bin for categorical features should log a warning to avoid confusing log messages