
[ML] Handle unseen categories in encoding #602


Merged
merged 4 commits into from
Aug 20, 2019

Conversation

tveasey
Contributor

@tveasey tveasey commented Aug 20, 2019

Previously, lookups of a category's frequency or target mean encoding were not bounds checked, so a category which was not seen in training caused undefined behaviour. This arises, for example, when predicting for rows which have no value for the dependent variable and are therefore excluded from training.
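To illustrate the kind of guard described above, here is a minimal sketch of a bounds-checked lookup. It is not the code from this change: the `CategoryEncoding` struct, its member names, and the fallback values returned for unseen categories are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical encoding table for a single categorical feature. The names
// and the choice of fallback values are illustrative assumptions only.
struct CategoryEncoding {
    // Both vectors are indexed by the category ids observed in training.
    std::vector<double> frequencies;
    std::vector<double> targetMeans;
    // Assumed fallbacks for categories which were not observed in training.
    double unseenFrequency = 0.0;
    double unseenTargetMean = 0.0;

    // Check the index before reading so an unseen category never reads
    // past the end of the vector.
    double frequency(std::size_t category) const {
        return category < frequencies.size() ? frequencies[category]
                                             : unseenFrequency;
    }

    double targetMean(std::size_t category) const {
        return category < targetMeans.size() ? targetMeans[category]
                                             : unseenTargetMean;
    }
};
```

With this shape, a row whose category id lies beyond the trained range gets a well-defined fallback rather than an out-of-range read. Which fallback is appropriate (for instance zero frequency, or the overall target mean) is a modelling decision that this sketch leaves open.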

@tveasey tveasey merged commit 69702e2 into elastic:master Aug 20, 2019
@tveasey tveasey deleted the handle-unseen-categories branch August 20, 2019 17:00
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Aug 20, 2019
tveasey added a commit that referenced this pull request Aug 21, 2019
2 participants