
Handling of missings (in train + predict) #238

Closed
pat-s opened this issue Jun 6, 2019 · 1 comment

pat-s (Member) commented Jun 6, 2019

This has been discussed several times in mlr.

As far as I can see, the current behavior in mlr is the one introduced by this PR: mlr-org/mlr#2099

mllg (Member) commented Jun 27, 2019

Missing values are now handled according to the following policy:

  • If a learner is capable of handling missing values during train(), it should get the missings property.
  • Learners which cannot handle missing values in the test set should predict NA for those observations.
  • Predicting NA raises an exception unless a fallback learner is defined; all rows with NA predictions are filled in with the predictions of the fallback learner.

I know that this is not perfect and that there may be rare occasions where you need more flexibility. However, I believe this is a statistically sound approach (unlike na.rm = TRUE during performance assessment). Additionally, you can always impute missing values with a PipeOp.
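To make the two options concrete, here is a hedged sketch in R of (1) a fallback learner and (2) up-front imputation via a PipeOp. It assumes the mlr3 and mlr3pipelines packages; exact constructor names (e.g. `as_learner()`, the `$fallback` field) have evolved since this issue was closed, so treat the calls as illustrative rather than as the definitive API.

```r
library(mlr3)
library(mlr3pipelines)

# A built-in task with missing values (assumption: available in mlr3's task dictionary).
task = tsk("pima")

# Option 1: define a fallback learner. Rows for which the main learner
# predicts NA are filled in by the featureless fallback instead of
# raising an exception during resampling.
learner = lrn("classif.rpart")
learner$fallback = lrn("classif.featureless")

# Option 2: impute missing values before training with a PipeOp,
# then wrap the pipeline as a single learner.
graph_learner = as_learner(po("imputemean") %>>% lrn("classif.rpart"))

# Either learner can then be resampled as usual:
rr = resample(task, graph_learner, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.acc"))
```

Option 2 is usually preferable when the downstream learner lacks the missings property, since imputation is learned on the training split and applied consistently at predict time.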

@mllg mllg closed this as completed Jun 27, 2019