Fixing prediction methods to be more standardized #167

Merged 16 commits into zachmayer:master on Oct 12, 2015

Conversation

jknowles
Contributor

This PR addresses #165 and #158 by ensuring that predict.caretEnsemble and predict.caretStack accept similar inputs and produce similar output. It also substantially reworks the unit tests so they verify the correct structure of the output from these two functions.

predict.caretStack is modified particularly heavily here: it can now produce prediction errors based on the models' disagreement about each observation, mimicking the behavior of predict.caretEnsemble. The weights are defined by the variable importance of each model's predictions in the stacked model. When the predictions are transformed before ensembling, no standard error is calculated or returned, but the variable importances can still be attached to the output vector as a weights attribute.
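
Roughly, the disagreement calculation works like the following sketch (illustrative only, with made-up predictions and importances, not the package's exact code):

# Sketch: importance-weighted ensemble prediction plus a disagreement-based error.
# `preds` is an observations-by-models matrix of base-model predictions;
# `wghts` stands in for the variable importance of each model in the stack.
set.seed(42)
preds <- cbind(glm = rnorm(5, 0.5, 0.1),
               rf  = rnorm(5, 0.5, 0.1),
               gbm = rnorm(5, 0.5, 0.1))
wghts <- c(glm = 0.5, rf = 0.3, gbm = 0.2)
wghts <- wghts / sum(wghts)  # normalize importances to sum to 1

fit <- as.numeric(preds %*% wghts)  # importance-weighted prediction
# Weighted standard deviation across models measures their disagreement:
se <- apply(preds, 1, function(p) sqrt(sum(wghts * (p - sum(wghts * p))^2)))
data.frame(fit = fit, se = se)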

@jknowles
Contributor Author

@zachmayer This is a bit of a large PR, so just ping me with a comment when you have a chance to review it. I can hopefully answer any questions you have, and I'm willing to make whatever changes you think make sense.

@jknowles
Contributor Author

I'd also recommend changing the check settings to only pick up coverage from the "NOT_CRAN" run of caretEnsemble, because our testing infrastructure is too big to get full coverage within the CRAN test limits.
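
For reference, testthat's standard mechanism for this is skip_on_cran(), which keys on the NOT_CRAN environment variable, e.g.:

# In a test file: skip the expensive tests under CRAN's time limits.
library(testthat)
test_that("full prediction structure checks", {
  skip_on_cran()  # skipped unless Sys.getenv("NOT_CRAN") == "true"
  # ... the long-running structural tests go here ...
  expect_true(TRUE)
})
# Locally or on CI, run the full suite by setting:
# Sys.setenv(NOT_CRAN = "true")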

@jknowles
Contributor Author

jknowles commented Oct 9, 2015

@zachmayer Just a ping to see about moving this forward. I think it'll solve some of the recently reported issues, like #172.

@zachmayer
Owner

This file has a couple of untested lines: https://coveralls.io/builds/3539104/source?filename=R%2FcaretEnsemble.R

Could you add one or two more tests to hit the red lines? Meanwhile, I've gotta go figure out why our Travis builds are failing.

args <- lapply(args, eval, parent.frame())  # convert from symbols to objects
if (exists("newdata", args)) {
  # tmp <- args$newdata
  if (anyNA(args$newdata)) {
@jknowles
Contributor Author

I'm pretty sure this line means we will never reach the "Missing data found..." keepNA branch below. I can't be sure, though: the data could be complete, and yet some functions could still produce an NA prediction. That seems pretty difficult to identify a test case for.
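
A hypothetical test along these lines (with `ens` and `dat` standing in for a fitted ensemble and its data from the test setup) would at least exercise the anyNA() branch, though the right expectation depends on what the function actually does with the NA:

library(testthat)
test_that("predict handles NA values in newdata", {
  bad <- dat
  bad[1, 1] <- NA  # inject a missing value to trigger anyNA()
  expect_warning(predict(ens, newdata = bad))
})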

@zachmayer
Owner

Fixed the build issue, wooooooo

@zachmayer
Owner

Also, instead of merging master, you can rebase off master and then force push. It's a little harder to do, but it makes for a slightly nicer commit history.

@jknowles
Contributor Author

jknowles commented Oct 9, 2015

OK, I think I did the rebase, but I'm not sure. At any rate, we're even with master again. I think what needs to happen is that I properly deprecate keepNA, because I don't think it ever comes up anymore. Does that seem OK? I was going to use Hadley's trick here:

http://r-pkgs.had.co.nz/release.html
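
The pattern from that chapter is roughly this sketch (the actual signature may differ):

predict.caretEnsemble <- function(object, newdata = NULL, keepNA = NULL, ...) {
  if (!is.null(keepNA)) {
    # Warn old callers without breaking them; the argument is now ignored.
    warning("Argument 'keepNA' is deprecated and will be ignored.",
            call. = FALSE)
  }
  # ... existing prediction logic ...
}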

@zachmayer
Owner

👍 from me. I'm re-testing right now and will merge when all the tests pass.

@jknowles
Contributor Author

jknowles commented Oct 9, 2015

OK, one last push to test my deprecation of keepNA.
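
The test is essentially a sketch like this (hypothetical names; `ens` and `dat` come from the test setup):

library(testthat)
test_that("keepNA is deprecated", {
  # Passing the old argument should raise the deprecation warning.
  expect_warning(predict(ens, newdata = dat, keepNA = TRUE), "deprecated")
})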

@jknowles
Contributor Author

jknowles commented Oct 9, 2015

@zachmayer It builds, but the coverage check is a little wonky. I can't get those deprecated bits covered by tests, but I don't want to junk them because legacy code may still use them (I'm speaking selfishly here). If we merge this, I'm happy to sort out any merge conflicts it causes with the other PR.

@zachmayer
Owner

I edited the thresholds and re-ran.

@jknowles
Contributor Author

@zachmayer Looks like we're good to go. Thanks! After this gets merged, I'll rebase and start tackling the two prediction-related issues that cropped up, #172 and #171. This closes #167.

@zachmayer
Owner

Awesome! I'm away for the weekend but will merge ASAP so we can move on!

zachmayer added a commit that referenced this pull request Oct 12, 2015
Fixing prediction methods to be more standardized
zachmayer merged commit 308fa6e into zachmayer:master on Oct 12, 2015