First of all, thanks for the great effort, it looks great. The combination of `sparseMatrix` with Rcpp (instead of R's memory-expensive `model.matrix`) looks very promising!
However, as the paper mentions many times, real-world data is very sparse and contains very few successes, so it is highly unbalanced. A plain logistic regression implementation can't handle this (it reaches very high accuracy but finds no true positives), so it is crucial to re-balance the data using some type of weights.
Section 4.6 of the paper introduces a pretty straightforward subsampling correction.
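
To make the request concrete, here is a minimal sketch of what I have in mind. It assumes the correction works by keeping every positive example, keeping each negative with probability `r`, and scaling the gradient of each kept negative by `1/r` so the expected loss stays the same as on the full data. It is written in plain Python rather than the package's R/Rcpp code, and the names `subsample` and `weighted_sgd_step` are hypothetical, not part of any existing API.

```python
import math
import random


def subsample(rows, r, rng):
    """Keep every positive; keep each negative with probability r.

    rows: iterable of (features, label) with label in {0, 1}.
    Returns a list of (features, label, weight) triples, where kept
    negatives carry weight 1/r (assumed correction scheme).
    """
    kept = []
    for x, y in rows:
        if y == 1:
            kept.append((x, y, 1.0))
        elif rng.random() < r:
            kept.append((x, y, 1.0 / r))
    return kept


def weighted_sgd_step(w, x, y, weight, lr):
    """One logistic-regression SGD step with the gradient scaled by weight.

    w: dict feature index -> coefficient (updated in place).
    x: dict feature index -> value (sparse row).
    Returns the predicted probability before the update.
    """
    z = sum(w.get(i, 0.0) * v for i, v in x.items())
    p = 1.0 / (1.0 + math.exp(-z))
    g = weight * (p - y)  # importance-weighted gradient of the log loss
    for i, v in x.items():
        w[i] = w.get(i, 0.0) - lr * g * v
    return p


# Toy data: 1% positives, a single always-on feature (made up for illustration).
rng = random.Random(0)
rows = [({0: 1.0}, 1 if i % 100 == 0 else 0) for i in range(10000)]
sample = subsample(rows, r=0.1, rng=rng)

w = {}
for x, y, weight in sample:
    weighted_sgd_step(w, x, y, weight, lr=0.01)

print(len(sample), sum(wt for _, _, wt in sample))
```

The weighted count of the sample should come out close to the original number of rows, which is the point of the correction: training is cheaper, but the model still sees the original class balance in expectation.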