
Add weights parameter as mentioned in section 4.6 "Subsampling Training Data" #3

Open
@DavidArenburg

First of all, thanks for the great effort; it looks great. The combination of sparseMatrix with Rcpp (instead of R's memory-expensive model.matrix) looks very promising!

However, as the paper mentions many times, real-world data is very sparse and contains very few successes, so it is highly unbalanced. A standard logistic regression implementation can't handle this (it reports very high accuracy but finds no true positives), so it is crucial to re-balance the data using some type of weights.

In section 4.6 of the paper, they introduce a fairly straightforward subsampling correction.
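To make the request concrete, here is a minimal sketch of what such a correction could look like. It assumes the scheme is the usual importance-weighting one: keep every positive example, keep each negative with probability `r`, and scale the loss (and therefore the gradient) of each kept negative by `1/r`. It works per example rather than per query for simplicity, and it is written in plain Python only for illustration. The function names `subsample_with_weights` and `weighted_logloss_gradient` are hypothetical and not part of this package.

```python
import math
import random


def subsample_with_weights(labels, r, seed=0):
    """Keep all positives; keep each negative with probability r.

    Returns (kept_indices, weights). Kept positives get weight 1.0,
    kept negatives get weight 1/r so the expected loss is unchanged.
    Hypothetical helper, not part of the package.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("r must be in (0, 1]")
    rng = random.Random(seed)
    kept, weights = [], []
    for i, y in enumerate(labels):
        if y == 1:
            kept.append(i)
            weights.append(1.0)
        elif rng.random() < r:
            kept.append(i)
            weights.append(1.0 / r)
    return kept, weights


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logloss_gradient(x, y, w, beta):
    """Gradient of w * logloss for one example.

    x and beta are equal-length lists of floats, y is 0 or 1,
    and w is the importance weight from subsample_with_weights.
    """
    p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
    return [w * (p - y) * xi for xi in x]
```

With something like this, a `weights` argument would only need to multiply each example's gradient inside the existing update loop, so the rest of the fitting code could stay as it is.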
