Description
There are a few properties of Abstrackr that I could not find in the literature or the documentation:
- This snippet of (outdated?) source code suggests that TF-IDF is used for feature extraction, but I could not find any document that states this explicitly. Is TF-IDF what is actually used?
- No balancing strategy is mentioned. Is one used to handle the imbalance between Relevant and Irrelevant records?
- What is the minimum training data size (please give a number for both Relevant and Irrelevant records) the model needs before it starts training?
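
To make clear what I mean by "TF-IDF" and "balancing strategy", here is a minimal, generic sketch in plain Python. It is not taken from Abstrackr; the function names `tfidf` and `balanced_class_weights` are hypothetical, and the weighting formulas (term frequency times `log(N / df)`, and `n_samples / (n_classes * class_count)`) are just common textbook choices that Abstrackr may or may not use.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Illustrative only: whitespace tokenization, no smoothing or normalization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), one possible balancing strategy.
    """
    counts = Counter(labels)
    return {
        label: len(labels) / (len(counts) * count)
        for label, count in counts.items()
    }


# Hypothetical example data: three abstracts, one Relevant (1) and three Irrelevant (0) labels.
vectors = tfidf(["randomized trial of aspirin", "aspirin review", "trial protocol"])
weights = balanced_class_weights([1, 0, 0, 0])
```

With weights like these, the rare Relevant class would count more heavily during training than the common Irrelevant class; undersampling the Irrelevant records would be another option, which is why I am asking which, if any, is used.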