The project addresses the binary classification of a song’s mode. Different approaches are proposed for both the preprocessing step and model selection. In particular, the preprocessing pipeline uses a multi-label binarizer and a tf-idf vectorizer, along with different standardization and normalization functions. The best-performing model, based on gradient boosting, is then discussed; it achieves satisfactory results.
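
The pipeline described above could look roughly like the following minimal sketch. It is not the code in project.py: it assumes scikit-learn, and the toy data, the feature names (`genres`, `titles`, `tempo`) and the 0/1 label encoding are all made up for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer, StandardScaler

# Hypothetical toy data; the real dataset's columns are not specified here.
genres = [["rock", "indie"], ["pop"], ["rock"], ["pop", "dance"], ["indie"], ["dance"]]
titles = ["sad rainy night", "happy summer dance", "dark road",
          "bright party lights", "lonely winter", "sunny beach party"]
tempo = np.array([[90.0], [128.0], [100.0], [124.0], [85.0], [130.0]])
mode = np.array([0, 1, 0, 1, 0, 1])  # assumed encoding of the two mode classes

# Multi-label binarizer: one 0/1 column per distinct genre tag.
genre_features = MultiLabelBinarizer().fit_transform(genres)

# Tf-idf vectorizer: weighted bag-of-words for the free-text field.
title_features = TfidfVectorizer().fit_transform(titles).toarray()

# Standardization: zero mean and unit variance for the numeric column.
tempo_features = StandardScaler().fit_transform(tempo)

# Concatenate all feature groups into a single dense matrix.
X = np.hstack([genre_features, title_features, tempo_features])

# Gradient boosting classifier fitted on the combined features.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X, mode)
predictions = model.predict(X)
```

This sketch fits and predicts on the same six rows only to keep it short. In an actual experiment the transformers and the model would be fitted on a training split and evaluated on held-out data.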
python project.py
Report_DSL.pdf