In this repository, we present the solution of team Definitive Turtles for the ACM WSDM Cup 2019 Spotify Sequential Skip Prediction Challenge, which reached 10th place on the final leaderboard.
You can read our solution report, "Sequential skip prediction using deep learning and ensembles", on the website of the WSDM19 conference.
- Python 3.5 conda environment
- Python packages: turicreate, pandas, numpy, pytorch, scikit-learn
- In our solution we heavily depend on the parallel processing power of the turicreate package.
Our solution is based on gradient boosted trees (GBT) combining several statistical features as well as the output of a recurrent neural network classifier. By using GBT, we were able to assess the importance of the various features and classifiers for skip prediction.
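To illustrate the stacking idea described above, here is a minimal sketch, not the code used in our submission. It assumes scikit-learn's `GradientBoostingClassifier` as a stand-in for the GBT model, and it uses purely synthetic data: three noise columns play the role of statistical features, and a hypothetical `rnn_scores` column mimics the output of a neural network classifier. The helper name `fit_stacked_gbt` is also hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def fit_stacked_gbt(stat_features, rnn_scores, labels, seed=0):
    """Fit a GBT on statistical features plus one column of RNN scores."""
    # Stack the RNN output as an extra (last) feature column.
    X = np.column_stack([stat_features, rnn_scores])
    model = GradientBoostingClassifier(
        n_estimators=50, max_depth=3, random_state=seed
    )
    model.fit(X, labels)
    return model


# Synthetic data (assumption): labels are random skip / no-skip flags.
rng = np.random.RandomState(0)
n_samples = 500
labels = rng.randint(0, 2, size=n_samples)

# Three uninformative "statistical" features (pure noise).
stat_features = rng.normal(size=(n_samples, 3))

# A simulated RNN score that correlates strongly with the label.
rnn_scores = np.clip(
    0.6 * labels + rng.normal(0.2, 0.2, size=n_samples), 0.0, 1.0
)

model = fit_stacked_gbt(stat_features, rnn_scores, labels)

# Impurity-based importances, one value per input column.
feature_names = ["stat_0", "stat_1", "stat_2", "rnn_score"]
importances = model.feature_importances_
for name, value in zip(feature_names, importances):
    print("{}: {:.3f}".format(name, value))
```

In this toy setup the simulated RNN score is the only informative column, so it receives the largest importance. On real data the ranking depends on the actual features and classifiers fed into the GBT.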
Please follow the steps in order if you want to reproduce our final solution:
- Preprocessing: please follow the instructions here
- Deep learning models: please follow the instructions here
- Gradient Boosted Trees: please follow the instructions here
If you have any problems with the code, please contact the team members, Ferenc Béres or Domokos Kelen.