Project for CS267 Topics in Database Systems
- Design a movie recommendation system that uses the past movie ratings given by various users to suggest movies to a given user.
- Implement this system using collaborative filtering algorithms and the Apache Mahout framework.
- The dataset is obtained from the Yahoo Research Webscope database. The database provides several files, two of which are used for this project: the Yahoo! Movies User Ratings and Yahoo! Descriptive Content Information, v.1.0.
- The Yahoo! Movies User Ratings file contains 211231 records, each consisting of a User ID, Movie ID and Rating.
- The Yahoo! Movies Descriptive Content Information file contains 54058 records, each consisting of a Movie ID, Title, Genre, Directors, Actors and so on.
- Java
- Python
- Apache Mahout Framework
- HDFS
- Python Pandas, Matplotlib libraries
- OpenRefine
- Cloudera Quickstart VM
Note: You can view .ipynb files using nbviewer - https://nbviewer.jupyter.org/
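
The project itself relies on Apache Mahout for recommendations. The snippet below is not that implementation. It is a minimal plain-Python sketch of user-based collaborative filtering over (User ID, Movie ID, Rating) records like those in the ratings file described above. The sample records, the function names and the `k` and `n` parameters are all hypothetical, and cosine similarity is only one of several similarity measures that could be used here.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical sample records in (user_id, movie_id, rating) form.
SAMPLE_RECORDS = [
    ("u1", "m1", 5), ("u1", "m2", 4),
    ("u2", "m1", 5), ("u2", "m2", 4), ("u2", "m3", 5),
    ("u3", "m1", 1), ("u3", "m4", 3),
]


def build_user_ratings(records):
    """Group rating triples into {user_id: {movie_id: rating}}."""
    ratings = defaultdict(dict)
    for user_id, movie_id, rating in records:
        ratings[user_id][movie_id] = float(rating)
    return dict(ratings)


def cosine_similarity(a, b):
    """Cosine similarity of two users, treating unrated movies as 0."""
    dot = sum(a[m] * b[m] for m in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=2, n=3):
    """Return up to n (movie_id, predicted_rating) pairs for `user`.

    Predictions are similarity-weighted averages of the ratings given
    by the k most similar users to movies `user` has not rated yet.
    """
    target = ratings[user]
    sims = [
        (cosine_similarity(target, other), uid)
        for uid, other in ratings.items()
        if uid != user
    ]
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]

    weighted = defaultdict(float)
    weights = defaultdict(float)
    for sim, uid in neighbours:
        for movie, rating in ratings[uid].items():
            if movie not in target:
                weighted[movie] += sim * rating
                weights[movie] += sim

    scores = {m: weighted[m] / weights[m] for m in weighted}
    # Highest predicted rating first; ties broken by movie id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    user_ratings = build_user_ratings(SAMPLE_RECORDS)
    print(recommend(user_ratings, "u1"))
```

In this sketch the movie IDs returned by `recommend` could then be looked up in the descriptive content file to show titles and genres. With the full dataset, a library implementation such as Mahout's is the more practical choice, since this version compares the target user against every other user on each call.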