Recommendation System with PySpark

Author: Salma OUARDI

This project trains an ALS (Alternating Least Squares) recommendation model with Spark's MLlib library on the MovieLens 100k dataset, which is stored on the Hadoop Distributed File System (HDFS). The goal is to generate movie recommendations from the user ratings in that dataset.
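As a rough illustration of the training step, the sketch below parses the tab-separated `u.data` ratings file (user id, item id, rating, timestamp) and fits an ALS model with the RDD-based `pyspark.mllib` API. The HDFS path, the function names `parse_rating` and `train_als`, and the hyperparameter values are assumptions made for this example; the notebook may use different ones.

```python
def parse_rating(line):
    """Parse one line of MovieLens 100k `u.data` into (user, movie, rating).

    Each line holds four tab-separated fields: user id, item id, rating
    and timestamp. The timestamp is not needed for training and is dropped.
    """
    user_id, movie_id, rating, _timestamp = line.strip().split("\t")
    return int(user_id), int(movie_id), float(rating)


def train_als(sc, path="hdfs:///path/to/ml-100k/u.data",
              rank=50, iterations=10, reg=0.01):
    """Train an ALS matrix factorization model on the ratings file.

    `sc` is an existing SparkContext. The default path and the
    hyperparameters are placeholders for illustration only.
    """
    # Imported here so parse_rating stays usable without a Spark install.
    from pyspark.mllib.recommendation import ALS, Rating

    ratings = (
        sc.textFile(path)
        .map(parse_rating)
        .map(lambda t: Rating(*t))  # Rating(user, product, rating)
    )
    ratings.cache()  # ALS is iterative, so the input is reused many times
    return ALS.train(ratings, rank, iterations=iterations, lambda_=reg)
```

With a running SparkContext `sc`, `model = train_als(sc)` would return a `MatrixFactorizationModel` holding the learned user and item factors.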

This project is inspired by the book Machine Learning with Spark.

Tasks / Achievements

Built a recommendation model using data about user preferences
Used the trained model to compute recommendations for a given user, as well as to compute similar (that is, related) items for a given item
Applied standard evaluation metrics to the trained model to measure its predictive capability
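The item-similarity and evaluation steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: it assumes item factor vectors are available as a dictionary (for example, collected from `productFeatures()` on an MLlib `MatrixFactorizationModel`), uses cosine similarity to rank related items, and uses RMSE and average precision at K as example metrics. All function names here are hypothetical. For per-user recommendations, the RDD-based MLlib model also offers `recommendProducts(user, num)`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length factor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_items(item_id, item_factors, k=10):
    """Return the k items whose factors are closest to item_id's factors.

    item_factors maps item id -> list of floats.
    """
    query = item_factors[item_id]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in item_factors.items()
        if other != item_id  # skip the query item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rmse(actual_and_predicted):
    """Root mean squared error over (actual, predicted) rating pairs."""
    pairs = list(actual_and_predicted)
    mse = sum((a - p) ** 2 for a, p in pairs) / len(pairs)
    return math.sqrt(mse)


def average_precision_at_k(actual, predicted, k=10):
    """Average precision at K for one user.

    actual: set of item ids the user really rated.
    predicted: ranked list of recommended item ids.
    """
    if not actual:
        return 1.0  # convention: nothing to find counts as a perfect score
    hits = 0
    score = 0.0
    for rank, item in enumerate(predicted[:k], start=1):
        if item in actual:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(len(actual), k)
```

Averaging `average_precision_at_k` over all users gives the mean average precision at K (MAP@K), a ranking-oriented complement to the rating-oriented RMSE.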

The notebook Recommendation_System_with_Pyspark.ipynb describes each step of this project in full.
