LaTeX project made on Overleaf, with its PDF and images available in this repository:
Overleaf: https://overleaf.com
PDF: In the root folder, or on ArXiv:
Images: In the "Images" folder
The project uses a Python virtual environment (venv), and the full dependency list is in requirements.txt. Assuming the Python venv module is already installed on your device, you can either run pip3 install -r requirements.txt inside the activated venv, or follow the steps below:
Create the venv called "firstEnv":
python3 -m venv firstEnv
Activate it:
source firstEnv/bin/activate
Install Jupyter Notebook, which we'll need to make the work presentable:
pip3 install jupyter
TensorFlow 1.x, for all the machine learning:
pip3 install tensorflow
Keras, to make the AI models even easier to implement:
pip3 install keras
Matplotlib, for our data visualization:
pip3 install matplotlib
tqdm, for progress bars in our Jupyter notebook or terminal:
pip3 install tqdm
Scikit-learn, for its extremely efficient implementations of a few famous algorithms:
pip3 install scikit-learn
Pandas, for all our dataframes and CSV manipulation:
python3 -m pip install --upgrade pandas
Surprise, for ready-made recommender algorithms to compare against:
pip3 install surprise
NumPy and SciPy, for dealing with large arrays:
pip3 install numpy
pip3 install scipy
Versions used in this project:
Python: 3.7.3
TensorFlow: 1.14.0
Keras: 2.2.4
MatplotLib: 3.1.1
TQDM: 4.36.1
Scikit-Learn: 0.21.3
Surprise: 0.1
Scipy: 1.3.0
Pandas: 0.25.1
Jupyter Notebook: 1.0.0
Markdown: 3.1.1
Numpy: 1.16.4
Pip: 19.3
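To confirm your own environment matches, a minimal snippet like the one below should work inside the activated venv (it skips Jupyter, Markdown, Surprise, and pip, which don't expose a version attribute as uniformly):

```python
# Minimal version check for most of the libraries listed above.
import sys
import keras
import matplotlib
import numpy
import pandas
import scipy
import sklearn
import tensorflow
import tqdm

print("Python:", sys.version.split()[0])
for module in (tensorflow, keras, matplotlib, tqdm, sklearn, scipy, pandas, numpy):
    print(f"{module.__name__}: {module.__version__}")
```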
- ML_Dataset Folder: Contains the MovieLens smallest and 27M datasets. It's in .gitignore because every file beyond the original datasets is generated by running the scripts in this project.
- first_env Folder: Also in .gitignore; it's the Python virtual environment where we install everything we need. I called it "first" because, in the future, there may be other environments to try different combinations of libraries, like TensorFlow 2.0 or Seaborn instead of Matplotlib, but that hasn't happened yet.
- images Folder: Has the images used in the theoretical and experimental sections of the article.
- main Folder: Literally the main folder: all the Python scripts are here, divided into the following subfolders:
-- dataPrep folder: The data preparation scripts. This covers dataset loading, the long-tail crop, dimensionality reduction, and clustering (minimal sketches of these steps follow the listing below).
- longTail_crop.py: Crops the movies that have too few ratings, and the users that rated too few movies, to make our data less noisy.
- profiles.py: Creates movie_profile.csv and user_profile.csv, the characteristic vectors that define movies and users in this recommender system.
- pca.py: Tries to reduce the dimensionality of these profiles; if it succeeds in a significant manner, it rewrites the old dataset.
- HDBSCAN_applied.py: Responsible for running the clustering algorithm over the movies dataset and creating the file movies_cluster.csv. That file ended up not being created, because the majority of users, and a good chunk of movies, could not be clustered.
- data_split.py: Splits the original CSVs into more strategic ones: a training set and a test set, called training_movies.csv and test_movies.csv.
- correct_dataset.py: Removes excess columns from the datasets.
-- recommenders folder: The Surprise library application (see the Surprise sketch after the listing below). Each file is very similar: some contain a grid search, like KNN Basic and SVD, while all the others are executions with 5-fold cross-validation. Each file is named after its algorithm.
-- results folder: Includes the results from almost all algorithm executions: the 20K predictions of several algorithms, the grid search results in KNNBasic_results and SVD_results, the HDBSCAN fit results for users and movies, and the PCA results.
- longTailCrop
- profiles
- correct_dataset
- PCA
- HDBSCAN_Applied
- data_split
- KNNBasic
- SVD
- KNNMeans
- KNNBaseline
- KNNZScore
- Baseline
- CoClustering
- SlopeOne
- NormalPred
- NMF
- SVDpp
- ploting
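As referenced in the dataPrep description above, here is a minimal sketch of the long-tail crop idea in longTail_crop.py. The thresholds, the ratings.csv path, and the column names (the standard MovieLens layout) are illustrative assumptions, not the project's actual values:

```python
import pandas as pd

# Illustrative thresholds, not the project's actual values.
MIN_RATINGS_PER_MOVIE = 20
MIN_RATINGS_PER_USER = 20

ratings = pd.read_csv("ML_Dataset/ratings.csv")  # hypothetical path

# Drop movies with too few ratings (the long tail)...
movie_counts = ratings["movieId"].value_counts()
keep_movies = movie_counts[movie_counts >= MIN_RATINGS_PER_MOVIE].index
ratings = ratings[ratings["movieId"].isin(keep_movies)]

# ...and users who rated too few movies.
user_counts = ratings["userId"].value_counts()
keep_users = user_counts[user_counts >= MIN_RATINGS_PER_USER].index
ratings = ratings[ratings["userId"].isin(keep_users)]

ratings.to_csv("ML_Dataset/ratings_cropped.csv", index=False)
```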
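A sketch of what profiles.py could look like. The construction here is an assumption for illustration (the article defines the real characteristic vectors): a movie profile as a one-hot encoding of its MovieLens genres, and a user profile as the mean of the profiles of the movies that user rated:

```python
import pandas as pd

movies = pd.read_csv("ML_Dataset/movies.csv")    # movieId, title, genres
ratings = pd.read_csv("ML_Dataset/ratings.csv")  # userId, movieId, rating

# One-hot encode the pipe-separated genre list as the movie profile.
genres = movies["genres"].str.get_dummies(sep="|")
movie_profile = pd.concat([movies[["movieId"]], genres], axis=1)

# User profile: mean of the profiles of the movies the user rated.
merged = ratings.merge(movie_profile, on="movieId")
user_profile = merged.groupby("userId")[list(genres.columns)].mean()

movie_profile.to_csv("movie_profile.csv", index=False)
user_profile.to_csv("user_profile.csv")
```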
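A sketch of the pca.py step with scikit-learn, using an illustrative 95% explained-variance threshold to decide how many components to keep and whether the reduction is worth rewriting the file:

```python
import pandas as pd
from sklearn.decomposition import PCA

profiles = pd.read_csv("movie_profile.csv", index_col="movieId")

# Keep enough components to explain ~95% of the variance
# (illustrative threshold, not the project's actual criterion).
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(profiles)

# Only overwrite the old profiles if the reduction is significant.
if reduced.shape[1] < profiles.shape[1]:
    pd.DataFrame(reduced, index=profiles.index).to_csv("movie_profile.csv")
```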
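A sketch of HDBSCAN_applied.py. Note that the hdbscan package is not in the install list above (pip3 install hdbscan), and min_cluster_size is an illustrative parameter:

```python
import pandas as pd
import hdbscan

profiles = pd.read_csv("movie_profile.csv", index_col="movieId")

clusterer = hdbscan.HDBSCAN(min_cluster_size=15)  # illustrative parameter
labels = clusterer.fit_predict(profiles)

# Label -1 marks points HDBSCAN could not cluster (noise); as noted above,
# most users and a good chunk of movies ended up there.
clusters = pd.DataFrame({"movieId": profiles.index, "cluster": labels})
clusters.to_csv("movies_cluster.csv", index=False)
```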
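A sketch of data_split.py using scikit-learn's train_test_split; the 80/20 proportion and the seed are illustrative:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

ratings = pd.read_csv("ML_Dataset/ratings.csv")  # hypothetical path

# 80/20 split (illustrative proportion); fixed seed for reproducibility.
train, test = train_test_split(ratings, test_size=0.2, random_state=42)
train.to_csv("training_movies.csv", index=False)
test.to_csv("test_movies.csv", index=False)
```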
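Finally, a sketch of what the recommenders scripts look like with Surprise, shown here with SVD: a 5-fold cross-validation plus, as in the KNN Basic and SVD scripts, a grid search. The parameter grid is illustrative:

```python
import pandas as pd
from surprise import SVD, Dataset, Reader
from surprise.model_selection import GridSearchCV, cross_validate

ratings = pd.read_csv("training_movies.csv")
reader = Reader(rating_scale=(0.5, 5.0))  # MovieLens rating scale
data = Dataset.load_from_df(ratings[["userId", "movieId", "rating"]], reader)

# 5-fold cross-validation, as in most of the recommender scripts.
cross_validate(SVD(), data, measures=["RMSE", "MAE"], cv=5, verbose=True)

# Grid search, as in the KNN Basic and SVD scripts (illustrative grid).
param_grid = {"n_factors": [50, 100], "n_epochs": [20, 30]}
gs = GridSearchCV(SVD, param_grid, measures=["rmse"], cv=5)
gs.fit(data)
print(gs.best_params["rmse"], gs.best_score["rmse"])
```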