
A Big Data and Applications course project within the curriculum of the Data Science specialization at UEH University.


nnttvy/BigData-Project


This project collects large-scale data from YouTube and analyzes it with two goals: predicting video topics and building a content recommendation system.

Data was gathered through YouTube's APIs and included video titles, descriptions, tags, view counts, and like counts, among other fields. After preprocessing and exploratory data analysis (EDA) to gain initial insights, my team and I applied machine learning, deep learning, and natural language processing (NLP) techniques to predict video topics. Specifically, we used Count Vectorizer and Word2Vec to represent the text, and Naive Bayes, Linear Regression, and LSTM models to determine each video's main topic from its title and description.
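To illustrate the topic-prediction step, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with only the Python standard library. It is not the project's actual code: the class name `NaiveBayesTopicClassifier`, the `tokenize` helper, and the toy titles and topic labels are all hypothetical, chosen only to show how word counts from titles can drive a topic prediction.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.doc_counts = Counter(labels)        # topic -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            total = sum(self.word_counts[label].values())
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data standing in for real video titles and topics.
train_texts = [
    "Top 10 goals of the football season",
    "Basketball highlights and best dunks",
    "How to cook pasta carbonara at home",
    "Easy chocolate cake recipe for beginners",
]
train_labels = ["sports", "sports", "cooking", "cooking"]

clf = NaiveBayesTopicClassifier().fit(train_texts, train_labels)
predicted = clf.predict("Quick pasta recipe")
print(predicted)
```

In practice a library implementation would likely replace this hand-written version; the sketch only makes the counting and smoothing steps visible.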

Finally, we developed a recommendation system that uses TF-IDF and Logistic Regression, together with a collaborative filtering method, to suggest relevant videos based on user preferences and past behavior.
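As a rough illustration of how TF-IDF can feed a recommender, the sketch below ranks videos by the cosine similarity of their TF-IDF title vectors. This is a simplified content-based example, not the project's collaborative filtering pipeline, and it omits Logistic Regression entirely. The function names, the smoothed IDF formula, and the sample titles are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(watched_index, titles, top_k=2):
    """Return the top_k titles most similar to the watched one."""
    vectors = tfidf_vectors(titles)
    scores = [
        (cosine(vectors[watched_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != watched_index
    ]
    scores.sort(reverse=True)
    return [titles[i] for _, i in scores[:top_k]]


# Hypothetical titles standing in for the collected video metadata.
titles = [
    "Python tutorial for beginners",
    "Advanced Python tricks",
    "Chocolate cake recipe",
    "Python data analysis tutorial",
]
recommendations = recommend(0, titles, top_k=2)
print(recommendations)
```

A collaborative filtering approach would instead compare users' interaction histories rather than title text; the same cosine similarity could be applied to user or item interaction vectors.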
