Extract, Transform, Load Movie Data

Overview

The purpose of this project was to gather movie data from both Wikipedia and Kaggle and create a database that students can use to perform their own analysis. To do this, I extracted the necessary data from Wikipedia and Kaggle, used Python's pandas library to transform it into a working DataFrame, and then loaded it into PostgreSQL for students to use.
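The extract, transform, and load steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the project notebooks: the sample records, the column names (`imdb_id`, `Box office`, `runtime`), and the helper functions are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3

import pandas as pd

# Hypothetical records standing in for the scraped Wikipedia data
# and the Kaggle metadata file (not the project's real data).
wiki_raw = [
    {"title": "Movie A", "imdb_id": "tt0000001", "Box office": "$1.5 million"},
    {"title": "Movie B", "imdb_id": "tt0000002", "Box office": "$2 million"},
    {"title": "Movie A", "imdb_id": "tt0000001", "Box office": "$1.5 million"},  # duplicate row
]
kaggle_raw = [
    {"imdb_id": "tt0000001", "runtime": 95},
    {"imdb_id": "tt0000002", "runtime": 120},
]


def extract():
    """Extract: read both sources into DataFrames."""
    return pd.DataFrame(wiki_raw), pd.DataFrame(kaggle_raw)


def transform(wiki_df, kaggle_df):
    """Transform: drop duplicates, parse dollar strings, merge the sources."""
    wiki_df = wiki_df.drop_duplicates(subset="imdb_id").copy()
    # Turn strings like "$1.5 million" into a float number of dollars.
    millions = wiki_df["Box office"].str.extract(r"\$([\d.]+) million", expand=False)
    wiki_df["box_office"] = millions.astype(float) * 1_000_000
    wiki_df = wiki_df.drop(columns="Box office")
    return wiki_df.merge(kaggle_df, on="imdb_id", how="inner")


def load(movies_df, conn):
    """Load: write the cleaned DataFrame to a SQL table."""
    movies_df.to_sql("movies", conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
movies_df = transform(*extract())
load(movies_df, conn)
row_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
```

Against PostgreSQL, the same `to_sql` call would typically take a SQLAlchemy engine instead of the SQLite connection used here.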

Results

After completing the extract, transform, and load process, I had a database with two tables: one for movies and one for ratings. The movies table consists of 31 columns holding a range of information for 6048 different movies.

The ratings table consists of 5 columns holding information on over 26 million individual user ratings.

Together, these tables give students the chance to perform a variety of analyses and derive their own insights.
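As one example of the kind of analysis the two tables allow, the sketch below joins movies to ratings and computes an average rating per movie. The column names (`movie_id`, `user_id`, `rating`) and the sample rows are assumptions made for illustration, since the actual schema is not listed here, and SQLite is again used in place of PostgreSQL so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database

# Hypothetical, heavily simplified versions of the two tables.
conn.executescript(
    """
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating REAL);
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO ratings VALUES (10, 1, 4.0), (11, 1, 5.0), (10, 2, 3.0);
    """
)

# Count the ratings and average them for each movie, highest average first.
query = """
    SELECT m.title, COUNT(*) AS n_ratings, AVG(r.rating) AS avg_rating
    FROM movies AS m
    JOIN ratings AS r ON r.movie_id = m.movie_id
    GROUP BY m.movie_id, m.title
    ORDER BY avg_rating DESC
"""
results = conn.execute(query).fetchall()
for title, n_ratings, avg_rating in results:
    print(f"{title}: {n_ratings} ratings, average {avg_rating:.2f}")
```

The same query should work in PostgreSQL once the table and column names are adjusted to match the real schema.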

