NBA ETL Project

Overview

This NBA ETL (Extract, Transform, Load) project automates the collection, processing, and storage of NBA data for analysis. It demonstrates data-source integration, efficient processing, and a scalable pipeline architecture, making it a practical foundation for sports analytics and data engineering work.

Key Objectives:

  • Automate real-time NBA data collection.
  • Enable data-driven decision-making with clean datasets.
  • Demonstrate data engineering practices such as workflow orchestration, containerization, and data warehousing.

Features

  • Data Extraction: Collects NBA data, including team rosters, player profiles, draft history, and game box scores.
  • Data Transformation: Cleanses and standardizes raw data for consistency and downstream analysis.
  • Data Loading: Loads processed data into a database for easy access and analysis.
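The three stages above can be sketched as a minimal pipeline. This is an illustrative sketch rather than the project's actual code: the field names, the sample box-score records, and the use of in-memory SQLite as a stand-in for the real database are all assumptions.

```python
import sqlite3

def extract():
    # Stand-in for an API call; real extraction would hit an NBA stats endpoint.
    return [
        {"PLAYER_NAME": "  Jayson Tatum ", "PTS": "30", "TEAM": "BOS"},
        {"PLAYER_NAME": "Luka Doncic", "PTS": "32", "TEAM": "DAL"},
    ]

def transform(rows):
    # Cleanse and standardize: trim whitespace, cast points to integers,
    # and normalize keys to lowercase.
    return [
        {"player": r["PLAYER_NAME"].strip(), "pts": int(r["PTS"]), "team": r["TEAM"]}
        for r in rows
    ]

def load(rows, conn):
    # Load the cleaned rows into a relational table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS box_scores (player TEXT, pts INTEGER, team TEXT)"
    )
    conn.executemany("INSERT INTO box_scores VALUES (:player, :pts, :team)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
load(transform(extract()), conn)
```

In the real pipeline each stage would live in its own script so Airflow can schedule and retry them independently.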

Project Structure

  • DAGs: Contains Airflow DAGs for scheduling ETL processes.
  • Scripts: Python scripts for extracting and processing datasets.
  • Docker: Docker configurations for reproducible environments.
  • Database: Schema definitions and SQL scripts for setting up the data warehouse.
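The pieces fit together through Airflow: the DAGs run the extract, transform, and load scripts in dependency order. That ordering can be sketched with the standard library's `graphlib` (the task names here are illustrative, not the project's actual DAG tasks):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on,
# mirroring extract >> transform >> load in an Airflow DAG.
dag = {
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields tasks in a valid execution order.
order = list(TopologicalSorter(dag).static_order())
print(order)  # extract runs first, load last
```

Airflow resolves the same kind of dependency graph, but adds scheduling, retries, and the web UI on top.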

Installation and Setup

Prerequisites:

  • Python 3.x
  • Docker
  • PostgreSQL (or another relational database)

Steps:

  1. Clone the repository:

    git clone https://github.com/joshc3453/NBA_ETL_Project.git
    cd NBA_ETL_Project
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up Docker containers:

    docker-compose up
    
  4. Initialize the database:

    • Run SQL scripts to create necessary tables and schemas.
    • Alternatively, use Docker to set up the database automatically.
  5. Run the ETL pipeline:

    • Trigger the Airflow DAGs to start the ETL process.
    • Monitor workflows via the Airflow web interface.
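Step 4 can be sketched as follows. The table names and columns are illustrative assumptions (the project's SQL scripts define the real schema), and in-memory SQLite stands in for PostgreSQL so the sketch is self-contained:

```python
import sqlite3

# Illustrative schema only; the project's SQL scripts define the actual tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS teams (
    team_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    city      TEXT
);
CREATE TABLE IF NOT EXISTS players (
    player_id INTEGER PRIMARY KEY,
    team_id   INTEGER REFERENCES teams(team_id),
    name      TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.executescript(SCHEMA)

# Verify the tables were created.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

Against PostgreSQL the same idea applies: run the schema scripts once before the first DAG run, or let the Docker setup apply them automatically.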
