building an ETL pipeline that extracts their data from S3, processes it using Spark, and loads the data back into S3 as a set of dimensional tables. This will allow their analytics team to continue finding insights into what songs their users are listening to.
In this project, I build a data lake and an ETL pipeline for data hosted on S3: the pipeline loads raw data from S3, processes it into analytics tables using Spark, and writes those tables back to S3.
To run the pipeline:

1. Replace the AWS IAM credentials in `dl.cfg` with your own.
2. Run `python etl.py` in the terminal.
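For reference, `dl.cfg` typically holds the credentials in an INI-style layout like the following; the exact section and key names are an assumption here and must match whatever `etl.py` reads:

```ini
[AWS]
AWS_ACCESS_KEY_ID=<your access key id>
AWS_SECRET_ACCESS_KEY=<your secret access key>
```

Keep this file out of version control, since it contains secrets.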