The project aims to develop a real-time analysis system using Apache Kafka and Apache Spark on a cloud-based architecture. The system collects real-time data and streams it into Kafka. Apache Spark then processes and analyzes the data in real time. The processed data is presented in charts and graphs.
Airflow-orchestrated ETL pipeline (running in Docker containers) that pulls batch data from an API into a local Postgres database, loads it into AWS S3/Redshift provisioned by Terraform, and visualizes it in QuickSight.
This AWS-based data pipeline manages data from storage in S3 data lakes, through transformation with AWS Glue and Lambda, to refined storage in separate S3 repositories. Using Athena for SQL querying and QuickSight for interactive dashboards, this solution optimizes data processing and visualization, facilitating informed decision-making and insigh
ELT (Extract, Load, Transform) pipeline that fetches stock data from Yahoo Finance, stores it in an S3 bucket, and then loads it into a Redshift Serverless table.
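The load step of a pipeline like this is commonly done with a Redshift `COPY` statement that reads straight from S3. The sketch below only builds that SQL string; it assumes the staged file is a CSV with a header row, which the description does not state. The table, bucket, key, and IAM role names are hypothetical placeholders, not taken from the repository.

```python
def build_copy_statement(table: str, bucket: str, key: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for a CSV object staged in S3.

    Assumption: the staged file is CSV with one header row. All argument
    values are supplied by the caller; nothing here comes from the project.
    """
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(
        build_copy_statement(
            table="stock_prices",
            bucket="example-stock-bucket",
            key="raw/prices.csv",
            iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
        )
    )
```

The resulting string would then be executed against the Redshift Serverless workgroup by whatever SQL client the pipeline uses; that part is left out because the description does not say which one.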
A Python script extracts data from Zillow and stores it in an initial S3 bucket. Lambda functions then handle the flow: one copies the data to a processing bucket, and another transforms it from JSON to CSV format. The final CSV data resides in a third S3 bucket, ready to be loaded into Amazon Redshift for in-depth analysis. QuickSight is used for visualizations.
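The JSON-to-CSV transformation described above can be sketched with the Python standard library alone. This is a minimal illustration, assuming the payload is a JSON array of flat objects; the field names in the usage example (`zpid`, `price`, `city`) are hypothetical, and the S3 read/write calls that a Lambda handler would wrap around this function are omitted.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text.

    Assumption: every record is a flat dict (no nested objects or lists).
    """
    records = json.loads(json_text)
    if not records:
        return ""

    # Collect the union of keys in first-seen order so that a field
    # missing from the first record is not silently dropped.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Fields absent from a record are written as empty cells (restval default).
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample payload, for illustration only.
    sample = '[{"zpid": 1, "price": 350000}, {"zpid": 2, "city": "Houston"}]'
    print(json_records_to_csv(sample))
```

Keeping the conversion in a pure function like this makes it easy to test locally before wiring it into a Lambda handler.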
This project processes real-time cryptocurrency transactions using AWS Glue, Kinesis, DynamoDB, and Apache Hudi for ETL and analytics. The data is analyzed with Athena and visualized in QuickSight for insights.