This project implements a weather data ETL (Extract, Transform, Load) pipeline using Apache Airflow, designed to fetch current weather data from the Open-Meteo API, transform it, and store it in a PostgreSQL database.
- Extraction: Pulls live weather data (temperature, wind speed, wind direction, etc.) for a given set of geographic coordinates.
- Transformation: Parses the relevant weather metrics into a structured key-value format (see the sketch after this list).
- Loading: Stores transformed data into a PostgreSQL table for persistent storage.
- Modular Design: Built with Airflow decorators and hooks for easy extensibility.
- Dockerized: Fully containerized with Docker and `docker-compose` for reproducible environments.
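To make the transformation step concrete, here is a minimal sketch of the key-value parsing. The `current_weather` payload shape follows the Open-Meteo API; the `transform_weather_data` name and the exact fields kept are illustrative assumptions, not necessarily this project's actual code:

```python
def transform_weather_data(api_response: dict) -> dict:
    """Flatten an Open-Meteo response into a single key-value record.

    A sketch: field names beyond temperature/windspeed/winddirection
    are assumptions about the response and the target schema.
    """
    current = api_response["current_weather"]
    return {
        "latitude": api_response["latitude"],
        "longitude": api_response["longitude"],
        "temperature": current["temperature"],
        "windspeed": current["windspeed"],
        "winddirection": current["winddirection"],
        "weathercode": current["weathercode"],
    }
```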
- Apache Airflow 2.8+
- Docker & Docker Compose
- PostgreSQL
- Python 3.12
- Open-Meteo API
git clone https://github.com/itsabhishekm/Weather.git
cd Weather
Ensure the following Airflow connections exist:
- `open_meteo_api`: HTTP connection to https://api.open-meteo.com
- `postgres_default`: PostgreSQL DB connection (update `airflow_settings.yaml` if needed)
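As a sketch of how a DAG might consume these connections through Airflow's `HttpHook` and `PostgresHook` (the decorator-and-hook pattern noted above), consider the following. The DAG id, schedule, coordinates, and `weather_data` table name are assumptions for illustration, not the project's actual code:

```python
from airflow.decorators import dag, task
from airflow.providers.http.hooks.http import HttpHook
from airflow.providers.postgres.hooks.postgres import PostgresHook
from pendulum import datetime

LATITUDE, LONGITUDE = 51.5074, -0.1278  # example coordinates (assumed)

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def weather_etl():
    @task
    def extract() -> dict:
        # Resolves the open_meteo_api HTTP connection configured above
        hook = HttpHook(method="GET", http_conn_id="open_meteo_api")
        response = hook.run(
            endpoint="/v1/forecast",
            data={"latitude": LATITUDE, "longitude": LONGITUDE,
                  "current_weather": "true"},
        )
        return response.json()

    @task
    def transform(payload: dict) -> dict:
        # See the transform sketch earlier in this README
        current = payload["current_weather"]
        return {"temperature": current["temperature"],
                "windspeed": current["windspeed"],
                "winddirection": current["winddirection"]}

    @task
    def load(record: dict) -> None:
        # Resolves the postgres_default connection configured above;
        # weather_data is an assumed table name
        hook = PostgresHook(postgres_conn_id="postgres_default")
        hook.run("""CREATE TABLE IF NOT EXISTS weather_data (
                        temperature FLOAT,
                        windspeed FLOAT,
                        winddirection FLOAT)""")
        hook.run(
            "INSERT INTO weather_data VALUES (%s, %s, %s)",
            parameters=(record["temperature"], record["windspeed"],
                        record["winddirection"]),
        )

    load(transform(extract()))

weather_etl()
```

With the TaskFlow API, each task's return value is passed to the next via XCom automatically, which is why the tasks can simply call one another at the bottom of the DAG.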
Run `astro dev start` to start the project. This command will spin up four Docker containers on your machine, each for a different Airflow component:
- Postgres: Airflow's Metadata Database
- Webserver: The Airflow component responsible for rendering the Airflow UI
- Scheduler: The Airflow component responsible for monitoring and triggering tasks
- Triggerer: The Airflow component responsible for triggering deferred tasks
Verify that all four Docker containers were created by running `docker ps`.
Note: Running `astro dev start` will start your project with the Airflow Webserver exposed at port 8080 and Postgres exposed at port 5432. If either port is already allocated, you can either stop your existing Docker containers or change the port.
Access the Airflow UI for your local project at http://localhost:8080/ and log in with `admin` as both the username and password.
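After a successful DAG run, you can sanity-check the loaded rows over the exposed Postgres port (5432). This sketch assumes the `weather_data` table name used in the DAG sketch above and the default local Astro Postgres credentials (`postgres`/`postgres`); adjust both if your setup differs:

```python
import psycopg2  # pip install psycopg2-binary

# Assumed defaults for the local Astro dev Postgres container
conn = psycopg2.connect(
    host="localhost", port=5432,
    user="postgres", password="postgres", dbname="postgres",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT * FROM weather_data LIMIT 5;")
    for row in cur.fetchall():
        print(row)
conn.close()
```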