A weather ETL pipeline implemented using Apache Airflow to extract weather data from the Open-Meteo API, transform it, and load it into a PostgreSQL database for structured storage and analysis.

🌦️ Weather ETL Pipeline with Apache Airflow

This project implements a weather data ETL (Extract, Transform, Load) pipeline using Apache Airflow, designed to fetch current weather data from the Open-Meteo API, transform it, and store it in a PostgreSQL database.


Features

  • Extraction: Pulls live weather data (temperature, wind speed, wind direction, etc.) based on geographic coordinates.
  • Transformation: Parses relevant weather metrics into a structured key-value format.
  • Loading: Stores transformed data into a PostgreSQL table for persistent storage.
  • Modular Design: Built with Airflow decorators and hooks for easy extensibility.
  • Dockerized: Fully containerized with Docker and docker-compose for reproducible environments.
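
As a rough illustration of the extraction, transformation, and loading steps listed above, here is a minimal sketch using only the Python standard library. It is not the project's actual DAG code. It assumes the Open-Meteo /v1/forecast endpoint with the current_weather=true parameter and a response containing a current_weather object. The table name weather_data, its columns, and the helper names build_forecast_url, transform_weather, and as_row are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL, matching the open_meteo_api connection described in this README.
BASE_URL = "https://api.open-meteo.com"

# Hypothetical table and column names; adjust them to the real schema.
INSERT_SQL = (
    "INSERT INTO weather_data "
    "(latitude, longitude, temperature, windspeed, winddirection, weathercode) "
    "VALUES (%s, %s, %s, %s, %s, %s)"
)


def build_forecast_url(latitude: float, longitude: float) -> str:
    """Extraction: build the request URL for current weather at a coordinate."""
    query = urlencode(
        {"latitude": latitude, "longitude": longitude, "current_weather": "true"}
    )
    return f"{BASE_URL}/v1/forecast?{query}"


def transform_weather(payload: dict, latitude: float, longitude: float) -> dict:
    """Transformation: keep only the metrics of interest as flat key-value pairs."""
    current = payload["current_weather"]
    return {
        "latitude": latitude,
        "longitude": longitude,
        "temperature": current["temperature"],
        "windspeed": current["windspeed"],
        "winddirection": current["winddirection"],
        "weathercode": current["weathercode"],
    }


def as_row(record: dict) -> tuple:
    """Loading: order the values to match the placeholders in INSERT_SQL."""
    return (
        record["latitude"],
        record["longitude"],
        record["temperature"],
        record["windspeed"],
        record["winddirection"],
        record["weathercode"],
    )
```

In an Airflow DAG, each of these functions would typically sit inside its own task, with the HTTP request and the INSERT executed through the configured connections rather than called directly.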

Tech Stack

  • Apache Airflow 2.8+
  • Docker & Docker Compose
  • PostgreSQL
  • Python 3.12
  • Open-Meteo API

Project Setup Instructions

1. Clone the Repo

git clone https://github.com/itsabhishekm/Weather.git
cd Weather

2. Configure Airflow Connections

Ensure the following Airflow connections exist:

  • open_meteo_api: HTTP connection to https://api.open-meteo.com
  • postgres_default: PostgreSQL DB connection (update airflow_settings.yaml if needed)
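
One possible way to provide the two connections listed above is through environment variables named AIRFLOW_CONN_<CONN_ID>, which Airflow (2.3 and later) can read in JSON form. The sketch below only generates those variable definitions. The PostgreSQL host, login, password, and database name are placeholder assumptions, not values taken from this project, and conn_env_var is a hypothetical helper.

```python
import json


def conn_env_var(conn_id: str, **fields) -> tuple[str, str]:
    """Return the (variable name, JSON value) pair for one Airflow connection."""
    return f"AIRFLOW_CONN_{conn_id.upper()}", json.dumps(fields)


connections = dict(
    [
        conn_env_var(
            "open_meteo_api",
            conn_type="http",
            host="https://api.open-meteo.com",
        ),
        # Placeholder credentials: replace with the values for your database.
        conn_env_var(
            "postgres_default",
            conn_type="postgres",
            host="postgres",
            login="postgres",
            password="postgres",
            port=5432,
            schema="postgres",
        ),
    ]
)

# Print one NAME=VALUE line per connection.
for name, value in connections.items():
    print(f"{name}={value}")
```

The printed lines can be adapted to wherever the Airflow environment is configured. Connections defined this way do not appear in the Airflow UI's connection list, so the UI or airflow_settings.yaml may still be the more convenient option.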

3. Start Airflow

Start Airflow on your local machine by running 'astro dev start'.

This command will spin up 4 Docker containers on your machine, each for a different Airflow component:

  • Postgres: Airflow's Metadata Database
  • Webserver: The Airflow component responsible for rendering the Airflow UI
  • Scheduler: The Airflow component responsible for monitoring and triggering tasks
  • Triggerer: The Airflow component responsible for triggering deferred tasks

Verify that all 4 Docker containers were created by running 'docker ps'.

Note: Running 'astro dev start' will start your project with the Airflow Webserver exposed at port 8080 and Postgres exposed at port 5432. If you already have either of those ports allocated, you can either stop your existing Docker containers or change the port.

To access the Airflow UI for your local project, go to http://localhost:8080/ and log in with 'admin' as both the username and password.
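
Besides logging in, the components started above can be checked from a script. The following is a minimal sketch that assumes the Airflow 2.x webserver's /health endpoint is reachable at http://localhost:8080 and returns a JSON object with a status field per component. The helper names fetch_health and unhealthy_components are hypothetical.

```python
import json
from urllib.request import urlopen

# Components this README expects to be running (the webserver serves the endpoint itself).
EXPECTED = ("metadatabase", "scheduler", "triggerer")


def fetch_health(base_url: str = "http://localhost:8080") -> dict:
    """Request the webserver's health report and decode it as JSON."""
    with urlopen(f"{base_url}/health", timeout=10) as response:
        return json.load(response)


def unhealthy_components(health: dict, expected=EXPECTED) -> list[str]:
    """List the expected components whose reported status is not 'healthy'."""
    return [
        name
        for name in expected
        if (health.get(name) or {}).get("status") != "healthy"
    ]


# Example usage once the containers are up:
# print(unhealthy_components(fetch_health()))
```

An empty list suggests the metadata database, scheduler, and triggerer are all reporting as healthy. Any names in the list point at the container to inspect first.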
