Real-Time Event Streaming & Change Data Capture
Running an ETL pipeline with COBOL on Kubernetes
In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.
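As a rough illustration of how the Debezium piece fits in, the sketch below registers a PostgreSQL source connector through the Kafka Connect REST API. The hostname, credentials, database, and table names are placeholders, and some configuration keys differ between Debezium versions (topic.prefix is the 2.x name).

    #!/usr/bin/env bash
    # Register a Debezium PostgreSQL source connector with Kafka Connect.
    # Assumes Kafka Connect listens on localhost:8083; the database host,
    # credentials, and table list below are placeholders.
    curl -s -X POST http://localhost:8083/connectors \
      -H "Content-Type: application/json" \
      -d '{
        "name": "inventory-cdc",
        "config": {
          "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
          "database.hostname": "postgres",
          "database.port": "5432",
          "database.user": "debezium",
          "database.password": "change-me",
          "database.dbname": "inventory",
          "topic.prefix": "cdc",
          "table.include.list": "public.orders"
        }
      }'

Once the connector is running, row-level changes from public.orders appear as events on a Kafka topic (cdc.public.orders with this configuration) that downstream consumers can subscribe to.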
Finnhub data streaming pipeline for real-time analysis of Bitcoin trades.
Ozone Analytics provides modular data pipelines for streaming, flattening, storage, and visualization—powered by Flink, Kafka, PostgreSQL, Drill, MinIO, and Superset.
A scalable data warehouse solution designed for AI-driven traffic analytics using vehicle trajectory data from swarm UAVs. Built with Airflow for orchestration, dbt for data transformation, PostgreSQL for storage, and Redash for visualization.
Extract, Transform, and Load (ETL) processes are used when flexibility, speed, and scalability in handling data are crucial to an organization.
A data engineering project.
This repository includes all files that make up the design and unification of the AdventureWorks and WideWorldAdventure databases.
Weather ETL pipeline built entirely with Bash scripts and a cron job (no manual work).
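A minimal sketch of what such a Bash-only weather ETL can look like is below; the API endpoint, JSON field names, and output path are assumptions, and it relies on curl and jq being installed.

    #!/usr/bin/env bash
    # weather_etl.sh -- extract current weather from an HTTP API, transform the
    # JSON payload into a CSV row, and load (append) it into a local data file.
    set -euo pipefail

    API_URL="https://api.example.com/v1/current?city=Berlin"   # placeholder endpoint
    OUT_FILE="$HOME/data/weather.csv"

    # Extract: fetch the raw JSON payload.
    raw=$(curl -sf "$API_URL")

    # Transform: pull out the fields of interest (field names are assumptions).
    row=$(echo "$raw" | jq -r '[.timestamp, .temperature, .humidity] | @csv')

    # Load: append to the CSV, writing a header on first run.
    mkdir -p "$(dirname "$OUT_FILE")"
    [ -f "$OUT_FILE" ] || echo "timestamp,temperature,humidity" > "$OUT_FILE"
    echo "$row" >> "$OUT_FILE"

The "no manual work" part comes from scheduling the script with cron, for example an hourly entry such as: 0 * * * * /home/user/weather_etl.sh >> /home/user/weather_etl.log 2>&1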
Building a data warehouse from scratch using PostgreSQL as the primary DBMS.
ETL process which loads and transforms Medicare hospital data using Python and Hive
A simple shell script that performs an ETL pipeline using Bash scripting in a Linux environment.
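For a file-based pipeline of this kind, chaining standard Unix tools is usually enough; the sketch below uses made-up paths and a made-up column layout, filtering and reordering a CSV with awk before loading the result into a target directory.

    #!/usr/bin/env bash
    # Minimal file-based ETL: extract a source CSV, transform it with awk,
    # and load the result into a target directory. Paths and the column
    # layout (id,date,amount) are illustrative only.
    set -euo pipefail

    SRC="/tmp/source/sales.csv"     # extracted input (placeholder)
    DEST_DIR="/tmp/warehouse"       # load target (placeholder)

    mkdir -p "$DEST_DIR"

    # Transform: keep the header plus rows with a positive amount (column 3),
    # reordering columns to id,amount,date.
    awk -F',' 'NR == 1 || $3 > 0 { print $1 "," $3 "," $2 }' "$SRC" \
      > "$DEST_DIR/sales_clean.csv"

    echo "Loaded $(($(wc -l < "$DEST_DIR/sales_clean.csv") - 1)) rows into $DEST_DIR"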
This project implements an extract, transform, and load (ETL) process.
Built an ETL pipeline that extracts climate data from an API, transforms it by combining all of the extracted data into a single file, and loads that file into an output folder.
Demonstrates how to set up a robust data engineering environment by containerizing Hadoop ecosystem components and other essential services with Docker. The setup includes Hadoop (HDFS, YARN), Apache Hive, PostgreSQL, and Apache Airflow, all configured to work together seamlessly.
A shell script that ensures recent updates to sensitive files are regularly backed up, enhancing data security and reducing manual effort.
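One way to implement that with standard tools is sketched below: find files modified within the last day and archive them into a timestamped tarball. The source and backup directories, and the one-day window, are assumptions.

    #!/usr/bin/env bash
    # Back up files modified within the last 24 hours into a timestamped
    # tarball. Source and backup directories are placeholders; a daily
    # cron job would keep this running without manual effort.
    set -euo pipefail

    SRC_DIR="/etc/secure-configs"              # directory of sensitive files (assumed)
    BACKUP_DIR="/var/backups/secure-configs"   # backup destination (assumed)
    STAMP=$(date +%Y%m%d_%H%M%S)

    mkdir -p "$BACKUP_DIR"

    # Collect recently modified files and archive them only if any exist.
    mapfile -t changed < <(find "$SRC_DIR" -type f -mtime -1)
    if [ "${#changed[@]}" -gt 0 ]; then
      tar -czf "$BACKUP_DIR/backup_$STAMP.tar.gz" "${changed[@]}"
      echo "Backed up ${#changed[@]} file(s) to backup_$STAMP.tar.gz"
    else
      echo "No recently modified files to back up."
    fi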
A robust real-time data streaming pipeline using Apache Kafka for event ingestion, Apache Flink for real-time processing, and PostgreSQL for storage and analytics. Designed for low-latency insights and scalable data workflows.
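To make the ingestion side concrete, here is a minimal sketch, assuming a local single-broker Kafka on localhost:9092 and a topic named events, that creates the ingestion topic and publishes a test event a Flink job could then consume.

    # Create the ingestion topic (broker address, topic name, and sizing are assumptions).
    kafka-topics.sh --create \
      --bootstrap-server localhost:9092 \
      --topic events \
      --partitions 3 \
      --replication-factor 1

    # Publish a sample JSON event for the downstream Flink job to process.
    echo '{"event_id": 1, "type": "page_view", "ts": "2025-04-16T12:00:00Z"}' | \
      kafka-console-producer.sh --bootstrap-server localhost:9092 --topic events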
Using large language models and AWS Bedrock to orchestrate an ETL pipeline