Compare tables within or across databases
Updated May 17, 2024 - Python
Scalable and efficient data transformation framework - backwards compatible with dbt.
Code and data for the Modern Polars book
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
A Data Platform built for AWS, powered by Kubernetes.
Simple stream processing pipeline
Data Engineering/Scraping Project. Creating a detailed Sports Relational Database for the Top European Soccer Leagues.
Found a data engineering challenge or participated in a selection process? Share with us!
Data engineering interview Q&A for the data community, by the data community
Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.
Project for "Data pipeline design patterns" blog.
This repo demonstrates the development of a real-time data pipeline designed to ingest, process, and analyze stock market data. Using cutting-edge tools like Apache Kafka, PostgreSQL, and Python, the pipeline captures stock data in real time and stores it in a robust data architecture, enabling timely analysis and insights.
This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, production to a Kafka topic, and connection to Elasticsearch.
Datu Core AI Analyst open-source
An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reliability. Uniquely, these pipelines are field-tested on farms across Sumatra, Indonesia, ensuring real-world applicability and resilience.
Kedro CLI plugin for generating a static Kedro-Viz site (HTML, CSS, JS) that can be deployed on many serverless tools.