Open source stack lakehouse
Updated Mar 2, 2024 · Python
Command line interface to the services provided by Oslo Origo's Dataplatform
Data platform to build batch and real-time ETL flows using only open source technologies.
How to build a complete Data Platform -> Here
A complete open-source data platform with ETL, data warehouse, and visualization
Dataplatform hosted events
Big Data Platform on MongoDB Atlas and Heroku PostgreSQL
This project's goal is to design a data platform for retail data analytics.
AWS Lambda function for generating presigned URLs and form fields used to upload files to S3
This ETL project demonstrates the development of a scalable data pipeline for customer sales analysis. It covers all essential steps, from data extraction to transformation and loading into a database, using Apache Airflow.
Yet Another Data Platform - Kubernetes-based data platform composed of open source components
divith-raju-big-data-tools
REST API for creating and managing event streams and sending data to event streams. Obsolete as of 2022-05-20.
The Spark Memory Configuration Calculator is designed to help data engineers and Spark developers quickly determine the optimal memory and core configurations for their Spark clusters. With this tool, you can avoid common pitfalls and ensure your cluster resources are used efficiently, leading to better performance and lower costs.
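As a rough illustration of the kind of arithmetic such a calculator performs, the sketch below applies a widely used rule of thumb. It is not taken from the listed repository. The function name `plan_executors`, the class `ExecutorPlan`, the reserve of one core and 1 GB per node, the default of 5 cores per executor, and the one executor slot set aside for the driver are all assumptions made for this example. The overhead floor of 384 MiB and the 10% factor mirror Spark's documented defaults for `spark.executor.memoryOverhead`, though here the factor is applied to the whole per-executor slot, which is slightly conservative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorPlan:
    executors_per_node: int
    total_executors: int
    cores_per_executor: int
    executor_memory_gb: int
    memory_overhead_mb: int

    def as_spark_conf(self) -> dict:
        """Map the plan onto standard Spark configuration keys."""
        return {
            "spark.executor.instances": str(self.total_executors),
            "spark.executor.cores": str(self.cores_per_executor),
            "spark.executor.memory": f"{self.executor_memory_gb}g",
            "spark.executor.memoryOverhead": f"{self.memory_overhead_mb}m",
        }


def plan_executors(nodes: int, cores_per_node: int, memory_per_node_gb: int,
                   cores_per_executor: int = 5) -> ExecutorPlan:
    """Rule-of-thumb executor sizing (hypothetical helper, not a Spark API)."""
    # Assumption: leave 1 core and 1 GB per node for the OS and daemons.
    usable_cores = cores_per_node - 1
    usable_memory_gb = memory_per_node_gb - 1

    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for one executor")

    # Assumption: one executor slot cluster-wide is left for the driver.
    total_executors = nodes * executors_per_node - 1
    if total_executors < 1:
        raise ValueError("cluster too small to leave a slot for the driver")

    # Memory available to each executor (heap plus overhead).
    slot_gb = usable_memory_gb / executors_per_node
    # Overhead: at least 384 MiB, otherwise 10% of the slot (conservative).
    overhead_gb = max(0.384, 0.10 * slot_gb)
    heap_gb = math.floor(slot_gb - overhead_gb)
    if heap_gb < 1:
        raise ValueError("not enough memory per executor")

    return ExecutorPlan(
        executors_per_node=executors_per_node,
        total_executors=total_executors,
        cores_per_executor=cores_per_executor,
        executor_memory_gb=heap_gb,
        memory_overhead_mb=math.ceil(overhead_gb * 1024),
    )


if __name__ == "__main__":
    # Example cluster: 10 nodes, 16 cores and 64 GB of RAM each.
    plan = plan_executors(nodes=10, cores_per_node=16, memory_per_node_gb=64)
    for key, value in plan.as_spark_conf().items():
        print(f"{key}={value}")
```

The numbers it produces are a starting point only; real workloads usually need tuning against observed shuffle, caching, and garbage-collection behavior.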