Starred repositories
A list of Free Software network services and web applications which can be hosted on your own servers
🧙 Build, run, and manage data pipelines for integrating and transforming data.
🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)
Create code snippets, browse AI prompts, create extension icons and more.
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
An IoT application using Spark, Delta Lake, Hive, Minio, Presto, Superset and Airflow.
Download daily forex rates and append them to a Hive table
A Data Engineering & Machine Learning Knowledge Hub
🖖 A vue-cli 3.0 + TypeScript minimal admin template
Lightweight, standalone C++ inference engine for Google's Gemma models.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
📙 Awesome Data Catalogs and Observability Platforms.
The Metadata Platform for your Data and AI Stack
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Source code for Twitter's Recommendation Algorithm
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Open-source keyboard firmware for Atmel AVR and Arm USB families
Free Data Engineering course!
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
DuckDB is an analytical in-process SQL database management system
🚀Memory safe, blazing fast, configurable, minimal hello world written in rust(🚀) in a few lines of code with few(1092🚀) dependencies🚀
Docker with Airflow and Spark standalone cluster
Empowering everyone to build reliable and efficient software.
This repository started out as a learning in public project for myself and has now become a structured learning map for many in the community. We have 3 years under our belt covering all things Dev…