open-table-format

Here are 9 public repositories matching this topic...

victorskl / genomic-bigdata-spark

Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture

bioinformatics spark bigdata cloud-computing datawarehousing parquet cloudnative datalake spark-sql genomics-data delta-lake lakehouse open-table-format

Updated Jan 19, 2023
Jupyter Notebook

mrutunjay-kinagi / ragsearch

ragsearch is a Python library designed for building a Retrieval-Augmented Generation (RAG) application that enables natural language querying over both structured and unstructured data. This tool leverages embedding models and a vector database (FAISS or ChromaDB) to provide an efficient and scalable search engine.

opensource ai generator retrieval augument rag vector-database llms generative-ai genai retrieval-augmented-generation ragsearch open-table-format

Updated Apr 23, 2026
Python

guptaakashdeep / spark-minio-project

Builds a local Spark Standalone cluster on Docker with MinIO integration

apache-spark minio open-table-format

Updated Dec 22, 2024
Jupyter Notebook

victorskl / iceberg-tute

A quick look at the Iceberg tables that underpin an Iceberg data lake

open-table-format

Updated Aug 3, 2024
Jupyter Notebook

victorskl / deltalake-tute

A quick look at the Delta tables that underpin Delta Lake

open-table-format

Updated Aug 3, 2024
Jupyter Notebook

venkatchittoor / iceberg-lakehouse-comparison

Apache Iceberg vs Delta Lake — same Medallion pipeline built on two different table formats

spark comparison pyspark data-engineering parquet schema-evolution partitioning time-travel apache-iceberg delta-lake lakehouse data-lakehouse medallion-architecture open-table-format

Updated Apr 21, 2026
Python

jiatangzhi / master_thesis

This project implements my master’s thesis on building a scalable, ACID-compliant data lakehouse architecture for IoT and industrial workloads in an AWS-native environment.