Intelligent SQL query router with cost-based optimization, automatic backend selection (DuckDB/Polars/Spark), and partition pruning for 50-100x speedups
Updated Nov 24, 2025 - Python
🪂 Parachute: Single-Pass Bi-Directional Information Passing (VLDB'25)
PySpark ETL & analytics pipeline for taxi trip ETA, partitioned Parquet, windowed aggregations and performance patterns.
🚖 Ingest and analyze NYC yellow taxi data with a streamlined ETL pipeline, featuring data cleaning, analytics, and business-ready outputs.