    Repositories list

    • Daft

      Public
      Distributed data engine for Python/SQL designed for the cloud, powered by Rust
      Rust
      Apache License 2.0
      1452.2k18647 Updated Oct 5, 2024
    • Daft landing page
      2020 Updated Oct 4, 2024
    • A simple launcher for spinning up and managing Ray clusters for the Daft Query Engine.
      Python
      Apache License 2.0
      0310 Updated Oct 4, 2024
    • HTML
      0000 Updated Sep 29, 2024
    • JavaScript
      0000 Updated Aug 15, 2024
    • Building a simple Multimodal Data Warehouse: workflows to ingest, analyze, process & train models on multimodal data
      Jupyter Notebook
      Apache License 2.0
      0000 Updated Jul 23, 2024
    • Open, Multi-modal Catalog for Data & AI
      Java
      Apache License 2.0
      360000 Updated Jun 14, 2024
    • parquet2

      Public
      Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow
      Rust
      Other
      59002 Updated May 31, 2024
    • deltacat

      Public
      A Pythonic Data Catalog powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.
      Python
      Apache License 2.0
      22000 Updated May 2, 2024
    • arrow2

      Public
      Transmute-free Rust library to work with the Arrow format
      Rust
      Apache License 2.0
      222100 Updated Apr 28, 2024
    • Convert sequences of Rust objects to Arrow tables
      Rust
      MIT License
      20000 Updated Apr 4, 2024
    • Benchmarking of distributed query engines
      Python
      Apache License 2.0
      0500 Updated Dec 12, 2023
    • Rust
      Apache License 2.0
      10000 Updated Nov 29, 2023
    • ludwig

      Public
      Data-centric declarative deep learning framework
      Python
      Apache License 2.0
      1.2k001 Updated Oct 26, 2023
    • Code for generating tables of data and tabular files (CSV, JSON, Parquet etc) for testing
      Thrift
      Apache License 2.0
      0500 Updated Jul 25, 2023
    • Demonstration of Daft on Flyte
      Apache License 2.0
      0100 Updated Jul 7, 2023
    • icebridge

      Public
      A Python Bridge to Apache Iceberg using Py4J
      Python
      Apache License 2.0
      2000 Updated Sep 27, 2022
    • MNIST data in JSON format
      0100 Updated Sep 1, 2022
    • Kubernetes spawner for JupyterHub in the Eventual Hub
      Python
      BSD 3-Clause "New" or "Revised" License
      304100 Updated Jul 23, 2022