Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Updated Oct 2, 2024 - Scala
High-performance data store solution
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Extremely distributed machine learning
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
A re-implementation of Hadoop DistCp in Apache Spark
Rapid development of ETL/ELT connectors and pipelines on top of Apache Spark
Smart Automation Tool for building modern Data Lakes and Data Pipelines
phData Pulse: application log aggregation and monitoring
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.