YTsaurus is a scalable and fault-tolerant open-source big data platform.
Updated Oct 14, 2024 - C++
ClickHouse® is a real-time analytics DBMS
Simple and Distributed Machine Learning
A powerful open-source IoT simulation framework, written in Python, for simulating and analyzing devices. Supports the MQTT and HTTP protocols. Designed for versatility, ease of use, and extensive IoT experimentation.
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
Data-Centric Pipelines and Data Versioning
A curated list of awesome tools and libraries for specific domains
Coursework written for: Astroinformatics – Astrostatistics and Machine Learning in Astronomy, MASS course offered at the University of Belgrade
AI + Data, online. https://vespa.ai
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
SageWorks: An easy-to-use Python API for creating and deploying AWS SageMaker Models
Server for the ListenBrainz project, including the front-end (JavaScript/React) code that it serves and all of the data processing components that LB uses.
Postgres for Search and Analytics
A large-scale entity and relation database supporting aggregation of properties
Apache Ignite
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)