Starred repositories
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Share interesting, entry-level open source projects on GitHub.
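Latest 2023 roundup of technical interview questions from Alibaba, Tencent, Baidu, Meituan, Toutiao, and others, with answers and consolidated analysis from expert question setters.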
Chinese translation of "Designing Data-Intensive Applications" (DDIA)
Open source platform for the machine learning lifecycle
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
An orchestration platform for the development, production, and observation of data assets.
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
A community driven list of useful Scala libraries, frameworks and software.
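🏔️ Knowledge base of academic papers in social science, economics, mathematics, game theory, philosophy, and systems engineering from National Taiwan University, National University of Singapore, Waseda University, the University of Tokyo, Academia Sinica (Taiwan), and key Chinese universities and research institutions.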
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation,…
Easy & Flexible Alerting With Elasticsearch
The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Standardized Serverless ML Inference Platform on Kubernetes
Application self-hosting platform for running open source software: a web-based Linux panel and lite PaaS
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
The interoperable, open source catalog for Apache Iceberg
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Data Pipeline Framework using the singer.io spec
FeatHub - A stream-batch unified feature store for real-time machine learning
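Argo-based cloud-native scheduling, project management, online notebooks, online image building, drag-and-drop pipeline orchestration, scheduled runs, and instance management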