The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Updated Jun 3, 2025 - Python
DataOps is an automated, process-oriented methodology used by analytics and data teams to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has matured into a new and independent approach to data analytics. DataOps applies to the entire data lifecycle, from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.
Scalable and efficient data transformation framework - backwards compatible with dbt.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.
One framework to develop, deploy and operate data workflows with Python and SQL.
The data-validation toolkit for enhanced dbt (data build tool) PR review
😎 A curated list of awesome DataOps tools
Open Source Data Quality Monitoring.
Unified storage framework for the entire machine learning lifecycle
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
Perform transformations on your data with natural language using LLMs
Interactive computing for complex data processing, modeling and analysis in Python 3
End-to-end DataOps platform deployed by Terraform.
DataOps framework for Machine Learning projects.
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, hygiene review of new datasets, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.
Run LLM-related tools in containers.
DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.