⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Updated Jan 12, 2025 - Python
Code review for data in dbt
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Open Source Data Quality Monitoring.
re_data - fix data issues before your users and CEO discover them 😊
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
Swiple enables you to easily observe, understand, validate and improve the quality of your data
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution by data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, & continuous anomaly monitoring
Open-source metadata collector based on ODD Specification
DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.
DataOps Observability Integration Agents are part of DataKitchen's Open Source Data Observability. They connect to various ETL, ELT, BI, data science, data visualization, data governance, and data analytic tools. They provide logs, messages, metrics, overall run-time start/stop, subtask status, and scheduling information to DataOps Observability.
⚡ Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.
Open-source GCP metadata collector based on ODD Specification
Trending code, platforms, tools, and processes.
A simple-to-use EventEmitter and Data-Observer Python package.
Automatically validate datasets, poll task status, and display validation results in a GitHub pull request using Swiple.