⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Updated Jun 11, 2025 - Python
Code review for data in dbt
Data validation made beautiful and powerful
Great Expectations Airflow operator
re_data - fix data issues before your users and CEO discover them 😊
A simple, easy-to-use data validation library for Python.
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.
⚡ Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.
This library is inspired by Great Expectations. It makes the expectations found in Great Expectations available through Python's built-in unittest assertions.
Data and pipeline testing with and for SQL
Spark Data Test - A PySpark-based automation testing utility to compare Spark DataFrames
A data testing framework that executes queries on configurable data providers and validates the results with customizable YAML-defined assertions. Ensure data integrity, consistency, and reliability effortlessly.
I'm learning how to use dbt with BigQuery so I can apply that knowledge wherever we end up working. It seems like a good DWH interface tool to know for data transformation and testing, and allows me to solidify concepts of testing in data ops.