Open
Description
Background and Motivation
Our current CI uses auron-project/tpcds-validator, which runs the TPC-DS queries and passes each one as long as it throws no exception. This check is too permissive and misses critical regressions:
- Result correctness: Query results diverging from vanilla Spark.
- Plan stability: Unintended loss of native operators, unstable plans, or unintended fallback.
Gating only on "no crash" in TPC-DS tests isn't enough as Auron evolves.
We need CI to enforce result equivalence and plan stability to protect core optimizations.
Proposal
Introduce a standalone integration test module auron-it that:
- Executes each TPC-DS query in two modes: vanilla Spark and Auron.
- Compares the results of the two modes (exact match, or within a small tolerance for floating-point values). Mismatches fail the CI and produce concise, readable diffs.
- Validates Auron’s physical plans against versioned “gold files” to enforce plan stability and detect operator loss or unintended fallbacks.
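The two checks above can be sketched roughly as follows. This is only an illustration in Python, not the proposed auron-it implementation: the function names (`compare_rows`, `normalize_plan`, `plan_diff`), the tolerance values, and the choice to mask expression IDs before diffing are all assumptions. It also assumes both result sets have already been collected as lists of tuples in the same deterministic order.

```python
import difflib
import math
import re


def compare_rows(expected, actual, rel_tol=1e-6, abs_tol=1e-9):
    """Return readable mismatch messages; an empty list means the results agree.

    `expected` would come from vanilla Spark and `actual` from Auron
    (hypothetical inputs). Both are lists of tuples in the same order.
    """
    if len(expected) != len(actual):
        return [f"row count: expected {len(expected)}, got {len(actual)}"]
    mismatches = []
    for i, (exp_row, act_row) in enumerate(zip(expected, actual)):
        if len(exp_row) != len(act_row):
            mismatches.append(
                f"row {i}: expected {len(exp_row)} columns, got {len(act_row)}"
            )
            continue
        for j, (e, a) in enumerate(zip(exp_row, act_row)):
            if isinstance(e, float) and isinstance(a, float):
                # Floats match within the tolerance; NaN is treated as equal to NaN.
                same = (math.isnan(e) and math.isnan(a)) or math.isclose(
                    e, a, rel_tol=rel_tol, abs_tol=abs_tol
                )
            else:
                # Everything else (ints, strings, None, ...) must match exactly.
                same = e == a
            if not same:
                mismatches.append(f"row {i}, col {j}: expected {e!r}, got {a!r}")
    return mismatches


def normalize_plan(plan_text):
    """Mask run-specific expression IDs such as `#123` (an assumed normalization)."""
    return re.sub(r"#\d+", "#N", plan_text).strip()


def plan_diff(gold_text, actual_text):
    """Return unified-diff lines between the gold plan and the actual plan."""
    gold = normalize_plan(gold_text).splitlines()
    actual = normalize_plan(actual_text).splitlines()
    return list(
        difflib.unified_diff(gold, actual, fromfile="gold", tofile="actual", lineterm="")
    )
```

A test would fail when either function returns a non-empty list and print the returned lines as the diff. Masking expression IDs is one way to keep gold files stable across runs; the real module may need a different or broader normalization.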
Maintainability and CI Integration
- Fully decouples TPC-DS tests from unit tests by placing them in the dedicated auron-it module, which makes them easier to maintain and evolve.
- Tests run only when the TPCDS_DATA_DIR env var is set, keeping them optional yet easy to enable in CI and convenient for local development and debugging.
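The opt-in behaviour described above could look something like the sketch below. It is illustrative Python rather than the module's actual code: the helper `tpcds_data_dir`, the class `TpcdsSuite`, and the decision to treat a blank value as unset are assumptions. Only the `TPCDS_DATA_DIR` variable name comes from the proposal.

```python
import os
import unittest


def tpcds_data_dir(env=None):
    """Return the configured TPC-DS data directory, or None if it is not set.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    value = env.get("TPCDS_DATA_DIR", "").strip()
    # Assumption: a blank value counts as "not set".
    return value or None


@unittest.skipUnless(tpcds_data_dir(), "TPCDS_DATA_DIR is not set")
class TpcdsSuite(unittest.TestCase):
    """Hypothetical suite that only runs when the data directory is configured."""

    def test_data_dir_exists(self):
        self.assertTrue(os.path.isdir(tpcds_data_dir()))
```

With this shape, a plain local test run reports the suite as skipped, while CI enables it by exporting `TPCDS_DATA_DIR` before invoking the tests.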
zuston