feat: Integration Tests for Denormalization #671

@vaibhav-datazip

Description

Problem

The codebase supports both normalized and denormalized data storage, selected by the normalization flag in stream metadata, but it lacks comprehensive integration tests that validate denormalization behavior across drivers and scenarios. This gap creates several problems:

  1. Insufficient End-to-End Validation: The codebase has logic to handle denormalized data storage (storing raw JSON in the StringifiedData column when normalization=false), but no integration test verifies that this works from source to destination. Without such tests, we cannot be certain that denormalized data is:

    • Stored as JSON strings in the destination
    • Preserved with all nested structures intact
    • Readable and parseable after storage
    • Handled correctly during CDC operations (insert, update, delete)
  2. Driver-Specific Behavior Unknown: Different drivers (MySQL, Postgres, MongoDB) have different data structures:

    • Relational drivers (MySQL, Postgres) default to normalization=true and store flattened columns
    • Non-relational drivers (MongoDB) default to normalization=false and store raw JSON
    • Without integration tests, we cannot verify that storage works correctly when normalization is explicitly set to false for relational drivers or to true for non-relational drivers

The list above is not exhaustive. Without proper integration tests, users cannot be confident that denormalized mode works as expected for their use cases.
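
As a rough illustration of the end-to-end check described above, the sketch below parses a value read back from the StringifiedData column and compares it with the record written at the source, nested fields included. The function name checkStringifiedRecord and its signature are hypothetical and do not exist in the codebase; how the stored string is fetched from the destination is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// checkStringifiedRecord is a hypothetical helper (not part of the codebase).
// It verifies that the JSON text stored in the destination parses cleanly and
// matches the source record, including nested objects and arrays.
func checkStringifiedRecord(stored string, source map[string]any) error {
	var got any
	if err := json.Unmarshal([]byte(stored), &got); err != nil {
		return fmt.Errorf("stored value is not valid JSON: %w", err)
	}

	// Round-trip the expected record through encoding/json so both sides use
	// the same Go types (for example, every JSON number becomes float64).
	raw, err := json.Marshal(source)
	if err != nil {
		return fmt.Errorf("cannot encode source record: %w", err)
	}
	var want any
	if err := json.Unmarshal(raw, &want); err != nil {
		return fmt.Errorf("cannot decode source record: %w", err)
	}

	if !reflect.DeepEqual(got, want) {
		return fmt.Errorf("record mismatch: got %v, want %v", got, want)
	}
	return nil
}
```

A test would call this once per synced row and fail on the first non-nil error. Comparing decoded values rather than raw strings keeps the check independent of key order and whitespace in the stored JSON.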

Solution

Architecture

The integration tests should follow the existing test structure in utils/testutils/test_utils.go and extend the IntegrationTest type to support denormalization testing scenarios.
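
One possible way to enumerate the cases such an extension would cover is a driver-by-mode matrix, sketched below. The scenario type, defaultNormalization map, and buildScenarios function are hypothetical names for illustration only; they do not reflect the actual fields of IntegrationTest. The default values are the ones stated in the Problem section.

```go
package main

import (
	"fmt"
	"sort"
)

// scenario is a hypothetical description of one test run: a driver paired
// with an explicit normalization setting.
type scenario struct {
	Driver        string
	Normalization bool
	IsDefault     bool // true when the setting matches the driver's default
}

// Name returns a label suitable for a subtest name.
func (s scenario) Name() string {
	return fmt.Sprintf("%s/normalization=%t", s.Driver, s.Normalization)
}

// defaultNormalization records the defaults described in this issue:
// relational drivers normalize by default, MongoDB does not.
var defaultNormalization = map[string]bool{
	"mysql":    true,
	"postgres": true,
	"mongodb":  false,
}

// buildScenarios returns both normalization modes for every driver,
// in a stable (alphabetical) driver order.
func buildScenarios() []scenario {
	drivers := make([]string, 0, len(defaultNormalization))
	for d := range defaultNormalization {
		drivers = append(drivers, d)
	}
	sort.Strings(drivers)

	var out []scenario
	for _, d := range drivers {
		for _, mode := range []bool{true, false} {
			out = append(out, scenario{
				Driver:        d,
				Normalization: mode,
				IsDefault:     mode == defaultNormalization[d],
			})
		}
	}
	return out
}
```

Each scenario could then be run as a subtest with t.Run(s.Name(), ...), so that a failure names the driver and mode involved. The non-default combinations are the ones the Problem section singles out as unverified.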

Expected Outcomes

  • Comprehensive integration test coverage for denormalization functionality across MySQL, Postgres, and MongoDB
  • Validation that denormalized data is correctly stored as JSON strings
  • Verification that complex nested data structures are preserved in denormalized mode
