Skip to content

Conversation

@tianzhou
Copy link
Contributor

Replace parser-based approach with embedded PostgreSQL for validating and
inspecting desired state schemas during plan generation. This ensures test
and production code paths are identical, improving reliability and reducing
maintenance burden.

Key Changes

Architecture:

  • Remove SQL parser implementation (~3,800 lines)
    • Delete ir/parser.go and ir/parser_test.go
    • Parser was complex, fragile, and diverged from production behavior
  • Add embedded PostgreSQL for plan generation
    • New cmd/util/embedded_postgres.go with lifecycle management
    • Temporary instances with unique runtime paths and dynamic ports
    • Automatic cleanup on completion
  • Plan generation now:
    1. Starts temporary embedded PostgreSQL instance
    2. Applies desired state SQL to embedded PostgreSQL
    3. Inspects embedded PostgreSQL using existing inspector code
    4. Generates diff between current DB and desired state IR
    5. Cleans up embedded PostgreSQL

Benefits:

  • Test and production use identical code path (inspector)
  • No parser/inspector parity issues
  • Validates desired state SQL is actually valid PostgreSQL
  • Catches SQL errors during plan generation instead of apply
  • Removes ~5,300 lines of parser code and tests

Test Fixes:

  • Fix TestApplyCommand_CreateIndexConcurrently
    • Remove CONCURRENTLY from desired state (migration detail, not state)
    • Desired state uses plain CREATE INDEX
    • Diff engine generates CREATE INDEX CONCURRENTLY in migration plan
  • Fix TestApplyCommand_TransactionRollback
    • Generate valid plan from valid desired state
    • Manually inject failing SQL into plan JSON
    • Apply modified plan via --plan flag to test rollback
  • Fix TestIgnoreIntegration/plan
    • Add user_status type definition (required for valid SQL)
  • Fix dependency test data ordering
    • testdata/diff/dependency/function_to_trigger/new.sql: function before trigger
    • testdata/diff/dependency/table_to_function/new.sql: table before function
    • testdata/diff/dependency/table_to_table/new.sql: referenced table first
    • User-supplied schemas must have correct dependency ordering
    • Diff engine handles dependency ordering in generated migrations

Known Issues:

  • TestIgnoreIntegration/plan_trigger_on_ignored_table still fails
    • Requires IsExternal flag support in inspector (was parser-only)
    • Needs follow-up to mark ignored tables as external after inspection

Infrastructure:

  • Add PostgreSQL version selection via PGSCHEMA_POSTGRES_VERSION
  • Support versions: 14, 15, 16, 17 (default: 17)
  • Update inspector queries for better compatibility
  • Improve normalization for inspector-parsed schemas

…n command

Replace parser-based approach with embedded PostgreSQL for validating and
inspecting desired state schemas during plan generation. This ensures test
and production code paths are identical, improving reliability and reducing
maintenance burden.

## Key Changes

**Architecture:**
- Remove SQL parser implementation (~3,800 lines)
  - Delete ir/parser.go and ir/parser_test.go
  - Parser was complex, fragile, and diverged from production behavior
- Add embedded PostgreSQL for plan generation
  - New cmd/util/embedded_postgres.go with lifecycle management
  - Temporary instances with unique runtime paths and dynamic ports
  - Automatic cleanup on completion
- Plan generation now:
  1. Starts temporary embedded PostgreSQL instance
  2. Applies desired state SQL to embedded PostgreSQL
  3. Inspects embedded PostgreSQL using existing inspector code
  4. Generates diff between current DB and desired state IR
  5. Cleans up embedded PostgreSQL

**Benefits:**
- Test and production use identical code path (inspector)
- No parser/inspector parity issues
- Validates desired state SQL is actually valid PostgreSQL
- Catches SQL errors during plan generation instead of apply
- Removes ~5,300 lines of parser code and tests

**Test Fixes:**
- Fix TestApplyCommand_CreateIndexConcurrently
  - Remove CONCURRENTLY from desired state (migration detail, not state)
  - Desired state uses plain CREATE INDEX
  - Diff engine generates CREATE INDEX CONCURRENTLY in migration plan
- Fix TestApplyCommand_TransactionRollback
  - Generate valid plan from valid desired state
  - Manually inject failing SQL into plan JSON
  - Apply modified plan via --plan flag to test rollback
- Fix TestIgnoreIntegration/plan
  - Add user_status type definition (required for valid SQL)
- Fix dependency test data ordering
  - testdata/diff/dependency/function_to_trigger/new.sql: function before trigger
  - testdata/diff/dependency/table_to_function/new.sql: table before function
  - testdata/diff/dependency/table_to_table/new.sql: referenced table first
  - User-supplied schemas must have correct dependency ordering
  - Diff engine handles dependency ordering in generated migrations

**Known Issues:**
- TestIgnoreIntegration/plan_trigger_on_ignored_table still fails
  - Requires IsExternal flag support in inspector (was parser-only)
  - Needs follow-up to mark ignored tables as external after inspection

**Infrastructure:**
- Add PostgreSQL version selection via PGSCHEMA_POSTGRES_VERSION
- Support versions: 14, 15, 16, 17 (default: 17)
- Update inspector queries for better compatibility
- Improve normalization for inspector-parsed schemas
Copilot AI review requested due to automatic review settings October 21, 2025 10:40
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the plan generation process to use embedded PostgreSQL for validating and inspecting desired state schemas, replacing the previous SQL parser-based approach. This ensures test and production code paths are identical by having both desired and current states extracted through database inspection.

Key changes:

  • Removes SQL parser implementation (~3,800 lines) and replaces it with embedded PostgreSQL for plan generation
  • Adds embedded PostgreSQL lifecycle management with automatic cleanup
  • Updates test data files to fix dependency ordering issues and adjust for inspector-based schema extraction
  • Adds shared test containers to improve test performance

Reviewed Changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
testdata/diff/migrate/v5/new.sql Reorders trigger definition to appear after the table it references
testdata/diff/migrate/v4/diff.sql Updates view definitions with explicit type casts from inspector output
testdata/diff/dependency/table_to_table/new.sql Reorders tables to define referenced table first
testdata/diff/dependency/table_to_function/new.sql Reorders to define table before function that references it
testdata/diff/dependency/function_to_trigger/new.sql Reorders to define function before trigger that uses it
testdata/diff/create_view/*/diff.sql Updates view definitions with explicit type casts and formatting from inspector
testdata/diff/create_trigger/*/diff.sql Updates trigger WHEN clauses with additional parentheses from inspector
testdata/diff/create_table/add_table_like_forward_ref/new.sql Reorders tables as forward references not supported with embedded postgres
testdata/diff/create_table/add_table_like/diff.sql Updates constraint and comment names from inspector output
testdata/diff/create_table/add_column_generated/diff.sql Updates generated column expressions with explicit casts
testdata/diff/create_materialized_view/*/diff.sql Updates materialized view definitions with explicit type casts
ir/testutil.go Adds shared test container support and ParseSQLForTest helper
ir/queries/queries.sql.go Updates trigger query to fetch additional metadata from pg_catalog
ir/queries/queries.sql Updates trigger query SQL to include more trigger properties
ir/parser_test.go Deletes entire parser test file (~1,500 lines)
ir/normalize.go Adds documentation explaining historical context of normalization
ir/inspector.go Rewrites trigger building to decode pg_trigger catalog data directly
internal/plan/plan_test.go Adds TestMain for shared container and updates parseSQL to use embedded postgres
internal/diff/diff_test.go Adds TestMain for shared container and updates parseSQL to use embedded postgres
cmd/util/postgres_version_test.go Adds tests for PostgreSQL version detection and mapping
cmd/util/postgres_version.go Adds PostgreSQL version detection and mapping functions
cmd/util/embedded_postgres_test_helper.go Adds helper for setting up shared embedded postgres in tests
cmd/util/embedded_postgres_test.go Adds comprehensive tests for embedded PostgreSQL functionality
cmd/util/embedded_postgres.go Adds embedded PostgreSQL lifecycle management
cmd/plan/plan.go Updates plan generation to use embedded PostgreSQL instead of parser
cmd/migrate_integration_test.go Adds TestMain with shared embedded postgres instance
cmd/ignore_integration_test.go Adds missing type definition and external table for tests
cmd/apply/apply_integration_test.go Fixes rollback test to inject failing SQL into valid plan
CLAUDE.md Updates architecture documentation to reflect parser removal

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@tianzhou tianzhou requested a review from Copilot October 21, 2025 17:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 75 out of 76 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

// Extract test name and create unique runtime path
testName := strings.ReplaceAll(t.Name(), "/", "_") // Replace slashes for subtest names
timestamp := time.Now().Format("20060102_150405_999999")
timestamp := time.Now().Format("20060102_150405.000000000")
Copy link

Copilot AI Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The timestamp format change from milliseconds to nanoseconds improves uniqueness, but the format string uses '.' instead of '_' as separator which is inconsistent with the date/time portion. Consider using '20060102_150405_000000000' for consistency, or document why the dot separator was chosen.

Suggested change
timestamp := time.Now().Format("20060102_150405.000000000")
timestamp := time.Now().Format("20060102_150405_000000000")

Copilot uses AI. Check for mistakes.
if t != nil {
testName = strings.ReplaceAll(t.Name(), "/", "_") // Replace slashes for subtest names
}
timestamp := time.Now().Format("20060102_150405.000000000")
Copy link

Copilot AI Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Same timestamp format inconsistency as in testutil/postgres.go - uses '.' separator instead of '_' which is inconsistent with the rest of the timestamp format.

Suggested change
timestamp := time.Now().Format("20060102_150405.000000000")
timestamp := time.Now().Format("20060102_150405_000000000")

Copilot uses AI. Check for mistakes.
log := logger.Get()

// Create unique runtime path with timestamp (using nanoseconds for uniqueness)
timestamp := time.Now().Format("20060102_150405.000000000")
Copy link

Copilot AI Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Same timestamp format inconsistency - uses '.' separator instead of '_' which is inconsistent with other parts of the timestamp.

Suggested change
timestamp := time.Now().Format("20060102_150405.000000000")
timestamp := time.Now().Format("20060102_150405_000000000")

Copilot uses AI. Check for mistakes.
@tianzhou tianzhou merged commit aa80712 into main Oct 21, 2025
2 checks passed
@tianzhou
Copy link
Contributor Author

Fix #103

@tianzhou
Copy link
Contributor Author

Fix #106

@tianzhou
Copy link
Contributor Author

Fix #104

@tianzhou tianzhou deleted the use_embeded_postgres branch October 23, 2025 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant