Skip to content

Conversation

@p-c-h-b
Copy link
Contributor

@p-c-h-b p-c-h-b commented Nov 8, 2025

This fixes a bug where FK-heavy tables could be reordered alphabetically whenever Kahn's algorithm encountered a cycle in the dependency graph. This caused child tables to be created before parent tables, leading to "relation does not exist" errors.

Problem

When the topological sort encountered a cycle (e.g., two tables with mutual FK references), it would fall back to alphabetical sorting. This meant that tables like adjustment could be created before review, even though adjustment has a foreign key to review.

Solution

  1. Deterministic cycle breaking: Instead of falling back to alphabetical order, the algorithm now breaks cycles by selecting the next unprocessed table in the original insertion order.

  2. Deferred FK constraints: FK constraints that reference tables that haven't been created yet are now deferred and added via ALTER TABLE after all tables exist.

Changes

  • Modified topologicallySortTables() and topologicallySortViews() to track insertion order and use it for deterministic cycle breaking
  • Added deferredConstraint type to track FK constraints that need to be applied later
  • Modified generateCreateTablesSQL() to return deferred policies and constraints
  • Added generateDeferredConstraintsSQL() to apply FK constraints after all tables are created
  • Added shouldDeferConstraint() helper to determine if an FK should be deferred
  • Added test case TestTopologicallySortTablesHandlesCycles() covering cyclic dependencies

Test Case

The new test verifies:

  • Tables with linear dependencies are ordered correctly (a → b → c)
  • Tables with cyclic dependencies are handled without errors (x ↔ y)
  • Tables depending on cycles are still ordered after the cycle (z depends on y)
  • Cycle members are ordered deterministically

Fixes #155 (partially - this is the first of three fixes mentioned in the issue)

This fixes a bug where FK-heavy tables could be reordered alphabetically
whenever Kahn's algorithm encountered a cycle in the dependency graph.
This caused child tables to be created before parent tables, leading to
'relation does not exist' errors.

Changes:
- Modified topological sort to break cycles deterministically based on
  source order instead of falling back to alphabetical sorting
- Added deferred FK constraint handling in table creation
- FK constraints are now added via ALTER TABLE after all tables exist
- Added test case covering cyclic dependencies

Fixes pgplex#155
Copilot AI review requested due to automatic review settings November 8, 2025 03:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements cycle handling in topological sorting for tables and views, and introduces deferred foreign key constraint creation to support circular dependencies between tables.

  • Refactored topological sort to handle cycles by using insertion order as a deterministic tie-breaker
  • Added deferred constraint mechanism to apply foreign keys after all referenced tables are created
  • Deferred RLS policy creation until after functions/procedures exist to satisfy potential dependencies

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
internal/diff/topological.go Updated Kahn's algorithm to detect cycles and break them deterministically using insertion order; added nextInOrder helper function
internal/diff/topological_test.go Added test case verifying cycle handling and deterministic ordering in topological sort
internal/diff/table.go Introduced deferred constraint mechanism to handle foreign keys referencing not-yet-created tables; added generateDeferredConstraintsSQL and shouldDeferConstraint functions; modified generateTableSQL to return deferred constraints
internal/diff/diff.go Updated CREATE statement generation order to apply deferred constraints after table creation and deferred policies after functions/procedures

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Add validation for empty constraint names in generateDeferredConstraintsSQL
- Fix misleading comment about ordering strategy (insertion order, not alphabetical)
- Update migrate_v5 test fixtures to reflect deferred FK constraints

The test fixtures now correctly show FK constraints being added via
ALTER TABLE after table creation, which is the expected behavior.
The FK constraint is now a separate step (table.constraint) instead
of being inlined in the CREATE TABLE statement.
Add the constraint to the summary and show it as an ALTER TABLE
statement in the DDL section.
Copy link
Contributor Author

@p-c-h-b p-c-h-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both suggestions have been addressed in commit 5366345:

  1. Added validation for empty constraint names in generateDeferredConstraintsSQL
  2. Fixed the comment to accurately reflect insertion order instead of alphabetical order

- Update plan.txt to show constraint before indexes (matches actual output)
- Update employee_expected.sql to reflect deferred policy creation

Policies now appear earlier in dump output due to deferred creation
after functions/procedures.
Policies now appear earlier in dump output (line 47 vs 217) due to
deferred policy creation changes in FK constraint ordering fix.
@tianzhou tianzhou requested a review from Copilot November 8, 2025 03:43
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@tianzhou
Copy link
Contributor

tianzhou commented Nov 8, 2025

  1. Let's also enhance the test case https://github.com/pgschema/pgschema/tree/main/testdata/diff/dependency/table_to_table to validate this fix.

  2. pls remove xxx_actual.sql, xxx_expected.sql as those are test artifacts when the test fails.

Updated the dependency/table_to_table test case to include a circular
foreign key dependency between departments and users tables:
- departments.manager_id references users.id
- users.department_id references departments.id

This validates that the FK constraint deferral fix properly handles
circular dependencies by deferring the constraint that would create
a cycle (departments_manager_id_fkey) to an ALTER TABLE statement.
@p-c-h-b
Copy link
Contributor Author

p-c-h-b commented Nov 8, 2025

Thanks for the feedback! I've addressed both comments:

  1. Enhanced the testdata/diff/dependency/table_to_table test case to validate the FK constraint deferral fix. The test now includes a circular foreign key dependency:

    • departments.manager_idusers.id
    • users.department_iddepartments.id

    This validates that constraints creating cycles are properly deferred to ALTER TABLE statements while non-cyclic constraints can remain inline.

  2. Removed all test artifact files (*_actual.sql and *_expected.sql).

All tests now pass.

@tianzhou
Copy link
Contributor

tianzhou commented Nov 9, 2025

@p-c-h-b have you forgot to push the update?

1. Remove test artifact files (*_actual.sql, *_actual.txt)

2. Fix employee_status_log FK constraint - move back inline since there's
   no circular dependency (employee_status_log → employee is one-way)

3. Fix root cause: Update shouldDeferConstraint() to check existing tables
   from old schema, not just newly created tables. FK constraints are now
   only deferred when the referenced table truly doesn't exist yet.

The table_to_table test demonstrates circular FK dependencies still work.
The topological sort was using insertion order as a tiebreaker for cycles,
but the insertion order itself was random due to Go's map iteration.

This caused CI to produce different (but equally valid) table orderings
than local runs, breaking the integration tests.

Fix by pre-sorting tables alphabetically before topological sort.
@p-c-h-b
Copy link
Contributor Author

p-c-h-b commented Nov 9, 2025

Rookie mistake! Should be good now.

@tianzhou tianzhou requested a review from Copilot November 10, 2025 02:55
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@tianzhou tianzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there.

ALTER TABLE audit ENABLE ROW LEVEL SECURITY;

--
-- Name: audit_insert_system; Type: POLICY; Schema: -; Owner: -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CRATE POLICY should be co-located with RLS in the above

@@ -0,0 +1,250 @@
--
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please exclude this from the PR

Fix policy ordering to emit CREATE POLICY statements immediately after
ALTER TABLE...ENABLE ROW LEVEL SECURITY for better readability and
logical grouping. Previously policies were deferred until after
functions/procedures, separating them from their RLS enable statements.

Changes:
- Policies now created immediately after RLS enable per table
- Removed deferred policy collection and late emission logic
- Updated employee test fixture with correct policy ordering
- Removed test artifact file (cmd/dump/employee_expected.sql)

This addresses tianzhou's feedback requesting policies be positioned
alongside their RLS statements rather than separated.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return lookup
}

var functionCallRegex = regexp.MustCompile(`(?i)([a-z_][a-z0-9_$.]*)\s*\(`)
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern allows $ as part of identifier characters, but PostgreSQL identifiers cannot contain $ in the middle. The pattern [a-z0-9_$.]* will incorrectly match things like foo$bar(. PostgreSQL only allows $ in quoted identifiers or at the start of dollar-quoted strings.

Consider using [a-z_][a-z0-9_]*(?:\.[a-z_][a-z0-9_]*)* to match schema-qualified identifiers correctly, or handle qualified names explicitly.

Suggested change
var functionCallRegex = regexp.MustCompile(`(?i)([a-z_][a-z0-9_$.]*)\s*\(`)
var functionCallRegex = regexp.MustCompile(`(?i)([a-z_][a-z0-9_]*(?:\.[a-z_][a-z0-9_]*)*)\s*\(`)

Copilot uses AI. Check for mistakes.
if next == "" {
break
}
queue = append(queue, next)
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting inDegree[next] = 0 when breaking a cycle may cause confusion as it doesn't reflect the actual dependency count. When processing neighbors on line 91-92, the check inDegree[neighbor] <= 0 could add tables with unresolved dependencies prematurely.

Consider commenting why this is safe (because we're breaking a cycle intentionally), or tracking which edges should be ignored due to cycle breaking rather than modifying in-degrees.

Suggested change
queue = append(queue, next)
queue = append(queue, next)
// Intentionally setting inDegree[next] = 0 to break cycles.
// This is safe because we are forcibly processing this node to ensure progress,
// even though its actual dependency count may be nonzero.

Copilot uses AI. Check for mistakes.
Comment on lines +573 to +576
refSchema := constraint.ReferencedSchema
if refSchema == "" {
refSchema = constraint.Schema
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ReferencedSchema is empty, the code falls back to constraint.Schema. While constraint.Schema should equal the table's schema, it would be more explicit and consistent to pass the table parameter to this function and use table.Schema directly as the fallback. This ensures the referenced table defaults to the same schema as the current table being created, even if constraint.Schema is somehow not populated correctly.

Consider changing the function signature to:

func shouldDeferConstraint(constraint *ir.Constraint, table *ir.Table, createdTables map[string]bool, existingTables map[string]bool) bool

And using table.Schema on line 575 instead of constraint.Schema.

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

@tianzhou tianzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall

@tianzhou tianzhou merged commit c425709 into pgplex:main Nov 11, 2025
@tianzhou
Copy link
Contributor

Follow up #162

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix dependency handling: FK ordering, schema normalization, and function replacements

2 participants