Skip to content

Conversation

@tianzhou
Copy link
Contributor

Previously, pgschema stripped all type casts from string literals in function parameter defaults, including semantically required ones like '1 year'::interval. This caused generated SQL to fail with: "ERROR: operator is not unique: date + unknown"

Changes:

  • Modified normalize.go to only strip truly redundant casts (text, varchar, char, json, jsonb) while preserving required ones (interval, date, custom enums, etc.)
  • Extended stripSchemaPrefix() to handle type casts within expressions
  • Added stripTempSchemaPrefix() for temporary embedded postgres schemas
  • Updated test files to include interval parameter and preserve custom enum type casts

Fixes #216

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings December 29, 2025 08:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical bug where interval type casts were incorrectly stripped from function parameter defaults, causing SQL execution errors. The fix modifies the normalization logic to only strip truly redundant type casts while preserving semantically required ones.

  • Modified ir/normalize.go to selectively strip only redundant type casts (text, varchar, char, json, jsonb) while preserving required ones (interval, date, custom enums)
  • Added helper functions in internal/diff/table.go to handle schema prefix stripping in type casts and temporary schema cleanup
  • Updated test data to verify interval parameters and custom enum type casts are correctly preserved

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
ir/normalize.go Updated regex pattern to only strip redundant type casts, preserving interval and custom enum casts
internal/diff/table.go Added stripSchemaPrefix extension for type casts in expressions and new stripTempSchemaPrefix helper
internal/diff/function.go Integrated schema prefix stripping for function parameter default values
testdata/diff/migrate/v5/diff.sql Test case preserving custom enum type cast in ALTER TABLE
testdata/diff/create_table/alter_column_types/diff.sql Test case preserving action_type enum cast in SET DEFAULT
testdata/diff/create_table/add_column_cross_schema_custom_type/diff.sql Test case preserving cross-schema custom enum cast
testdata/diff/create_function/alter_function_same_signature/. Test cases preserving custom enum casts in function parameters
testdata/diff/create_function/add_function/. Test cases adding interval parameter to reproduce issue #216

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 12 to +58
func stripSchemaPrefix(typeName, targetSchema string) string {
if typeName == "" || targetSchema == "" {
return typeName
}

// Check if the type has the target schema prefix
// Check if the type has the target schema prefix at the beginning
prefix := targetSchema + "."
if after, found := strings.CutPrefix(typeName, prefix); found {
return after
}

// Also handle type casts within expressions: ::schema.typename -> ::typename
// This is needed for function parameter default values like 'value'::schema.enum_type
castPrefix := "::" + targetSchema + "."
if strings.Contains(typeName, castPrefix) {
return strings.ReplaceAll(typeName, castPrefix, "::")
}

return typeName
}

// stripTempSchemaPrefix removes temporary embedded postgres schema prefixes (pgschema_tmp_*).
// These are used internally during plan generation and should not appear in output DDL.
func stripTempSchemaPrefix(value string) string {
if value == "" {
return value
}

// Pattern: ::pgschema_tmp_YYYYMMDD_HHMMSS_XXXXXXXX.typename -> ::typename
// We look for ::pgschema_tmp_ followed by anything until the next dot
idx := strings.Index(value, "::pgschema_tmp_")
if idx == -1 {
return value
}

// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}

// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
return strings.ReplaceAll(value, prefix, "::")
}
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new stripSchemaPrefix and stripTempSchemaPrefix helper functions lack unit tests. Given the complexity of these functions and the edge cases they need to handle (multiple occurrences, off-by-one errors, nested expressions), comprehensive unit tests would help ensure correctness and prevent regressions.

Consider adding unit tests that cover:

  • Simple type name prefix stripping
  • Type casts within expressions
  • Multiple occurrences of schema prefixes
  • Edge cases like empty strings, missing dots, etc.
  • Temporary schema prefix patterns with various timestamp formats

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +56
idx := strings.Index(value, "::pgschema_tmp_")
if idx == -1 {
return value
}

// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}

// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded offset of 15 in value[idx+15:] assumes the literal string "::pgschema_tmp_" is exactly 15 characters. However, "::pgschema_tmp_" is actually 16 characters (2 colons + "pgschema_tmp_" which is 14 characters). This off-by-one error means the function is starting its search for the dot one character too early, which could cause it to miss or incorrectly handle the schema prefix.

The correct offset should be 16, not 15. Also consider using len("::pgschema_tmp_") instead of a magic number for better maintainability.

Suggested change
idx := strings.Index(value, "::pgschema_tmp_")
if idx == -1 {
return value
}
// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}
// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
tempPrefix := "::pgschema_tmp_"
idx := strings.Index(value, tempPrefix)
if idx == -1 {
return value
}
// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+len(tempPrefix):], ".")
if dotIdx == -1 {
return value
}
// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+len(tempPrefix)+dotIdx+1] // includes the trailing dot

Copilot uses AI. Check for mistakes.
Comment on lines +50 to +56
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}

// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same off-by-one error exists here. The calculation idx+15+dotIdx+1 uses 15 instead of the correct length of "::pgschema_tmp_" which is 16. This will cause the prefix extraction to include one extra character before the "::" or exclude one character that should be included.

The correct calculation should be idx + len("::pgschema_tmp_") + dotIdx + 1 or idx + 16 + dotIdx + 1.

Suggested change
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}
// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
dotIdx := strings.Index(value[idx+len("::pgschema_tmp_"):], ".")
if dotIdx == -1 {
return value
}
// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+len("::pgschema_tmp_")+dotIdx+1] // includes the trailing dot

Copilot uses AI. Check for mistakes.
ir/normalize.go Outdated
//
// Pattern matches redundant text/varchar/char/json casts (including arrays)
// Note: jsonb must come before json to avoid partial match
re = regexp.MustCompile(`('(?:[^']|'')*')::(text|character varying|character|bpchar|varchar|jsonb|json)(?:\[\])?`)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern for redundant type casts uses a non-capturing group for the array brackets (?:\[\])?, but this pattern will only match exactly [] and won't match multi-dimensional arrays like [][]. PostgreSQL supports multi-dimensional arrays, so a cast like '{}'::text[][] would not be properly handled by this pattern.

Consider using (?:\[\])* instead of (?:\[\])? to match zero or more array dimensions, or (?:\[\]+) to match one or more pairs of brackets.

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +29
if strings.Contains(typeName, castPrefix) {
return strings.ReplaceAll(typeName, castPrefix, "::")
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using strings.ReplaceAll will replace all occurrences of the schema prefix in the type name, which could cause issues if the schema name appears in other parts of the expression. For example, if targetSchema is "public" and typeName is "CASE WHEN public.status = 'active'::public.status_enum THEN 'yes'::public.yes_no END", this would replace all instances of "::public." with "::", resulting in malformed SQL.

Consider using a more targeted approach that only replaces the specific occurrence related to the type cast, or add additional validation to ensure you're only replacing type cast contexts.

Copilot uses AI. Check for mistakes.
Comment on lines +43 to +57
// We look for ::pgschema_tmp_ followed by anything until the next dot
idx := strings.Index(value, "::pgschema_tmp_")
if idx == -1 {
return value
}

// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}

// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
return strings.ReplaceAll(value, prefix, "::")
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the issue in stripSchemaPrefix, using strings.ReplaceAll here could cause problems if there are multiple temporary schema prefixes in the value, or if the extracted prefix accidentally matches other parts of the expression.

Additionally, this function only handles the first occurrence it finds (since it searches from the beginning), but then uses ReplaceAll which will replace all occurrences. This creates an inconsistency: if there are multiple different temporary schema prefixes (e.g., ::pgschema_tmp_20231201_120000_ABC.type1 and ::pgschema_tmp_20231201_130000_XYZ.type2), the function will only extract the prefix for the first one but attempt to replace all occurrences of that prefix, potentially leaving other temporary prefixes in place.

Consider either using strings.Replace(value, prefix, "::", 1) to only replace the first occurrence, or looping to handle all temporary schema prefixes in the value.

Suggested change
// We look for ::pgschema_tmp_ followed by anything until the next dot
idx := strings.Index(value, "::pgschema_tmp_")
if idx == -1 {
return value
}
// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+15:], ".")
if dotIdx == -1 {
return value
}
// Replace ::pgschema_tmp_XXX.typename with ::typename
prefix := value[idx : idx+15+dotIdx+1] // includes the trailing dot
return strings.ReplaceAll(value, prefix, "::")
// We look for ::pgschema_tmp_ followed by anything until the next dot.
const tmpPrefix = "::pgschema_tmp_"
for {
idx := strings.Index(value, tmpPrefix)
if idx == -1 {
break
}
// Find the dot after pgschema_tmp_*
dotIdx := strings.Index(value[idx+len(tmpPrefix):], ".")
if dotIdx == -1 {
// No terminating dot; nothing more to strip safely.
break
}
// Replace this specific ::pgschema_tmp_XXX.typename with ::typename
prefixEnd := idx + len(tmpPrefix) + dotIdx + 1 // includes the trailing dot
prefix := value[idx:prefixEnd]
value = strings.Replace(value, prefix, "::", 1)
}
return value

Copilot uses AI. Check for mistakes.
ir/normalize.go Outdated
Comment on lines 156 to 171
// IMPORTANT: Do NOT remove semantically significant casts like:
// - '1 year'::interval (interval literals REQUIRE the cast)
// - 'value'::my_enum (custom type casts)
// - '2024-01-01'::date (date literals need the cast in expressions)
//
// Pattern matches redundant text/varchar/char/json casts (including arrays)
// Note: jsonb must come before json to avoid partial match
re = regexp.MustCompile(`('(?:[^']|'')*')::(text|character varying|character|bpchar|varchar|jsonb|json)(?:\[\])?`)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states that date literals like '2024-01-01'::date need the cast to be preserved in expressions, but the current regex pattern will strip ::date casts because it's not in the exclusion list. The regex only preserves casts for types NOT in the list (text|character varying|character|bpchar|varchar|jsonb|json).

This means that '2024-01-01'::date will NOT match the pattern and the cast will be preserved, which is correct. However, the comment is somewhat misleading because it suggests this is an explicit exclusion, when in reality it's simply not included in the redundant types list.

Consider clarifying the comment to make it clear that date, interval, and custom enum casts are preserved because they are NOT matched by this pattern, rather than implying they are actively excluded.

Copilot uses AI. Check for mistakes.
@tianzhou tianzhou force-pushed the interval branch 2 times, most recently from 1cc2592 to 732a333 Compare December 29, 2025 08:50
Previously, pgschema stripped all type casts from string literals in
function parameter defaults, including semantically required ones like
'1 year'::interval. This caused generated SQL to fail with:
"ERROR: operator is not unique: date + unknown"

Changes:
- Modified normalize.go to only strip truly redundant casts (text,
  varchar, char, json, jsonb) while preserving required ones (interval,
  date, custom enums, etc.)
- Added temp schema prefix stripping in IR normalization to ensure
  idempotent comparisons between desired and current state
- Extended stripSchemaPrefix() to handle type casts within expressions
- Added stripTempSchemaPrefix() for temporary embedded postgres schemas
- Updated test files to include interval parameter and preserve custom
  enum type casts

Fixes #216

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@tianzhou tianzhou merged commit b05d86a into main Dec 29, 2025
2 checks passed
tianzhou added a commit that referenced this pull request Jan 4, 2026
… (#217)

* fix: preserve interval type casts in function parameter defaults (#216)

Previously, pgschema stripped all type casts from string literals in
function parameter defaults, including semantically required ones like
'1 year'::interval. This caused generated SQL to fail with:
"ERROR: operator is not unique: date + unknown"

Changes:
- Modified normalize.go to only strip truly redundant casts (text,
  varchar, char, json, jsonb) while preserving required ones (interval,
  date, custom enums, etc.)
- Added temp schema prefix stripping in IR normalization to ensure
  idempotent comparisons between desired and current state
- Extended stripSchemaPrefix() to handle type casts within expressions
- Added stripTempSchemaPrefix() for temporary embedded postgres schemas
- Updated test files to include interval parameter and preserve custom
  enum type casts

Fixes #216

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: update schema dump

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
@tianzhou tianzhou deleted the interval branch January 5, 2026 14:12
alecthomas pushed a commit to alecthomas/pgschema that referenced this pull request Jan 26, 2026
…lex#216) (pgplex#217)

* fix: preserve interval type casts in function parameter defaults (pgplex#216)

Previously, pgschema stripped all type casts from string literals in
function parameter defaults, including semantically required ones like
'1 year'::interval. This caused generated SQL to fail with:
"ERROR: operator is not unique: date + unknown"

Changes:
- Modified normalize.go to only strip truly redundant casts (text,
  varchar, char, json, jsonb) while preserving required ones (interval,
  date, custom enums, etc.)
- Added temp schema prefix stripping in IR normalization to ensure
  idempotent comparisons between desired and current state
- Extended stripSchemaPrefix() to handle type casts within expressions
- Added stripTempSchemaPrefix() for temporary embedded postgres schemas
- Updated test files to include interval parameter and preserve custom
  enum type casts

Fixes pgplex#216

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: update schema dump

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Interval type cast stripped from function parameter defaults in migration SQL

1 participant