Skip to content

Conversation

@HavardHomb
Copy link
Contributor

@HavardHomb HavardHomb commented Oct 25, 2025

When SecureZodSchemaParser was introduced for very sensible security reasons, a lot of the existing features disappeared, which made it a not so advanced output parser as it was :)

This pull request extends the SecureZodSchemaParser to support parsing and building Zod union and literal types. In addition support for top level arrays was added back.

Summary by CodeRabbit

  • New Features
    • Top-level arrays, unions, and literals are now accepted and parsed (not just objects).
    • Inline modifiers (defaults, descriptions, etc.) supported on arrays, unions, objects, and literals.
    • Unions and literals are validated and handled as first-class schema types.
    • Public parsing API remains unchanged.

@HenryHengZJ
Copy link
Contributor

can you give some examples where it fails, and this PR fixes it?

@HavardHomb
Copy link
Contributor Author

HavardHomb commented Oct 28, 2025

These do not work and will be fixed

Union in an array

z.object({
  animals: z.array(
    z.union([
      z.object({
          __type: z.literal("Dog"),
          is_barking:z.boolean(),
        }),
        z.object({
          __type: z.literal("Cat"),
          is_meaowing: z.boolean(),
        })
      ])
    )
  )
})

Array as outermost element

z.array(
    z.union([
      z.object({
          __type: z.literal("Dog"),
          is_barking:z.boolean(),
        }),
        z.object({
          __type: z.literal("Cat"),
          is_meaowing: z.boolean(),
        })
      ])
    )
  )

@HenryHengZJ
Copy link
Contributor

HenryHengZJ commented Oct 29, 2025

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Oct 29, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Oct 29, 2025

Walkthrough

The secureZodParser.ts module now supports top-level z.array(...), z.union(...), and z.literal(...) schemas in addition to z.object(...). It introduces separate allowed base types and modifiers, dedicated parsing for unions and literals, unified modifier application, and extended build/validation flows for these types.

Changes

Cohort / File(s) Summary
Extended Zod Schema Parser
packages/components/src/secureZodParser.ts
Replaces single ALLOWED_TYPES with ALLOWED_BASE_TYPES and ALLOWED_MODIFIERS; adds top-level handling for z.array(...), z.union(...), and z.literal(...); introduces parseUnion, parseLiteral, extractUnionWithModifiers, extractLiteralWithModifiers, splitUnionMembers, parseArguments, parseArrayContent, parseObjectLiteral, splitProperties, parseProperty; extends object parsing to accept modifiers and defaults; adds applyModifiers mechanism; updates buildZodSchema/buildZodType to construct unions and literals; expands validateTypeInfo to handle unions and literals.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant Parser as secureZodParser
    participant Router as Type Router
    participant Obj as Object Parser
    participant Arr as Array Parser
    participant Union as Union Parser
    participant Lit as Literal Parser
    participant Builder as Builder
    participant Validator as Validator

    Client->>Parser: parseZodSchema(schema)
    Parser->>Router: detect top-level zod type
    alt z.object(...)
        Router->>Obj: parseObject + extract modifiers
        Obj->>Parser: return TypeInfo
    else z.array(...)
        Router->>Arr: parseArrayContent + modifiers
        Arr->>Parser: return TypeInfo
    else z.union(...)
        Router->>Union: extractUnionWithModifiers
        Union->>Union: splitUnionMembers
        loop for each member
            Union->>Parser: parse member (recursive)
        end
        Union->>Parser: return TypeInfo
    else z.literal(...)
        Router->>Lit: extractLiteralWithModifiers
        Lit->>Parser: return TypeInfo
    end

    Parser->>Builder: buildZodSchema(TypeInfo)
    Builder->>Builder: buildZodType + applyModifiers
    Builder->>Validator: validateTypeInfo / final checks
    Validator->>Client: return validated schema
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay extra attention to:
    • splitUnionMembers and union-member parsing for nested structures and edge cases.
    • Modifier extraction functions (extract*WithModifiers) and applyModifiers for consistent application across types.
    • buildZodType changes for union and literal construction and modifier application.
    • Validation updates in validateTypeInfo for union recursion and literal handling.

Poem

🐰 I hopped through schemas, crisp and spry,
Unions, arrays, literals — oh my!
Modifiers tucked in every nook,
I parsed each branch with a careful look.
Hooray — more types for this curious rabbit to try! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title Check ❓ Inconclusive The PR title "fix: Improvements to SecureZodSchemaParser" mentions the target file/component but uses the non-descriptive term "Improvements" to characterize the changes. While the title is related to the changeset, it lacks specificity about what was actually improved. The raw summary indicates substantial feature additions including support for unions, literals, top-level arrays, and enhanced modifier parsing—all of which are meaningful enhancements to address the lost functionality of SecureZodSchemaParser. However, the title does not convey these specifics and would not clearly communicate the main changes to someone scanning the commit history. Consider revising the title to be more specific about the key changes, such as "Add support for unions and literals in SecureZodSchemaParser" or "Support top-level arrays and unions in SecureZodSchemaParser". This would better communicate the primary objectives of restoring advanced output parsing capabilities while maintaining clarity for reviewers and future maintainers.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
packages/components/src/secureZodParser.ts (2)

63-71: Top-level array/union entry points: good. Consider parity for top-level literal.

Adding z.array(...) and z.union(...) at the root matches PR goals. Consider allowing top‑level z.literal(...) too for completeness, or explicitly document it as unsupported.

Also applies to: 74-75


352-368: Optional: broaden numeric literal support (scientific/decimal variants).

Current regexes only accept plain ints and decimals. If needed, accept 1e-3, .5, 5., etc.

Example tweak:

-        } else if (literalContent.match(/^-?\d+$/)) {
-            // Integer literal
-            literalValue = parseInt(literalContent, 10)
-        } else if (literalContent.match(/^-?\d+\.\d+$/)) {
-            // Float literal
-            literalValue = parseFloat(literalContent)
+        } else if (/^-?(?:\d+\.?\d*|\d*\.\d+)(?:[eE][+-]?\d+)?$/.test(literalContent)) {
+            // Number literal (int/float/scientific)
+            literalValue = Number(literalContent)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f94d61 and 93b8b3e.

📒 Files selected for processing (1)
  • packages/components/src/secureZodParser.ts (7 hunks)
🔇 Additional comments (6)
packages/components/src/secureZodParser.ts (6)

201-209: Early dispatch for union/literal is correct.

Fast‑pathing z.union(...) and z.literal(...) before the generic splitter avoids false splits on dots inside args.


297-335: Union parsing/building looks solid.

Member splitting accounts for nested structures; validation enforces ≥2 members. Build uses a typed tuple and applies modifiers post‑build.


547-553: Union/literal validation branches are correct.

Early returns prevent fall‑through to base/modifier validation.

Also applies to: 555-559


816-821: Top-level non-object dispatch is correct.

Covers arrays (simple/object), unions, literals; routes through buildZodType consistently.


871-894: Union/literal builders are sound; modifiers applied post‑construction.

Tuple-typing for z.union and clean z.literal(value) construction look good.

Also applies to: 896-906


257-295: Nice: array path handles modifiers and nested objects correctly.

extractArrayWithModifiers -> parseArray composition is clean; modifiers reattached and applied in build.

Also applies to: 631-680

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/components/src/secureZodParser.ts (1)

660-709: Critical: Array modifier parsing still uses the broken .split('.') approach.

While extractLiteralWithModifiers and extractUnionWithModifiers were updated to use stateful parsing, extractArrayWithModifiers still uses the problematic .split('.') approach (line 690). This will fail on:

  • .describe("A.B") (dot inside string)
  • .default({ key: "A.B", arr: [1,2] }) (nested braces)

This is inconsistent with the other extractors and breaks the same cases that were addressed elsewhere.

Apply this diff to use stateful parsing like the other extractors:

-        // Parse modifiers
-        const modifiers: any[] = []
-        const modifierParts = remainingPart.substring(1).split('.')
-
-        for (const part of modifierParts) {
-            const modMatch = part.match(/^(\w+)(\(.*\))?$/)
-            if (!modMatch) {
-                throw new Error(`Invalid modifier: ${part}`)
-            }
-
-            const modName = modMatch[1]
-            const modArgs = modMatch[2] ? this.parseArguments(modMatch[2]) : []
-
-            if (!this.ALLOWED_MODIFIERS.includes(modName as any)) {
-                throw new Error(`Unsupported modifier: ${modName}`)
-            }
-
-            modifiers.push({ name: modName, args: modArgs })
-        }
+        // Parse modifiers (handles nested args and quoted strings)
+        const modifiers: any[] = []
+        let i = 1 // skip leading '.'
+        while (i < remainingPart.length) {
+            const nameMatch = remainingPart.substring(i).match(/^(\w+)/)
+            if (!nameMatch) break
+            const modName = nameMatch[1]
+            i += modName.length
+            let modArgs: any[] = []
+            if (i < remainingPart.length && remainingPart[i] === '(') {
+                let depth = 0, inStr = false, strCh = '', start = i
+                for (let j = i; j < remainingPart.length; j++) {
+                    const ch = remainingPart[j]
+                    if (inStr) {
+                        if (ch === strCh && remainingPart[j - 1] !== '\\') inStr = false
+                    } else if (ch === '"' || ch === "'") {
+                        inStr = true; strCh = ch
+                    } else if (ch === '(') depth++
+                    else if (ch === ')') {
+                        depth--
+                        if (depth === 0) {
+                            const argsStr = remainingPart.substring(start, j + 1)
+                            // Allow object defaults and complex args
+                            modArgs = this.parseComplexArguments(argsStr)
+                            i = j + 1
+                            break
+                        }
+                    }
+                }
+            }
+            if (!this.ALLOWED_MODIFIERS.includes(modName as any)) {
+                throw new Error(`Unsupported modifier: ${modName}`)
+            }
+            modifiers.push({ name: modName, args: modArgs })
+            if (i < remainingPart.length && remainingPart[i] === '.') i++
+        }
🧹 Nitpick comments (1)
packages/components/src/secureZodParser.ts (1)

372-559: Consider extracting shared modifier parsing logic.

The stateful modifier parsing code is duplicated across extractLiteralWithModifiers, extractUnionWithModifiers, and (after fixing) extractArrayWithModifiers. Extracting this into a shared helper method would improve maintainability and ensure consistent behavior.

Example refactor:

private static parseModifiersFromString(remainingPart: string): { modifiers: any[]; hasModifiers: boolean } {
    if (!remainingPart.startsWith('.')) {
        return { modifiers: [], hasModifiers: false }
    }
    
    const modifiers: any[] = []
    let i = 1 // skip leading '.'
    while (i < remainingPart.length) {
        const nameMatch = remainingPart.substring(i).match(/^(\w+)/)
        if (!nameMatch) break
        const modName = nameMatch[1]
        i += modName.length
        let modArgs: any[] = []
        if (i < remainingPart.length && remainingPart[i] === '(') {
            let depth = 0, inStr = false, strCh = '', start = i
            for (let j = i; j < remainingPart.length; j++) {
                const ch = remainingPart[j]
                if (inStr) {
                    if (ch === strCh && remainingPart[j - 1] !== '\\') inStr = false
                } else if (ch === '"' || ch === "'") {
                    inStr = true; strCh = ch
                } else if (ch === '(') depth++
                else if (ch === ')') {
                    depth--
                    if (depth === 0) {
                        const argsStr = remainingPart.substring(start, j + 1)
                        modArgs = this.parseComplexArguments(argsStr)
                        i = j + 1
                        break
                    }
                }
            }
        }
        if (!this.ALLOWED_MODIFIERS.includes(modName as any)) {
            throw new Error(`Unsupported modifier: ${modName}`)
        }
        modifiers.push({ name: modName, args: modArgs })
        if (i < remainingPart.length && remainingPart[i] === '.') i++
    }
    
    return { modifiers, hasModifiers: modifiers.length > 0 }
}

Then each extractor would call:

const result = this.parseModifiersFromString(remainingPart)
return { [typePart]: ..., ...result }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93b8b3e and 1787e2a.

📒 Files selected for processing (1)
  • packages/components/src/secureZodParser.ts (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: build (ubuntu-latest, 18.15.0)
🔇 Additional comments (9)
packages/components/src/secureZodParser.ts (9)

7-10: LGTM! Base types and modifiers properly separated.

The separation of ALLOWED_BASE_TYPES and ALLOWED_MODIFIERS addresses the previous review feedback and correctly distinguishes between base primitives and chainable modifiers. The special cases (object, array, union, literal) are appropriately handled elsewhere.


50-62: LGTM! Top-level array and union support enabled.

The early routing for z.array() and z.union() correctly enables the top-level array and union schemas mentioned in the PR objectives. The updated error message accurately reflects the expanded capabilities.


284-322: LGTM! Union parsing implementation is sound.

The parseUnion method correctly:

  • Validates the z.union([...]) syntax
  • Enforces the minimum 2-member requirement
  • Recursively parses and validates each union member
  • Supports modifiers on the union type

324-370: LGTM! Literal parsing implementation is comprehensive.

The parseLiteral method correctly handles string (both quote styles), boolean, integer, and float literals. The parsing logic is sound and properly supports modifiers on literal types.


481-559: LGTM! Union modifier extraction uses proper stateful parsing.

The stateful parsing correctly handles nested structures in union modifiers, addressing previous review feedback. The logic for tracking bracket and parenthesis depth to find the union end is sound.

Note: The same escape handling edge case mentioned for extractLiteralWithModifiers applies here at line 534.


576-587: LGTM! Union and literal validation properly integrated.

The validation logic correctly handles the new types:

  • Unions are recursively validated for each member
  • Literals require no further validation beyond parsing

This is consistent with the existing validation patterns for nested objects and arrays.


845-859: LGTM! Top-level schema routing handles new types correctly.

The updated buildZodSchema now correctly routes non-object top-level schemas (arrays, unions, literals) to buildZodType, while maintaining the original behavior for object schemas. This enables the use cases described in the PR objectives.


900-935: LGTM! Union and literal building logic is correct.

The building logic for the new types is sound:

  • Union creation uses proper tuple typing [ZodTypeAny, ZodTypeAny, ...ZodTypeAny[]] required by Zod's API
  • Union members are recursively built and validated for the 2-member minimum
  • Literal building is straightforward with z.literal(value)
  • Both correctly apply modifiers via the unified applyModifiers mechanism

971-1019: LGTM! Unified modifier application is robust.

The applyModifiers method correctly applies modifiers across all supported types with appropriate type guards (e.g., .int() only on ZodNumber). The implementation handles edge cases like max/min on both strings and arrays.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/components/src/secureZodParser.ts (2)

103-103: Consider fixing escape handling in splitProperties for consistency.

This method also uses the old escape check pattern content[i - 1] !== '\\' at line 103. For consistency with the fixes in extractLiteralWithModifiers and extractUnionWithModifiers, consider applying the same backslash-counting approach here.


654-654: Consider fixing escape handling in parseArrayContent for consistency.

This method also uses the old escape check pattern content[i - 1] !== '\\' at line 654. For consistency with the fixes in extractLiteralWithModifiers and extractUnionWithModifiers, consider applying the same backslash-counting approach here.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1787e2a and 8ae1691.

📒 Files selected for processing (1)
  • packages/components/src/secureZodParser.ts (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build (ubuntu-latest, 18.15.0)
  • GitHub Check: build
🔇 Additional comments (10)
packages/components/src/secureZodParser.ts (10)

7-10: Excellent separation of base types and modifiers!

This addresses the previous review feedback perfectly. The separation clarifies that:

  • Base types are primitives handled by the buildZodType switch
  • Modifiers are safe chaining methods
  • Special types (object, array, union, literal) are handled via dedicated parsing paths with flags

50-62: Top-level array and union support looks good.

This correctly restores the advanced parsing capabilities mentioned in the PR objectives, allowing schemas like:

  • z.array(z.union([...]))
  • z.union([z.object({...}), z.object({...})])

The routing through parseZodType ensures consistent validation and building.


284-322: Union parsing implementation is solid.

The method correctly:

  • Extracts and handles modifiers
  • Validates the minimum member count (2) required by Zod
  • Recursively parses and validates each member
  • Preserves modifiers for later application

324-370: Literal parsing handles common cases correctly.

The implementation correctly handles:

  • String literals (both quote styles)
  • Boolean literals
  • Integer and float literals with proper sign handling

Note: Scientific notation (e.g., 1e10, 1.5e-3) is not supported, but this is acceptable for the current scope as it's a relatively uncommon use case in schema definitions.


372-448: Good implementation with improved escape handling.

The method correctly addresses the previous review feedback by counting consecutive backslashes to determine if a quote is escaped. This handles edge cases like "test\\" correctly.


487-571: Union modifier extraction is well implemented.

The method correctly:

  • Tracks nested bracket/parenthesis depth to find the union boundary
  • Implements the improved escape handling with backslash counting (lines 540-546)
  • Validates modifiers against the allowed list
  • Handles complex nested arguments

588-599: Validation logic for unions and literals is correct.

The implementation properly:

  • Recursively validates each union member
  • Recognizes that literal values are self-contained and don't require type validation
  • Returns early to avoid unnecessary checks

602-609: Validation checks correctly updated.

These changes properly use the new separated constants (ALLOWED_BASE_TYPES and ALLOWED_MODIFIERS) introduced at the top of the class.


857-871: Top-level schema handling is correct.

The method now properly routes non-object top-level schemas (arrays, unions, literals) to buildZodType while preserving the original object-building logic. This enables the use cases from the PR objectives.


912-947: Union and literal building is correctly implemented.

The implementation:

  • Properly constructs unions with the required tuple typing for TypeScript/Zod
  • Validates the minimum member count (consistent with parseUnion)
  • Correctly builds literal types with their values
  • Applies modifiers to both unions and literals (e.g., .optional(), .describe())

Comment on lines +450 to +485
private static splitUnionMembers(content: string): string[] {
const members: string[] = []
let current = ''
let depth = 0
let inString = false
let stringChar = ''

for (let i = 0; i < content.length; i++) {
const char = content[i]

if (!inString && (char === '"' || char === "'")) {
inString = true
stringChar = char
} else if (inString && char === stringChar && content[i - 1] !== '\\') {
inString = false
} else if (!inString) {
if (char === '(' || char === '[' || char === '{') {
depth++
} else if (char === ')' || char === ']' || char === '}') {
depth--
} else if (char === ',' && depth === 0) {
members.push(current.trim())
current = ''
continue
}
}

current += char
}

if (current.trim()) {
members.push(current.trim())
}

return members
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix escape handling inconsistency.

Line 463 uses the old escape check pattern content[i - 1] !== '\\' that was flagged in the previous review. This doesn't handle escaped backslashes correctly (e.g., strings ending with \\"). While this pattern was fixed in extractLiteralWithModifiers (lines 417-423) and extractUnionWithModifiers (lines 540-546), it remains inconsistent here.

Apply this fix to match the improved escape handling:

             } else if (inString && char === stringChar && content[i - 1] !== '\\') {
-                inString = false
+                // Count consecutive backslashes before the quote
+                let backslashCount = 0
+                for (let k = i - 1; k >= 0 && content[k] === '\\'; k--) {
+                    backslashCount++
+                }
+                // If even number of backslashes (including 0), the quote is not escaped
+                if (backslashCount % 2 === 0) {
+                    inString = false
+                }

Note: The same fix should be applied to splitProperties (line 103) and parseArrayContent (line 654) for consistency. Based on learnings

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/components/src/secureZodParser.ts around lines 450 to 485, the
string escape check in splitUnionMembers uses content[i - 1] !== '\\' which
fails for sequences like \\\"; update the escape handling to mirror the fixed
pattern used elsewhere: compute the number of consecutive backslashes
immediately preceding the current quote and treat the quote as escaped if that
count is odd (e.g., count backward from i-1 while chars are '\\', then if count
% 2 === 1 consider it escaped). Replace the simple previous-char check with this
parity-based check, and apply the same change to splitProperties (around line
103) and parseArrayContent (around line 654) for consistency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants