Skip to content

Conversation

@kdy1
Copy link
Member

@kdy1 kdy1 commented Nov 19, 2025

No description provided.

@kdy1 kdy1 added this to the Planned milestone Nov 19, 2025
@kdy1 kdy1 self-assigned this Nov 19, 2025
Copilot AI review requested due to automatic review settings November 19, 2025 16:11
@kdy1 kdy1 requested review from a team as code owners November 19, 2025 16:11
@changeset-bot
Copy link

changeset-bot bot commented Nov 19, 2025

⚠️ No Changeset found

Latest commit: 68be702

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review for PR #11307

Summary

This PR refactors the regexp compatibility pass by merging it into the Transformer architecture. The overall approach is sound and aligns with performance optimization goals, but there is one critical bug that needs to be fixed.


🐛 Critical Bug

Location: crates/swc_ecma_compat_common/src/regexp.rs:14

unicode_sets_regex: config.unicode_regex,  // ❌ BUG: Should be config.unicode_sets_regex

This line incorrectly assigns config.unicode_regex to unicode_sets_regex. It should be:

unicode_sets_regex: config.unicode_sets_regex,

Impact: The unicode_sets_regex configuration option will not work correctly - it will always mirror the unicode_regex setting instead of using its own value. This means the v flag (unicode sets) will be incorrectly handled based on the u flag configuration.


✅ Code Quality & Best Practices

Positive aspects:

  1. Performance-focused refactoring - Merging passes reduces AST traversals, aligning with CLAUDE.md guideline ecmascript parser #1
  2. Good code organization - The new architecture cleanly separates concerns between swc_ecma_compat_common (public API) and swc_ecma_transformer (implementation)
  3. Documentation preserved - All TC39 proposal links and feature documentation remain intact
  4. Logic preservation - The transformation logic is correctly moved from VisitMut to VisitMutHook::exit_expr

Architecture improvements:

  • Using the hook-based system allows better composition with other transformations
  • The hook() function returning Option allows early bailout when no transformations are needed (line 29-42 in regexp.rs)
  • Clean integration through the builder pattern in lib.rs:45

🔍 Code Analysis

Dependency changes:

  • Adding swc_ecma_transformer as a dependency to swc_ecma_compat_common creates the right direction of dependencies
  • The circular dependency is avoided since swc_ecma_transformer uses swc_ecma_compat_common indirectly

Implementation details:

  1. Line 58-59 in new regexp.rs: The parenthesization logic for lookbehind is correct but could be more explicit
  2. The wildcard imports (use swc_ecma_ast::*;) follow the existing codebase pattern
  3. Using Take trait correctly handles ownership transfer (line 63)

🧪 Test Coverage

Concern: No test modifications or additions were made in this PR.

Recommendation:

  • Verify that existing integration tests cover the regexp transformation
  • Add a specific test case for unicode_sets_regex to prevent regressions (especially given the bug found)
  • Ensure tests validate that all 8 configuration flags work independently

🔒 Security Considerations

No security concerns identified. The transformation logic remains the same, only the architecture changed.


📊 Performance Considerations

Positive:

  • Reduces the number of AST passes by merging regexp into the main transformer
  • The early return optimization in hook() (lines 30-42) prevents unnecessary work
  • Uses chain_optional() which allows skipping the entire hook when not needed

Note: The actual performance benefit should be measurable in the benchmarks. Consider documenting the performance improvement in the PR description once benchmarks complete.


🎯 Action Items

  1. MUST FIX: Correct line 14 in crates/swc_ecma_compat_common/src/regexp.rs
  2. RECOMMENDED: Add test coverage for unicode_sets_regex flag specifically
  3. OPTIONAL: Consider adding a comment explaining why exit_expr is used instead of a regular visit method

Compliance with CLAUDE.md

  • ✅ Guideline 1: Performance-focused (merging passes)
  • ✅ Guideline 2: English comments and documentation
  • ✅ Guideline 3: No nightly features used
  • ✅ Guideline 7: Documentation preserved
  • ⚠️ Guideline 5: No new unit tests added
  • ✅ Guideline 8: cargo fmt appears to have been run (formatting is consistent)

Overall Assessment: Good refactoring with solid architectural improvements, but must fix the copy-paste bug before merging. 👍

Copilot finished reviewing on behalf of kdy1 November 19, 2025 16:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the regexp compatibility transformation by merging it into the Transformer architecture. The transformation logic is moved from swc_ecma_compat_common to swc_ecma_transformer and converted from a VisitMut pattern to a VisitMutHook pattern for better integration with the transformer pipeline.

  • Moved regexp pass implementation from swc_ecma_compat_common to swc_ecma_transformer
  • Converted the implementation from VisitMut to VisitMutHook for hook-based traversal
  • Added into_pass() method to Options for creating transformer passes

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/swc_ecma_transformer/src/regexp.rs Implements the regexp transformation logic using VisitMutHook pattern with exit_expr hook
crates/swc_ecma_transformer/src/lib.rs Adds into_pass() method to Options for creating transformer passes
crates/swc_ecma_compat_common/src/regexp.rs Updated to delegate to swc_ecma_transformer by constructing Options and calling into_pass()
crates/swc_ecma_transformer/Cargo.toml Added swc_common dependency and reordered dependencies alphabetically
crates/swc_ecma_compat_common/Cargo.toml Added swc_ecma_transformer dependency and reordered dependencies alphabetically
Cargo.lock Updated lock file to reflect new dependency relationships

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +58 to +60
|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
|| regex.exp.contains("(?<!"))
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logical operator precedence issue causes regex.exp.contains("(?<!") to always trigger transformation regardless of the lookbehind_assertion option. The condition should be:

|| (self.options.lookbehind_assertion && (regex.exp.contains("(?<=") || regex.exp.contains("(?<!")))

This ensures both lookbehind patterns are only transformed when the option is enabled.

Suggested change
|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
|| regex.exp.contains("(?<!"))
|| (self.options.lookbehind_assertion && (regex.exp.contains("(?<=") || regex.exp.contains("(?<!")))

Copilot uses AI. Check for mistakes.
sticky_regex: config.sticky_regex,
unicode_property_regex: config.unicode_property_regex,
unicode_regex: config.unicode_regex,
unicode_sets_regex: config.unicode_regex,
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unicode_sets_regex field is being assigned the value of config.unicode_regex instead of config.unicode_sets_regex. This is a copy-paste error that will prevent the unicode sets regex transformation from working correctly when config.unicode_sets_regex is true.

Suggested change
unicode_sets_regex: config.unicode_regex,
unicode_sets_regex: config.unicode_sets_regex,

Copilot uses AI. Check for mistakes.
@codspeed-hq
Copy link

codspeed-hq bot commented Nov 19, 2025

CodSpeed Performance Report

Merging #11307 will not alter performance

Comparing kdy1/merge-regex (68be702) with main (45827fa)

Summary

✅ 138 untouched

@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review

Overview

This PR consolidates the regexp transformation pass into the new crate, moving away from the standalone visitor pattern to the hook-based architecture. This is a performance-oriented refactoring that reduces the number of AST traversals.

✅ Strengths

  1. Performance Improvement: Merging the regexp pass into the unified transformer reduces AST traversals, which aligns with the project's performance-first principle (CLAUDE.md ecmascript parser #1).

  2. Code Consolidation: The implementation correctly migrates logic from to using the hook pattern.

  3. Backward Compatibility: The API in is preserved by wrapping the new transformer, maintaining compatibility for existing users.

  4. Proper Hook Integration: The regexp pass is correctly integrated into the hook chain via chain_optional in lib.rs:45, ensuring it only runs when needed.

🔍 Observations

  1. Logic Duplication: The condition check logic appears duplicated:

    • In regexp.rs:29-38: Checks if any option is enabled to return Some(RegexpPass)
    • In regexp.rs:52-61: Checks individual conditions to determine if transformation is needed

    This is fine for correctness, but the outer check in the hook() function prevents unnecessary hook instantiation when no regexp features need transformation.

  2. Operator Precedence: In regexp.rs:58-59, the condition has potential precedence issues:

    || (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
        || regex.exp.contains("(?<\!"))

    The lookbehind_assertion flag only guards the first check. The second || regex.exp.contains("(?<\!") will always be evaluated regardless of the flag. This appears to be a bug - both checks should be guarded by the flag:

    || (self.options.lookbehind_assertion && 
        (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))
  3. Copy Assignment in preset_env: In preset_env/src/lib.rs:118, the options are set via direct assignment:

    options.env.regexp = RegExpOptions { ... };

    This is clean and straightforward.

  4. Dependency Changes: The PR correctly updates Cargo.toml files to reflect the new dependency structure.

⚠️ Potential Issues

Bug: Lookbehind Assertion Logic (regexp.rs:58-59)

The current code:

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<\!")))

Due to operator precedence, this is equivalent to:

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<="))
|| regex.exp.contains("(?<\!")

This means negative lookbehind (?<\! will always trigger transformation regardless of the lookbehind_assertion flag. The fix should group both checks:

|| (self.options.lookbehind_assertion && 
    (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))

📝 Minor Notes

  1. Comments: The code would benefit from a module-level doc comment in regexp.rs explaining the transformation purpose and which regexp features are transformed (CLAUDE.md Clean up #2, Travis and codecov #7).

  2. Test Coverage: I couldn't locate specific unit tests for this regexp transformation in the transformer crate. While integration tests may exist elsewhere, consider adding unit tests per CLAUDE.md Handle early errors in ecmascript parser #5.

  3. Formatting: Remember to run cargo fmt --all before committing (CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8).

🎯 Recommendation

Please fix the operator precedence bug in the lookbehind assertion check before merging. The refactoring approach is sound and improves performance, but the logic bug could cause incorrect transformations.


Summary: Good performance-oriented refactoring with one logic bug that needs fixing. The architectural change to consolidate passes is the right direction.

@kdy1 kdy1 enabled auto-merge (squash) November 19, 2025 16:17
@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review: PR #11307 - perf(es/compat): Merge regexp pass into Transformer

Summary

This PR refactors the regexp transformation pass by merging it into the new swc_ecma_transformer infrastructure, moving from a standalone VisitMut implementation to a hook-based approach. This is a performance-focused refactoring that consolidates passes.

Critical Bug Found 🐛

Location: crates/swc_ecma_compat_common/src/regexp.rs:14

unicode_sets_regex: config.unicode_regex,  // ❌ BUG: Should be config.unicode_sets_regex

Issue: The unicode_sets_regex field is incorrectly assigned config.unicode_regex instead of config.unicode_sets_regex. This will cause the unicode sets regex feature to be enabled/disabled incorrectly based on the wrong configuration value.

Fix Required:

unicode_sets_regex: config.unicode_sets_regex,

Code Quality

Positive Aspects ✅

  1. Good architectural direction: Consolidating passes into the transformer infrastructure should improve performance by reducing AST traversals
  2. Clean dependency management: Properly updated Cargo.toml files with correct version dependencies
  3. Code consolidation: Removes 91 lines and adds 88 lines, resulting in a net reduction while maintaining functionality
  4. Logic preservation: The core regex transformation logic remains identical (just moved)
  5. Consistent structure: Follows the established pattern in the transformer crate

Areas of Concern ⚠️

  1. Copy-paste error: The critical bug mentioned above appears to be a copy-paste error when transferring configuration
  2. Missing test validation: No evidence of tests being run to catch the configuration bug
  3. Commit messages: Commit messages like "Dep", "regexp", "Drop", "lockfile" are not descriptive. Per CLAUDE.md guidelines, documentation should explain the "why" not just the "what"

Performance Considerations

Positive: This refactoring should improve performance by:

  • Reducing the number of separate AST traversals
  • Enabling better optimization opportunities through the unified transformer infrastructure
  • Following the project's performance-first directive (CLAUDE.md point 1)

No obvious performance regressions were introduced by the code changes themselves (aside from the bug).

Security Concerns

No security issues identified. The transformation logic remains the same and doesn't introduce new attack vectors.

Test Coverage ⚠️

Concern: The PR doesn't show evidence of:

  1. Running existing tests to validate the refactoring
  2. The configuration bug suggests tests may not have been run, or test coverage is insufficient
  3. Per CLAUDE.md point 5: "Write unit tests for your code" - though this is a refactoring, validation is crucial

Recommendation:

  • Run the full test suite, especially regexp-related tests
  • Verify tests cover all regex feature flags including unicode_sets_regex
  • Consider adding a test that explicitly validates configuration mapping

Additional Observations

  1. Documentation: Per CLAUDE.md point 7, the code could benefit from doc comments explaining the refactoring and the relationship between the old and new implementation
  2. Formatting: Remember to run cargo fmt --all before committing (CLAUDE.md point 8)

Recommended Actions

  1. Critical: Fix the unicode_sets_regex bug in crates/swc_ecma_compat_common/src/regexp.rs:14
  2. Run cargo test to validate all regexp transformations still work correctly
  3. Run cargo fmt --all
  4. Consider improving commit messages for better git history

Conclusion

This is a solid performance-oriented refactoring with good architectural direction, but it must not be merged until the critical unicode_sets_regex configuration bug is fixed. Once fixed and tests pass, this should be a positive change for the codebase.


Review performed following CLAUDE.md guidelines with focus on performance, code quality, and correctness.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 19, 2025

Binary Sizes

File Size
swc.linux-x64-gnu.node 31M (31934280 bytes)

Commit: f4ebc8a

Copilot AI review requested due to automatic review settings November 19, 2025 16:33
Copilot finished reviewing on behalf of kdy1 November 19, 2025 16:35
@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review: PR #11307 - perf(es/compat): Merge regexp pass into Transformer

Summary

This PR refactors the regex compatibility transform by consolidating the standalone regexp pass into the new unified Transformer architecture. This is a performance optimization that reduces the number of AST traversals by merging multiple passes into a single traversal using the hook-based system.


✅ Strengths

  1. Performance Improvement: The main goal of reducing AST traversals is well-executed. By integrating the regexp pass into the Transformer hook system, you eliminate a separate pass.

  2. Code Consolidation: The refactor successfully moves the regexp logic from swc_ecma_compat_common to swc_ecma_transformer, centralizing the transformation logic.

  3. Clean Architecture: The use of the OptionalHook pattern and HookBuilder provides a clean way to compose transformations, and the regexp hook correctly returns None when no transformations are needed.

  4. Backward Compatibility: The public API in swc_ecma_compat_common/src/regexp.rs is preserved, ensuring existing users aren't broken.


🔍 Issues & Concerns

1. Potential Bug in unicode_sets_regex mapping (Line 14 in regexp.rs)

unicode_sets_regex: config.unicode_regex,  // ⚠️ Should be config.unicode_sets_regex

In crates/swc_ecma_compat_common/src/regexp.rs:14, you're assigning config.unicode_regex to unicode_sets_regex instead of config.unicode_sets_regex. This looks like a copy-paste error and would cause the wrong flag to be checked.

Fix:

unicode_sets_regex: config.unicode_sets_regex,

2. Dependency Cycle Risk

The changes introduce a new dependency: swc_ecma_compat_common now depends on swc_ecma_transformer. While this works for the refactoring, it's worth noting that this creates a layering where:

  • swc_ecma_transformer depends on swc_ecma_transforms_base
  • swc_ecma_compat_common now depends on swc_ecma_transformer

This is fine for now, but be mindful of potential circular dependencies if swc_ecma_transformer ever needs to depend on compat modules.

3. Missing Documentation on Performance Impact

While the PR title mentions "perf", there's no documentation or benchmark data showing the actual performance improvement. Consider adding:

  • Benchmark results comparing before/after
  • Information about how many passes are eliminated
  • Impact on compilation time

4. Hook Implementation Uses exit_expr Only

The regexp pass only implements exit_expr, which is correct for this use case, but it might be worth documenting why you chose exit_expr over enter_expr (i.e., because you need to transform leaf nodes after visiting children).


📝 Code Quality & Best Practices

Adherence to CLAUDE.md Guidelines:

  • Performance: Changes prioritize performance by reducing passes
  • English comments: Documentation is in English
  • No unstable features: Uses stable Rust
  • ⚠️ Unit tests: No new tests added (see below)
  • ⚠️ Formatting: Need to verify cargo fmt --all was run

Test Coverage Concerns:

The PR doesn't add any new tests. While the transformation logic is essentially moved (not rewritten), it would be good to have:

  1. Integration tests verifying the new path works correctly
  2. Tests ensuring the swc_ecma_compat_common::regexp() wrapper still functions
  3. Regression tests for the unicode_sets_regex bug mentioned above

🔐 Security Concerns

No security issues identified. The transformation logic remains the same, just reorganized.


🎯 Recommendations

High Priority:

  1. Fix the unicode_sets_regex bug in crates/swc_ecma_compat_common/src/regexp.rs:14
  2. Add test coverage to verify the refactored code works correctly
  3. Run cargo fmt --all before merging

Medium Priority:
4. Add benchmarks or performance metrics to the PR description
5. Consider adding a comment explaining why exit_expr is used instead of enter_expr

Low Priority:
6. Document the architectural decision to have swc_ecma_compat_common depend on swc_ecma_transformer


✨ Overall Assessment

This is a solid refactoring that aligns with the project's performance goals. The hook-based architecture is clean and extensible. However, the unicode_sets_regex bug needs to be fixed before merging, and test coverage should be added to prevent regressions.

The changes demonstrate good understanding of the codebase architecture and follow Rust best practices. Once the bug is fixed and tests are added, this will be a valuable performance improvement.

Status: ⚠️ Needs changes (primarily the unicode_sets_regex bug fix)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@kdy1 kdy1 disabled auto-merge November 19, 2025 16:36
@kdy1 kdy1 changed the title perf(es/compat): Merge regexp pass into Transfomer perf(es/compat): Merge regexp pass into Transformer Nov 19, 2025
@kdy1 kdy1 enabled auto-merge (squash) November 19, 2025 16:36
@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review: Performance Optimization - Merge regexp pass into Transformer

Summary

This PR refactors the regexp transformation pass by integrating it into the new unified swc_ecma_transformer infrastructure. This consolidation should improve performance by reducing AST traversals.


✅ Positive Observations

  1. Performance-focused: The refactoring aligns well with SWC's performance-first philosophy (CLAUDE.md ecmascript parser #1) by merging separate passes into a unified transformer.

  2. Good abstraction: The RegexpPass hook implementation using VisitMutHook follows a clean, reusable pattern.

  3. Maintained functionality: The regexp transformation logic remains functionally equivalent to the original implementation.

  4. Proper dependency management: Dependencies were correctly updated in Cargo.toml files.


📝 Code Quality Observations

1. Test Output Change (crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6)

-var c = RegExp(".", "imsuy");
+var c = new RegExp(".", "imsuy");

Question: Is this intentional? The transformation now uses new RegExp() instead of RegExp(). While functionally equivalent in most cases, this changes the output. Consider:

  • Was this change expected as part of the refactoring?
  • Should the implementation maintain the exact previous output for consistency?

In regexp.rs:73-78, the code generates new RegExp(...) via CallExpr. If the previous behavior (without new) was intentional, you might need to use NewExpr conditionally.

2. Potential Performance Issue in Condition Ordering (regexp.rs:58-59)

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<\!"))

The condition on line 59 (regex.exp.contains("(?<\!")) is always evaluated regardless of the lookbehind_assertion flag due to operator precedence. This should be:

|| (self.options.lookbehind_assertion 
    && (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))

Impact: Minor performance issue - unnecessary string search when feature is disabled. This also existed in the original code, but worth fixing.

3. OptionalHook Implementation (hook_utils.rs:10-44)

The macro-based implementation is clean, but consider these points:

  • Good: DRY approach using macros
  • Documentation: The macro could benefit from a comment explaining its purpose
  • Completeness: The TODO comment at line 30 mentions implementing more hooks. Is this tracked somewhere?

4. Missing new Keyword Explanation (regexp.rs:73-79)

The CallExpr creates a call expression but the test shows it generates new RegExp(). This seems inconsistent with the code. Looking at the diff more carefully, only line 6 of the test changed. Lines 4-5 show RegExp(...) without new.

Actually, looking at the actual code and test output:

  • Line 4: RegExp("...", "u") - no new
  • Line 5: RegExp(".", "s") - no new
  • Line 6: new RegExp(".", "imsuy") - has new

This is confusing. The code at regexp.rs:73-79 creates a CallExpr, not a NewExpr. Yet the test output shows new. Need clarification on whether:

  • The test reference file needs correction
  • The code should generate NewExpr instead
  • Something else is adding the new keyword

5. Code Duplication

The RegExpOptions struct definition appears in both:

  • swc_ecma_transformer/src/regexp.rs:10-27
  • swc_ecma_compat_common/src/regexp.rs:20-38

While this might be intentional for API boundary reasons, consider whether the compat layer could re-export from transformer to maintain a single source of truth.


🔒 Security Considerations

No security concerns identified. The transformation logic remains the same, and regex handling doesn't introduce new attack vectors.


🧪 Test Coverage

  • ✅ Existing tests were updated appropriately
  • Recommendation: Add a specific test for the swc_ecma_transformer::regexp module to ensure the hook works independently. Currently, coverage relies on preset-env integration tests.

📋 Recommendations

  1. High Priority: Clarify the new RegExp vs RegExp output change and ensure it's intentional.

  2. Medium Priority: Fix the operator precedence issue in the lookbehind assertion condition (regexp.rs:58-59).

  3. Low Priority:

    • Add inline documentation for the optional_method\! macro
    • Consider unit tests directly in swc_ecma_transformer
    • Evaluate if RegExpOptions duplication can be reduced
  4. Before Merge: Run cargo fmt --all as specified in CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8 (though the code appears already formatted).


Overall Assessment

Verdict: ✅ Approve with Minor Fixes

The refactoring achieves its performance goal and maintains code quality. The main concerns are:

  1. The test output change needs verification
  2. The operator precedence bug should be fixed

Great work on the performance optimization! This pattern will make future optimizations easier.

Copilot AI review requested due to automatic review settings November 19, 2025 16:47
Copilot finished reviewing on behalf of kdy1 November 19, 2025 16:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +59 to +60
|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
|| regex.exp.contains("(?<!"))
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operator precedence issue: the condition for lookbehind_assertion is incorrectly grouped. Line 60's || regex.exp.contains(\"(?<!\") is not checked against self.options.lookbehind_assertion, so negative lookbehind patterns (?<!) will always be transformed regardless of the option setting. Add parentheses around the entire lookbehind check: || (self.options.lookbehind_assertion && (regex.exp.contains(\"(?<=\") || regex.exp.contains(\"(?<!\")))

Suggested change
|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
|| regex.exp.contains("(?<!"))
|| (self.options.lookbehind_assertion && (regex.exp.contains("(?<=") || regex.exp.contains("(?<!")))

Copilot uses AI. Check for mistakes.
@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review

Thank you for this performance optimization! This PR merges the pass into the to improve performance. Overall, the refactoring looks good, but there are several issues that need to be addressed before merging.


🐛 Critical Issues

1. Compilation Error: Non-exhaustive Struct

Location: crates/swc_ecma_compat_common/src/regexp.rs:6-15

The code fails to compile because RegExpOptions is marked as #[non_exhaustive] in swc_ecma_transformer/src/regexp.rs:10, which prevents creating it using struct expression syntax from outside the crate.

error[E0639]: cannot create non-exhaustive struct using struct expression
  --> crates/swc_ecma_compat_common/src/regexp.rs:6:26

Fix: Either:

  • Remove #[non_exhaustive] from RegExpOptions if backward compatibility isn't a concern, OR
  • Add a constructor function like RegExpOptions::new(...) or builder pattern

2. Operator Precedence Bug in Lookbehind Assertion Check

Location: crates/swc_ecma_transformer/src/regexp.rs:59-60

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<\!"))

Due to operator precedence, this is parsed as:

(self.options.lookbehind_assertion && regex.exp.contains("(?<=")) || regex.exp.contains("(?<\!")

This means negative lookbehind (?<\!) will always trigger transformation, even when lookbehind_assertion is false.

Fix: Add parentheses:

|| (self.options.lookbehind_assertion && (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))

📝 Code Quality Issues

3. Missing Unit Tests

Observation: The PR doesn't add unit tests for the new regexp.rs module in swc_ecma_transformer.

Per CLAUDE.md instruction #5: "Write unit tests for your code."

Recommendation: Add unit tests to verify:

  • Each regexp flag is correctly transformed
  • The lookbehind assertion logic works correctly
  • Edge cases (empty flags, multiple flags combined)
  • The optimization hook is only created when at least one option is enabled

4. Test Output Change

Location: crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6

The test expectation changed from:

var c = RegExp(".", "imsuy");

to:

var c = new RegExp(".", "imsuy");

Question: Is this intentional? Both are semantically equivalent, but the change should be explained. If this is a side effect of moving code, consider whether consistency with previous output is important.


✅ Positive Aspects

  1. Performance Optimization: Merging the pass into the transformer is a good optimization strategy
  2. Code Consolidation: Removes duplicate code between swc_ecma_compat_common and the new transformer
  3. Cleaner Architecture: The hook-based approach is more modular
  4. Documentation: Good inline comments explaining what each flag does

🔍 Performance Considerations

The hook system adds a slight indirection through OptionalHook, but this should be negligible compared to the benefits of merging passes. The conditional check in hook() at regexp.rs:31-43 ensures no performance penalty when regexp transformations aren't needed.


🛡️ Security Concerns

No security issues identified. The string matching for regexp patterns (contains) is safe.


📋 Recommendations

  1. Fix the compilation error (blocking)
  2. Fix the operator precedence bug (blocking)
  3. Add unit tests per project guidelines
  4. Run cargo fmt --all before committing (per CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8)
  5. Verify the test output change is intentional
  6. Consider adding documentation for the new Options::into_pass() method

Summary

This is a good performance optimization, but it cannot be merged due to the compilation error. After fixing the critical issues and adding tests, this should be ready to go.

@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review

Thank you for this performance optimization! This PR merges the regexp pass into the Transformer to improve performance. Overall, the refactoring looks good, but there are several issues that need to be addressed before merging.


🐛 Critical Issues

1. Compilation Error: Non-exhaustive Struct

Location: crates/swc_ecma_compat_common/src/regexp.rs:6-15

The code fails to compile because RegExpOptions is marked as #[non_exhaustive] in swc_ecma_transformer/src/regexp.rs:10, which prevents creating it using struct expression syntax from outside the crate.

error[E0639]: cannot create non-exhaustive struct using struct expression
  --> crates/swc_ecma_compat_common/src/regexp.rs:6:26

Fix: Either:

  • Remove #[non_exhaustive] from RegExpOptions if backward compatibility isn't a concern, OR
  • Add a constructor function like RegExpOptions::new(...) or builder pattern

2. Operator Precedence Bug in Lookbehind Assertion Check

Location: crates/swc_ecma_transformer/src/regexp.rs:59-60

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<!"))

Due to operator precedence, this is parsed as:

(self.options.lookbehind_assertion && regex.exp.contains("(?<=")) || regex.exp.contains("(?<!")

This means negative lookbehind (?<!) will always trigger transformation, even when lookbehind_assertion is false.

Fix: Add parentheses:

|| (self.options.lookbehind_assertion && (regex.exp.contains("(?<=") || regex.exp.contains("(?<!")))

📝 Code Quality Issues

3. Missing Unit Tests

Observation: The PR doesn't add unit tests for the new regexp.rs module in swc_ecma_transformer.

Per CLAUDE.md instruction #5: "Write unit tests for your code."

Recommendation: Add unit tests to verify:

  • Each regexp flag is correctly transformed
  • The lookbehind assertion logic works correctly
  • Edge cases (empty flags, multiple flags combined)
  • The optimization hook is only created when at least one option is enabled

4. Test Output Change

Location: crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6

The test expectation changed from:

var c = RegExp(".", "imsuy");

to:

var c = new RegExp(".", "imsuy");

Question: Is this intentional? Both are semantically equivalent, but the change should be explained. If this is a side effect of moving code, consider whether consistency with previous output is important.


✅ Positive Aspects

  1. Performance Optimization: Merging the pass into the transformer is a good optimization strategy
  2. Code Consolidation: Removes duplicate code between swc_ecma_compat_common and the new transformer
  3. Cleaner Architecture: The hook-based approach is more modular
  4. Documentation: Good inline comments explaining what each flag does

🔍 Performance Considerations

The hook system adds a slight indirection through OptionalHook, but this should be negligible compared to the benefits of merging passes. The conditional check in hook() at regexp.rs:31-43 ensures no performance penalty when regexp transformations aren't needed.


🛡️ Security Concerns

No security issues identified. The string matching for regexp patterns (contains) is safe.


📋 Recommendations

  1. Fix the compilation error (blocking)
  2. Fix the operator precedence bug (blocking)
  3. Add unit tests per project guidelines
  4. Run cargo fmt --all before committing (per CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8)
  5. Verify the test output change is intentional
  6. Consider adding documentation for the new Options::into_pass() method

Summary

This is a good performance optimization, but it cannot be merged due to the compilation error. After fixing the critical issues and adding tests, this should be ready to go.

@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review

Overview

This PR successfully merges the transformation pass into the new architecture using the hook system. This is a performance-oriented refactoring that consolidates transformation logic.

Positive Aspects

Architecture Improvement: The migration to the hook-based system is well-executed and aligns with the project's direction of consolidating transformation passes.

Code Organization: The regexp logic is cleanly moved to swc_ecma_transformer while maintaining a backwards-compatible wrapper in swc_ecma_compat_common.

Dependency Management: Properly updated Cargo.toml files to reflect the new architecture.

Test Coverage: Existing test updated to reflect the new output format.

Issues & Concerns

1. Logic Error in Lookbehind Assertion Check (crates/swc_ecma_transformer/src/regexp.rs:59-60)

The lookbehind assertion condition has a subtle logic bug:

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<\!"))

Problem: Due to operator precedence, regex.exp.contains("(?<\!") is always evaluated regardless of self.options.lookbehind_assertion. This means negative lookbehinds will be transformed even when the option is disabled.

Expected behavior:

|| (self.options.lookbehind_assertion 
    && (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))

2. Inconsistent Test Output (crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6)

The test shows an unexpected change:

  • Before: var c = RegExp(".", "imsuy");
  • After: var c = new RegExp(".", "imsuy");

Questions:

  • Is the addition of new keyword intentional?
  • Does this behavioral change affect runtime semantics? (It shouldn't in JavaScript, but worth documenting)
  • Are there other tests that validate this change is consistent?

The old code generated RegExp(...) constructor calls, while the new code generates new RegExp(...). While semantically equivalent for the RegExp constructor, this change should be explicitly documented or explained in the commit message.

3. Performance Consideration - String Contains Checks

The current implementation uses multiple .contains() checks on regex.exp:

regex.exp.contains("(?<")
regex.exp.contains("(?<=")
regex.exp.contains("(?<\!")
regex.exp.contains("\\p{")
regex.exp.contains("\\P{")

Per CLAUDE.md: "Write performant code. Always prefer performance over other things."

Suggestion: For hot paths, consider:

  1. Single-pass scanning instead of multiple contains() calls
  2. Compile-time regex patterns if the overhead justifies it
  3. Profile whether this is actually a hot path that needs optimization

However, this may be premature optimization - recommend measuring first.

4. Documentation Gap

Missing documentation for RegexpPass struct and its implementation. Per CLAUDE.md #7: "Write documentation for your code."

Suggestion: Add doc comments explaining:

/// Transform pass that converts modern RegExp syntax to RegExp constructor calls
/// for compatibility with older JavaScript environments.
///
/// This pass handles various RegExp features like dotAll, unicode, sticky flags,
/// and advanced features like lookbehind assertions and unicode property escapes.
struct RegexpPass {
    options: RegExpOptions,
}

Minor Issues

  1. Formatting: Ensure cargo fmt --all is run before committing (per CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8). The CI checks are pending, so this may already be done.

  2. Empty PR Description: The PR has no description explaining the motivation, changes, or performance impact. Consider adding context about:

    • Why merge into Transformer?
    • Expected performance improvement
    • Any breaking changes or behavioral differences

Security

✅ No security concerns identified. The transformation logic is straightforward AST manipulation.

Test Coverage

⚠️ Moderate: Only one test file was updated. Consider:

  • Test specifically for the lookbehind assertion logic bug mentioned above
  • Tests for edge cases (empty flags, multiple flags, escaped patterns)
  • Performance benchmarks to validate the "perf" claim in the PR title

Recommendations

Must Fix:

  1. Fix the lookbehind assertion logic error

Should Address:
2. Document or explain the new keyword change in test output
3. Add documentation for RegexpPass
4. Add PR description with context

Consider:
5. Performance profiling to validate the "perf" improvement claim
6. Additional test coverage for edge cases
7. Run cargo fmt --all if not already done

Overall Assessment

This is a well-structured refactoring that moves in the right direction architecturally. The main concern is the logic bug in the lookbehind assertion check that needs to be fixed. Once addressed, this should be good to merge.

The code quality is generally high and follows the project's conventions, with room for improvement in documentation and test coverage.

@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review

Overview

This PR successfully merges the regexp transformation pass into the new Transformer architecture using the hook system. This is a performance-oriented refactoring that consolidates transformation logic.

Positive Aspects

Architecture Improvement: The migration to the hook-based system is well-executed and aligns with the project's direction of consolidating transformation passes.

Code Organization: The regexp logic is cleanly moved to swc_ecma_transformer while maintaining a backwards-compatible wrapper in swc_ecma_compat_common.

Dependency Management: Properly updated Cargo.toml files to reflect the new architecture.

Test Coverage: Existing test updated to reflect the new output format.

Issues & Concerns

1. Logic Error in Lookbehind Assertion Check

Location: crates/swc_ecma_transformer/src/regexp.rs:59-60

The lookbehind assertion condition has a subtle logic bug:

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<!"))

Problem: Due to operator precedence, regex.exp.contains("(?<!") is always evaluated regardless of self.options.lookbehind_assertion. This means negative lookbehinds will be transformed even when the option is disabled.

Expected behavior:

|| (self.options.lookbehind_assertion 
    && (regex.exp.contains("(?<=") || regex.exp.contains("(?<!")))

2. Inconsistent Test Output

Location: crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6

The test shows an unexpected change:

  • Before: var c = RegExp(".", "imsuy");
  • After: var c = new RegExp(".", "imsuy");

Questions:

  • Is the addition of new keyword intentional?
  • Does this behavioral change affect runtime semantics? (It shouldn't in JavaScript, but worth documenting)
  • Are there other tests that validate this change is consistent?

The old code generated RegExp(...) constructor calls, while the new code generates new RegExp(...). While semantically equivalent for the RegExp constructor, this change should be explicitly documented or explained in the commit message.

3. Performance Consideration - String Contains Checks

The current implementation uses multiple .contains() checks on regex.exp:

regex.exp.contains("(?<")
regex.exp.contains("(?<=")
regex.exp.contains("(?<!")
regex.exp.contains("\\p{")
regex.exp.contains("\\P{")

Per CLAUDE.md: "Write performant code. Always prefer performance over other things."

Suggestion: For hot paths, consider:

  1. Single-pass scanning instead of multiple contains() calls
  2. Compile-time regex patterns if the overhead justifies it
  3. Profile whether this is actually a hot path that needs optimization

However, this may be premature optimization - recommend measuring first.

4. Documentation Gap

Missing documentation for RegexpPass struct and its implementation. Per CLAUDE.md #7: "Write documentation for your code."

Suggestion: Add doc comments explaining:

/// Transform pass that converts modern RegExp syntax to RegExp constructor calls
/// for compatibility with older JavaScript environments.
///
/// This pass handles various RegExp features like dotAll, unicode, sticky flags,
/// and advanced features like lookbehind assertions and unicode property escapes.
struct RegexpPass {
    options: RegExpOptions,
}

Minor Issues

  1. Formatting: Ensure cargo fmt --all is run before committing (per CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8). The CI checks are pending, so this may already be done.

  2. Empty PR Description: The PR has no description explaining the motivation, changes, or performance impact. Consider adding context about:

    • Why merge into Transformer?
    • Expected performance improvement
    • Any breaking changes or behavioral differences

Security

✅ No security concerns identified. The transformation logic is straightforward AST manipulation.

Test Coverage

⚠️ Moderate: Only one test file was updated. Consider:

  • Test specifically for the lookbehind assertion logic bug mentioned above
  • Tests for edge cases (empty flags, multiple flags, escaped patterns)
  • Performance benchmarks to validate the "perf" claim in the PR title

Recommendations

Must Fix:

  1. Fix the lookbehind assertion logic error

Should Address:
2. Document or explain the new keyword change in test output
3. Add documentation for RegexpPass
4. Add PR description with context

Consider:
5. Performance profiling to validate the "perf" improvement claim
6. Additional test coverage for edge cases
7. Run cargo fmt --all if not already done

Overall Assessment

This is a well-structured refactoring that moves in the right direction architecturally. The main concern is the logic bug in the lookbehind assertion check that needs to be fixed. Once addressed, this should be good to merge.

The code quality is generally high and follows the project conventions, with room for improvement in documentation and test coverage.

Copilot AI review requested due to automatic review settings November 19, 2025 17:01
@claude
Copy link
Contributor

claude bot commented Nov 19, 2025

Code Review: Merge regexp pass into Transformer

Overview

This PR refactors the regex transformation pass by consolidating it into the unified swc_ecma_transformer architecture, replacing the standalone VisitMut implementation. This is a performance optimization that reduces AST traversals.

✅ Strengths

  1. Performance Improvement: The main goal of merging the regexp pass into the transformer hook system should reduce redundant AST traversals, which aligns with the project's performance-first principle (CLAUDE.md ecmascript parser #1).

  2. Architecture Consistency: The refactoring follows the same pattern as other ES version passes (es2020, es2022, etc.), making the codebase more consistent and maintainable.

  3. Clean Dependency Management: The changes properly update the dependency graph:

    • swc_ecma_compat_common now depends on swc_ecma_transformer instead of swc_ecma_visit
    • Removes 49 lines of duplicate logic from swc_ecma_compat_common/src/regexp.rs
  4. Backward Compatibility: The public API in swc_ecma_compat_common remains unchanged - the regexp(Config) function still returns impl Pass, maintaining compatibility for external consumers.

🔍 Issues & Concerns

1. Logic Duplication in Lookbehind Assertion Check (crates/swc_ecma_transformer/src/regexp.rs:59-60)

The condition on line 59-60 has a problematic structure:

|| (self.options.lookbehind_assertion && regex.exp.contains("(?<=")
    || regex.exp.contains("(?<\!"))

Issue: The second regex.exp.contains("(?<\!") is always executed regardless of the lookbehind_assertion flag, meaning negative lookbehinds will always trigger transformation even when lookbehind_assertion is false.

Fix: Add parentheses to ensure both conditions are checked only when the flag is enabled:

|| (self.options.lookbehind_assertion 
    && (regex.exp.contains("(?<=") || regex.exp.contains("(?<\!")))

This is a bug carry-over from the original implementation, but should be fixed.

2. String Contains Performance (crates/swc_ecma_transformer/src/regexp.rs:58-62)

The code uses multiple contains() calls on the same string, which could be optimized. While the performance impact is likely minimal, consider caching or combining checks for better performance in hot paths.

3. Missing Documentation (crates/swc_ecma_transformer/src/lib.rs:57-63)

The new into_pass() method lacks documentation. Per CLAUDE.md #7, all code should have documentation. Suggest adding:

/// Converts these options into a transformation pass.
///
/// This creates a hook-based transformer that applies all enabled
/// transformations in a single AST traversal.
pub fn into_pass(self) -> impl Pass {
    let hook = transform_hook(self);
    hook_pass(hook)
}

4. Test Coverage Change (crates/swc_ecma_preset_env/tests/fixtures/corejs3/usage-regexp/output.mjs:6)

The test output changed from:

var c = RegExp(".", "imsuy");

to:

var c = new RegExp(".", "imsuy");

Question: Is this intentional? Both are functionally equivalent, but the change should be explained. If this is intentional to add new for consistency, document it in the PR description.

5. Hook Implementation Completeness (crates/swc_ecma_transformer/src/hook_utils.rs:30)

The TODO comment indicates incomplete hook implementation. While this doesn't affect the current PR, it's worth noting that only a subset of AST nodes have hook methods implemented. This could limit future extensibility.

📋 Minor Issues

  1. Comment Formatting (crates/swc_ecma_transformer/src/regexp.rs:26): Missing /// for doc comment:

    // [RegExp.prototype.unicodeSets]...

    Should be:

    /// [RegExp.prototype.unicodeSets]...
    
  2. Consistent Formatting: The code follows proper formatting, but ensure cargo fmt --all is run before merging (CLAUDE.md Unignore "431ecef8c85d4d24.js" and "8386fbff927a9e0e.js". #8).

🔒 Security

No security concerns identified. The transformation logic remains the same, just refactored into a different architecture.

🧪 Testing

  • Existing test is updated and should pass
  • The change is primarily architectural, so existing test coverage should be sufficient
  • Consider adding a test case that specifically checks the lookbehind assertion bug mentioned above

📊 Performance Impact

Positive: This refactoring should improve performance by reducing the number of AST traversals. The hook-based system allows multiple transformations to run in a single pass rather than separate visitor passes.

Recommendations

  1. Critical: Fix the lookbehind assertion logic bug (line 59-60 in regexp.rs)
  2. High: Add documentation for the into_pass() method
  3. Medium: Clarify why the test output changed from RegExp(...) to new RegExp(...)
  4. Low: Fix the doc comment formatting for unicode_sets_regex

Overall, this is a solid refactoring that improves the architecture and performance. The main concern is the logic bug in the lookbehind assertion check that should be addressed before merging.

Copilot finished reviewing on behalf of kdy1 November 19, 2025 17:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

var a = RegExp("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})", "u");
var b = RegExp(".", "s");
var c = RegExp(".", "imsuy");
var c = new RegExp(".", "imsuy");
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output shows inconsistent behavior where variable c is transformed to new RegExp(...) while variables a and b on lines 4-5 are transformed to RegExp(...) without the new keyword. Since the transformation code creates a CallExpr without new, this suggests an unintended change in behavior. Please verify that this test output is correct or investigate why the new keyword is being added for this specific case.

Suggested change
var c = new RegExp(".", "imsuy");
var c = RegExp(".", "imsuy");

Copilot uses AI. Check for mistakes.
@kdy1 kdy1 merged commit 440b391 into main Nov 19, 2025
191 of 192 checks passed
@kdy1 kdy1 deleted the kdy1/merge-regex branch November 19, 2025 17:30
@kdy1 kdy1 modified the milestones: Planned, 1.15.3 Nov 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants