Skip to content

[FIX] Infinite Loop in Variablization of Subtrees#77

Merged
colinthebomb1 merged 2 commits intomainfrom
fix/infinite-loop-rule-generate
Sep 9, 2025
Merged

[FIX] Infinite Loop in Variablization of Subtrees#77
colinthebomb1 merged 2 commits intomainfrom
fix/infinite-loop-rule-generate

Conversation

@colinthebomb1
Copy link
Collaborator

Overview:

This PR fixes an infinite loop in variablize_subtrees(), where an already variablized subtree like {"value": "V22"} would be variablized again in an infinite loop, ex. {"value": "V22"} -> {"value": "V27"} -> {"value": "V32"} etc.

Testing:

  • Pass all existing tests under test
  • Added test_generate_general_rule_22 and test_rewrite_aggregation_to_subquery

@colinthebomb1 colinthebomb1 self-assigned this Sep 2, 2025
@baiqiushi baiqiushi requested a review from Copilot September 2, 2025 21:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an infinite loop bug in the variablize_subtrees() function where already variablized subtrees (containing variables like "V22") would be repeatedly variablized, causing an endless loop of variable generation.

  • Adds a condition to skip variablization of subtrees that are already variables
  • Adds two new test cases to verify the fix and test aggregation-to-subquery rewriting
  • Includes a new rule for aggregation to filtered subquery transformation

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
core/rule_generator.py Adds check to prevent re-variablization of already variablized subtrees
tests/test_rule_generator.py Adds test for general rule generation and updates existing test assertions
tests/test_query_rewriter.py Adds test for aggregation to subquery rewriting functionality
data/rules.py Adds new rule for aggregation to filtered subquery transformation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 1936 to 1948
assert StringUtil.strim(RuleGenerator._fingerPrint(rule['pattern'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<x1> AS DATE)
''')) or StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<<y>> AS DATE)
'''))

assert StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
<x1>
''')) or StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
<<y>>
'''))


Copy link

Copilot AI Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test logic is duplicated and confusing with overlapping assertions. The first assertion checks both pattern and rewrite fingerprints, while the second assertion only checks rewrite fingerprints. Consider restructuring these assertions to be more clear and avoid the duplicated rewrite fingerprint check.

Suggested change
assert StringUtil.strim(RuleGenerator._fingerPrint(rule['pattern'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<x1> AS DATE)
''')) or StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<<y>> AS DATE)
'''))
assert StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
<x1>
''')) or StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite'])) == StringUtil.strim(RuleGenerator._fingerPrint('''
<<y>>
'''))
pattern_fp = StringUtil.strim(RuleGenerator._fingerPrint(rule['pattern']))
rewrite_fp = StringUtil.strim(RuleGenerator._fingerPrint(rule['rewrite']))
expected_pattern_fps = (
StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<x1> AS DATE)
''')),
StringUtil.strim(RuleGenerator._fingerPrint('''
CAST(<<y>> AS DATE)
'''))
)
expected_rewrite_fps = (
StringUtil.strim(RuleGenerator._fingerPrint('''
<x1>
''')),
StringUtil.strim(RuleGenerator._fingerPrint('''
<<y>>
'''))
)
assert pattern_fp in expected_pattern_fps
assert rewrite_fp in expected_rewrite_fps

Copilot uses AI. Check for mistakes.
for subtree in subtrees:
# Prevents already variablized subtrees from being variablized again in an infinite loop
# ex. {"value": "V22"} should not be variablized again
if len(subtree) == 1 and 'value' in subtree and QueryRewriter.is_var(subtree['value']):
Copy link

Copilot AI Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition len(subtree) == 1 assumes a specific structure for variablized subtrees. This magic number should be documented or replaced with a more explicit check that describes what constitutes a variablized subtree structure.

Suggested change
if len(subtree) == 1 and 'value' in subtree and QueryRewriter.is_var(subtree['value']):
# Explicitly check for a variablized subtree structure: a dict with only the 'value' key whose value is a variable
if isinstance(subtree, dict) and set(subtree.keys()) == {'value'} and QueryRewriter.is_var(subtree['value']):

Copilot uses AI. Check for mistakes.
@baiqiushi
Copy link
Contributor

@colinthebomb1, can we merge this PR?

@colinthebomb1 colinthebomb1 merged commit 2db0c8a into main Sep 9, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants