bug: escape_sequence not detected as a change when toggling prefix "r" for the string #272

Open
@8day

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-python

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.22.3

Describe the bug

When trees are compared with old_tree.changed_ranges(new_tree), the Python parser does not report the removal or insertion of the escape_sequence node when a string switches between a plain string and an r-prefixed string.

Toggling the r prefix of a string changes the string_start node. The text of string_content, the parent of escape_sequence, stays the same, but its structure changes depending on whether escape_sequence is recognized or ignored.

Note that the equivalent changes for f-prefixed strings seem to be detected as expected.
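
For context, the structural difference described above mirrors Python's own string semantics: in a plain literal `\x07` is an escape sequence, while in an r-prefixed literal it is four ordinary characters. A minimal check using only the standard library (the variable names are illustrative and not part of the report):

```python
import ast

# Source text of the two literals from the report, kept raw so the backslash survives.
plain_src = r'''"for whom the \x07 {'tolls'}"'''
raw_src = "r" + plain_src

plain_value = ast.literal_eval(plain_src)  # \x07 is decoded to a single BEL character
raw_value = ast.literal_eval(raw_src)      # backslash, x, 0, 7 stay as four characters

print(len(plain_value), len(raw_value))             # 24 27
print("\x07" in plain_value, "\\x07" in raw_value)  # True True
```

The three-character difference in length is exactly the escape sequence that the parser recognizes in one tree and ignores in the other.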

P.S. Sorry that the example is written in Python; I don't know how to reproduce the bug with C or CLI scripts. Comment or uncomment the corresponding blocks in the script to switch between the r-string and f-string tests.

Steps To Reproduce/Bad Parse Tree

  1. Create a text file with a string containing an escape sequence: "for whom the \x07 {'tolls'}".
  2. Parse it to get tree A: (module (expression_statement (string (string_start) (string_content (escape_sequence)) (string_end)))).
  3. Edit the string by adding the prefix r: r"for whom the \x07 {'tolls'}".
  4. Parse it to get tree B: (module (expression_statement (string (string_start) (string_content) (string_end)))).
  5. Call A.changed_ranges(B) and receive this output: [<Range ... start_byte=0, end_byte=1>].
  6. Edit the string by removing the prefix r: "for whom the \x07 {'tolls'}".
  7. Parse it to get tree C: (module (expression_statement (string (string_start) (string_content (escape_sequence)) (string_end)))).
  8. Call B.changed_ranges(C) and receive this output: [].
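
To make the structural difference between trees A and B explicit, the two S-expressions from steps 2 and 4 can be compared by node type. This is only an illustrative sketch on the printed strings, using the standard library; `node_types` is a hypothetical helper, not a tree-sitter API:

```python
import re
from collections import Counter

# S-expressions copied from steps 2 and 4 above.
tree_a = "(module (expression_statement (string (string_start) (string_content (escape_sequence)) (string_end))))"
tree_b = "(module (expression_statement (string (string_start) (string_content) (string_end))))"

def node_types(sexp):
    """Count how often each node type name appears in an S-expression string."""
    return Counter(re.findall(r"\((\w+)", sexp))

removed = node_types(tree_a) - node_types(tree_b)  # node types present only in A
added = node_types(tree_b) - node_types(tree_a)    # node types present only in B
print(dict(removed), dict(added))  # {'escape_sequence': 1} {}
```

The only node that differs is escape_sequence, which is the change that the reported ranges do not cover.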

Expected Behavior/Parse Tree

A.changed_ranges(B) should have returned: [<Range ... start_byte=0, end_byte=1>, <Range ... start_byte=15, end_byte=19>].
B.changed_ranges(C) should have returned (the indexes are approximate; the range should span the escape sequence): [<Range ... start_byte=14, end_byte=18>].
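
The byte offsets quoted above can be checked without the parser. The sketch below locates the escape sequence in both versions of the source, assuming the expected ranges are expressed in the byte offsets of the newer tree; `byte_span` is a hypothetical helper introduced for illustration:

```python
# Both versions of the source as bytes; rb'' keeps the backslash literal.
plain = rb'''"for whom the \x07 {'tolls'}"'''
raw = b"r" + plain
escape = rb"\x07"

def byte_span(src, needle):
    """Return the (start, end) byte offsets of the first occurrence of needle."""
    start = src.find(needle)
    return start, start + len(needle)

print(byte_span(plain, escape))  # (14, 18)
print(byte_span(raw, escape))    # (15, 19)
```

The escape sequence spans bytes 15 to 19 in the r-prefixed source (tree B) and bytes 14 to 18 in the plain source (tree C), which matches the expected ranges.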

Repro

from tree_sitter import Language, Parser
import tree_sitter_python

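# Build a read callback for parser.parse(). It returns one byte per call
# (an empty bytes object at end of input) and echoes every byte the parser reads.
def make_byte_feeder(src):
    def feeder(pos, point):
        b = src[pos:pos+1]
        print(b.decode('utf-8'), end='')
        return b
    return feeder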

# Empty `text` implies removal of selection.
# Non-empty `text` with `selection_start == selection_end` implies insertion.
# Non-empty `text` with `selection_start != selection_end` implies replacement.
def edit_tree(tree, src, selection_start, selection_end, text):
    new_src = src[:selection_start] + text + src[selection_end:]

    print('<'*10)
    # NOTE: only the byte offsets are accurate here; every point is left as (0, 0).
    tree.edit(
        start_byte=selection_start,
        old_end_byte=selection_end,
        new_end_byte=selection_start + len(text),
        start_point=(0, 0),
        old_end_point=(0, 0),
        new_end_point=(0, 0),
    )
    new_tree = parser.parse(make_byte_feeder(new_src), tree)
    print()
    print('>'*10)

    print('org:', src)
    print('alt:', new_src, end='\n\n')
    print('org root node:', tree.root_node)
    print('alt root node:', new_tree.root_node, end='\n\n')

    print('changes:', tree.changed_ranges(new_tree))

    return new_tree, new_src

src = r'''"for whom the \x07 {'tolls'}"'''.encode('utf-8')

parser = Parser(Language(tree_sitter_python.language()))
print('<'*10)
tree = parser.parse(make_byte_feeder(src))
print()
print('>'*10)

# TEST R-STRING.

old_tree = tree
tree, src = edit_tree(tree, src, 0, 0, 'r'.encode('utf-8'))
print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(1), old_tree.root_node.child(0).child(0).child(1).has_changes)

old_tree = tree
tree, src = edit_tree(tree, src, 17, 19, '10'.encode('utf-8'))
print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(1), old_tree.root_node.child(0).child(0).child(1).has_changes)

old_tree = tree
tree, src = edit_tree(tree, src, 0, 1, b'')
print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(1), old_tree.root_node.child(0).child(0).child(1).has_changes)

# TEST F-STRING.

# old_tree = tree
# tree, src = edit_tree(tree, src, 0, 0, 'f'.encode('utf-8'))
# print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
# print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
# print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(2), old_tree.root_node.child(0).child(0).child(2).has_changes)

# old_tree = tree
# tree, src = edit_tree(tree, src, 22, 27, 'rings'.encode('utf-8'))
# print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
# print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
# print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(2), old_tree.root_node.child(0).child(0).child(2).has_changes)

# old_tree = tree
# tree, src = edit_tree(tree, src, 0, 1, b'')
# print('string changed:', old_tree.root_node.child(0).child(0).has_changes)
# print('org string start change:', old_tree.root_node.child(0).child(0).child(0), old_tree.root_node.child(0).child(0).child(0).has_changes)
# print('org string chld2 change:', old_tree.root_node.child(0).child(0).child(2), old_tree.root_node.child(0).child(0).child(2).has_changes)
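
The edit_tree helper above passes (0, 0) for every point. Whether that influences the reported behavior is not known; the sketch below only shows one way the points could be derived from byte offsets, so that this variable can be ruled out. `point_at` and `edit_params` are hypothetical helpers, and columns are assumed to be counted in bytes:

```python
def point_at(src: bytes, offset: int) -> tuple[int, int]:
    """Return the (row, column) point of a byte offset, with the column in bytes."""
    row = src.count(b"\n", 0, offset)
    line_start = src.rfind(b"\n", 0, offset) + 1  # 0 when there is no earlier newline
    return row, offset - line_start

def edit_params(src: bytes, start: int, end: int, text: bytes) -> dict:
    """Build keyword arguments for an edit that replaces src[start:end] with text."""
    new_src = src[:start] + text + src[end:]
    new_end = start + len(text)
    return {
        "start_byte": start,
        "old_end_byte": end,
        "new_end_byte": new_end,
        "start_point": point_at(src, start),
        "old_end_point": point_at(src, end),
        "new_end_point": point_at(new_src, new_end),
    }

src = rb'''"for whom the \x07 {'tolls'}"'''
params = edit_params(src, 0, 0, b"r")  # insert the r prefix at byte 0
print(params)
```

The dictionary keys match the keyword arguments that the script passes to tree.edit, so it could be applied as tree.edit(**params).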
