textdiff has inconsistent range output between formats #334

Open
@JonahSussman

Description

Hi all,

I compiled the latest version of Gumtree from source and have been using it. Really great project! Unfortunately, I'm having some issues with the textdiff command: the three output formats are inconsistent with each other, mainly in the ranges they report.

(Not that I think it matters, but I'm currently running Fedora 38.)

Steps to Reproduce

I have the following Java files, A.java, B1.java, and B2.java, which I have attached to this issue:

  • A.java is the original file and contains a simple class with one field called qwerty.
  • B1.java changes the field's name to asdfg and adds System.out.println("hello") to the constructor.
  • B2.java is the same as B1.java, but with 10 extra lines added immediately before the curly brace opening the class.

I then ran the following (I've attached these files to my issue as well):

./gumtree textdiff A.java B1.java -f TEXT > A-B1-text.txt
./gumtree textdiff A.java B1.java -f JSON > A-B1-json.txt
./gumtree textdiff A.java B1.java -f XML  > A-B1-xml.txt
./gumtree textdiff A.java B2.java -f TEXT > A-B2-text.txt
./gumtree textdiff A.java B2.java -f JSON > A-B2-json.txt
./gumtree textdiff A.java B2.java -f XML  > A-B2-xml.txt
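For anyone who wants to regenerate the six outputs in one go, here is a minimal Python sketch of the same runs. It uses only the textdiff subcommand and -f flag shown above; the helper names build_runs and run_all are my own, and it assumes the gumtree launcher and the three Java files sit in the current directory.

```python
import subprocess

FORMATS = ["TEXT", "JSON", "XML"]
TARGETS = ["B1", "B2"]


def build_runs(gumtree="./gumtree"):
    """Return (command, output file) pairs matching the six commands in the issue."""
    runs = []
    for target in TARGETS:
        for fmt in FORMATS:
            cmd = [gumtree, "textdiff", "A.java", f"{target}.java", "-f", fmt]
            out = f"A-{target}-{fmt.lower()}.txt"
            runs.append((cmd, out))
    return runs


def run_all(gumtree="./gumtree"):
    """Run every command and redirect its stdout to the matching file."""
    for cmd, out in build_runs(gumtree):
        with open(out, "w") as fh:
            # check=True raises CalledProcessError if gumtree exits non-zero
            subprocess.run(cmd, stdout=fh, check=True)
```

Calling run_all() from the directory containing gumtree should produce the same six files as the commands above.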

Behavior

All three formats handle update-node the same way, but they are inconsistent in how they handle insert-tree. In every format, each insert-tree has an ExpressionStatement as tree and a Block as parent, which is correct. However, the ranges differ between formats, as shown in this table:

Format   A to B1                                 A to B2
TEXT     tree: [116, 144], parent: [111, 116]    tree: [126, 154], parent: [111, 116]
JSON     tree: [116, 144], parent: [110, 148]    tree: [126, 154], parent: [120, 158]
XML      tree: [116, 144], parent: [110, 148]    tree: [126, 154], parent: [120, 158]

It appears that TEXT is taking parent from A.java, while JSON and XML are taking parent from B1/2.java. You can verify this because TEXT's parent fields do not change between the two diffs, while JSON's and XML's increase by 10.
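The check described above can be sketched in a few lines of Python. The ranges are copied from the table in this issue; the helper names shift and side are hypothetical, and the inference rule (an unchanged range was measured in A.java, a uniformly shifted range was measured in B1/B2.java) is my assumption about what the numbers mean, not confirmed Gumtree behavior.

```python
# Ranges copied from the table in this issue: [start, end] offsets per format and diff.
RANGES = {
    "TEXT": {"B1": {"tree": (116, 144), "parent": (111, 116)},
             "B2": {"tree": (126, 154), "parent": (111, 116)}},
    "JSON": {"B1": {"tree": (116, 144), "parent": (110, 148)},
             "B2": {"tree": (126, 154), "parent": (120, 158)}},
    "XML":  {"B1": {"tree": (116, 144), "parent": (110, 148)},
             "B2": {"tree": (126, 154), "parent": (120, 158)}},
}


def shift(fmt, field):
    """Return (start delta, end delta) of a range between the A-to-B1 and A-to-B2 diffs."""
    b1 = RANGES[fmt]["B1"][field]
    b2 = RANGES[fmt]["B2"][field]
    return (b2[0] - b1[0], b2[1] - b1[1])


def side(fmt, field):
    """Guess which file a range refers to (assumption, see lead-in)."""
    delta = shift(fmt, field)
    if delta == (0, 0):
        return "source (A.java)"
    if delta[0] == delta[1]:
        return "destination (B1/B2.java)"
    return "unclear"


for fmt in RANGES:
    for field in ("tree", "parent"):
        print(fmt, field, shift(fmt, field), side(fmt, field))
```

Under that assumption, every tree range and the JSON/XML parent ranges move with the destination file, while only TEXT's parent range stays fixed.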

Which format is correct? Personally, I think TEXT makes the most sense, since the tree is being inserted at a point in the old tree.

Files
