Description
Hi all,
I compiled the latest version of Gumtree from source and have been using it. Really great project! Unfortunately, I'm having some issues with the `textdiff` command: the output of the three formats is inconsistent, mainly in the ranges they report.
(Not that I think it matters, but I'm currently running Fedora 38.)
Steps to Reproduce
I have the following Java files, `A.java`, `B1.java`, and `B2.java`, which I have attached to this issue:
- `A.java` is the original file and contains a simple class with one field called `qwerty`.
- `B1.java` changes the field's name to `asdfg` and adds `System.out.println("hello")` to the constructor.
- `B2.java` is the same as `B1.java`, but with 10 extra lines added immediately before the curly brace opening the class.
I then ran the following commands (I've attached the output files to this issue as well):
./gumtree textdiff A.java B1.java -f TEXT > A-B1-text.txt
./gumtree textdiff A.java B1.java -f JSON > A-B1-json.txt
./gumtree textdiff A.java B1.java -f XML > A-B1-xml.txt
./gumtree textdiff A.java B2.java -f TEXT > A-B2-text.txt
./gumtree textdiff A.java B2.java -f JSON > A-B2-json.txt
./gumtree textdiff A.java B2.java -f XML > A-B2-xml.txt
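For anyone who wants to regenerate all six output files in one go, here is a minimal Python sketch that issues the same commands as above. The helper names `build_commands` and `run_all` are only illustrative, and the script assumes the `./gumtree` launcher and the three Java files sit in the current directory.

```python
import subprocess
from itertools import product

GUMTREE = "./gumtree"              # launcher path, as used in the commands above
TARGETS = ["B1", "B2"]             # modified files compared against A.java
FORMATS = ["TEXT", "JSON", "XML"]  # the three output formats under test


def build_commands():
    """Return (argv, output_file) pairs matching the six commands above."""
    commands = []
    for target, fmt in product(TARGETS, FORMATS):
        argv = [GUMTREE, "textdiff", "A.java", f"{target}.java", "-f", fmt]
        commands.append((argv, f"A-{target}-{fmt.lower()}.txt"))
    return commands


def run_all():
    """Run each command and redirect its stdout to the matching file."""
    for argv, out_path in build_commands():
        with open(out_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Calling `run_all()` from the directory containing the launcher should reproduce the attached files.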
Behavior
All three formats handled `update-node` the same way, but they are inconsistent in how they handle `insert-tree`. Each `insert-tree` has `tree` as an `ExpressionStatement` and `parent` as a `Block`, which is correct. However, the ranges differ between formats. Here's the inconsistency shown in a table:
Format | A to B1 | A to B2 |
---|---|---|
TEXT | tree: [116, 144], parent: [111, 116] | tree: [126, 154], parent: [111, 116] |
JSON | tree: [116, 144], parent: [110, 148] | tree: [126, 154], parent: [120, 158] |
XML | tree: [116, 144], parent: [110, 148] | tree: [126, 154], parent: [120, 158] |
It appears that TEXT is taking `parent` from `A.java`, while JSON and XML are taking `parent` from `B1.java`/`B2.java`. You can verify this because TEXT's `parent` ranges do not change between the two diffs, while JSON's and XML's increase by 10.
Which format is correct? Personally, I think TEXT makes the most sense, since an insertion is naturally described by the point in the old tree where the new subtree is added.