Description
While testing the annotateTokens() function (used by Token.cursor in the Python binding), I found that for some cursors, the (only) token that belongs to the cursor does not map back to the cursor itself.
For example, on the following code,
struct a {
    int b;
};

int func(struct a *ptr) {
    int r = ptr->b;
    return r;
}
I made a script that selects the DeclRefExpr that refers to ptr in the statement int r = ptr->b, and checks whether the cursor of the only token that belongs to that expression, ptr, maps back to the same cursor.
from clang.cindex import TranslationUnit, Cursor, CursorKind


def main():
    tu = TranslationUnit.from_source("./demo.c")
    root: Cursor = tu.cursor
    node = None
    for node in root.walk_preorder():
        if node.kind == CursorKind.DECL_REF_EXPR and node.spelling == "ptr":
            break
    token = None
    for token in node.get_tokens():
        break
    print(token.cursor == node)
    print(token.cursor._kind_id, node._kind_id)
    print(token.cursor.xdata, node.xdata)
    print(*token.cursor.data)
    print(*node.data)


if __name__ == '__main__':
    main()
The result of the above script is:
False
101 101
0 0
140162768666120 140162768666224 140162768050240
None 140162768666224 140162768050240
The cursors node and token.cursor should be the same, and they do share the same spelling and extent. However, libclang considers them different cursors.
Cursor equality is determined by clang_equalCursors(), and the only difference between these two cursors is data[0]: it is None for node but set for token.cursor.
llvm-project/clang/tools/libclang/CIndex.cpp
Lines 6289 to 6303 in 1c1eaf7
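To make the mismatch explicit, here is a minimal sketch that compares the printed fields one by one. It assumes, as the output above suggests, that two cursors are equal only when kind, xdata, and all three data entries match. CursorFields and differing_fields are hypothetical names used for illustration; they are not part of libclang or clang.cindex.

```python
from typing import List, NamedTuple, Optional, Tuple


class CursorFields(NamedTuple):
    """Hypothetical stand-in for the raw fields printed by the script."""
    kind: int
    xdata: int
    data: Tuple[Optional[int], Optional[int], Optional[int]]


def differing_fields(a: CursorFields, b: CursorFields) -> List[str]:
    """Return the names of the fields whose values differ between a and b."""
    diffs = []
    if a.kind != b.kind:
        diffs.append("kind")
    if a.xdata != b.xdata:
        diffs.append("xdata")
    for i, (x, y) in enumerate(zip(a.data, b.data)):
        if x != y:
            diffs.append(f"data[{i}]")
    return diffs


# Values copied from the script output above.
token_cursor = CursorFields(
    101, 0, (140162768666120, 140162768666224, 140162768050240)
)
node = CursorFields(101, 0, (None, 140162768666224, 140162768050240))

print(differing_fields(token_cursor, node))  # ['data[0]']
```

With the values from this report, data[0] is the only field that differs, which is enough to make a field-wise comparison fail.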
I suspect that DeclRefExpr cursors are created in MakeCXCursor(), and that data[0] probably holds the parent cursor.
llvm-project/clang/tools/libclang/CXCursor.cpp
Lines 570 to 583 in 1c1eaf7
llvm-project/clang/tools/libclang/CXCursor.cpp
Lines 876 to 878 in 1c1eaf7
There might be an issue where the data[0] (parent) field is not being set properly. Alternatively, should clang_equalCursors() ignore data[0] when comparing statements?
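For illustration only, the second option could look like the following comparison on the Python side. This is a sketch of the idea, not a proposed patch to libclang: equal_ignoring_data0 is a hypothetical helper, it reads the same _kind_id, xdata, and data attributes that the script above prints, and it is only meant for expression or statement cursors such as the DeclRefExpr in this report. The SimpleNamespace objects stand in for real cursors and carry the values from the output above.

```python
from types import SimpleNamespace


def equal_ignoring_data0(a, b) -> bool:
    """Compare two cursor-like objects on kind, xdata, data[1] and data[2].

    data[0] is deliberately skipped, on the assumption that it only
    records the parent and does not identify the expression itself.
    """
    return (
        a._kind_id == b._kind_id
        and a.xdata == b.xdata
        and a.data[1] == b.data[1]
        and a.data[2] == b.data[2]
    )


# Stand-ins built from the script output above.
token_cursor = SimpleNamespace(
    _kind_id=101,
    xdata=0,
    data=(140162768666120, 140162768666224, 140162768050240),
)
node = SimpleNamespace(
    _kind_id=101,
    xdata=0,
    data=(None, 140162768666224, 140162768050240),
)

print(equal_ignoring_data0(token_cursor, node))  # True
```

Whether skipping data[0] is safe in general depends on what libclang stores there for each cursor kind, so this should be read as a way to frame the question rather than as a fix.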