Incorrect query matches on a condition including "typeid" #92

Open
dnns92 opened this issue Feb 10, 2021 · 2 comments


dnns92 commented Feb 10, 2021

Dear Community,

I am getting an incorrect query capture from the tree-sitter API on a rare if-statement edge case. I want to grab the condition inside an if-statement. In this particular case, the condition contains the keyword "typeid". This leads to an incorrect capture: the end of the if-statement is not found correctly, and the capture extends until the next if-statement finishes.

My Setup:

Win10
py3.8
tree-sitter-python with compiled tree-sitter-cpp language

I am using this query in order to grab the condition inside the if-statement:
DEFAULT_QUERY = "(if_statement(condition_clause)@if_statement)"

This works really well in every case I tried except one (admittedly rare) case I have encountered so far. Consider this valid C++:

bool including_namespace_identifier_and_primitive_type(int x, int y) {
     if(std::strcmp(typeid(unsigned int).name(), something_else)))
        return false;
     // this is a comment that will be included IN THE result of the query!
     if(the.catched.condition.will.include.this_statement)
        return false;
     doSomething();
     return true;
}

Using the query above and code[capture.start_byte : capture.end_byte], we get this single continuous capture:

(std::strcmp(typeid(unsigned int).name(), something_else)))
        return false;
     // this is a comment that will be included IN THE result of the query!
     if(the.catched.condition.will.include.this_statement)

This is one capture instead of two separate statements. I have not worked with grammar files so far, so I hope it is an easy fix for someone.
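Because the slice above relies on byte offsets, it only lines up when the source is kept as bytes. A minimal sketch of that slicing step, assuming UTF-8 input; the helper name `slice_capture` and the sample source are hypothetical and do not come from the issue, and the offsets are computed by hand as stand-ins for a node's start and end offsets:

```python
def slice_capture(source: bytes, start_byte: int, end_byte: int) -> str:
    """Return the text between two byte offsets of the original source."""
    return source[start_byte:end_byte].decode("utf8")


# Hypothetical input: the non-ASCII comment pushes byte offsets
# past the corresponding character offsets.
source = "// é\nif(x)\n    return false;\n".encode("utf8")
start = source.index(b"(x)")  # stand-in for a node's start offset
end = start + len(b"(x)")     # stand-in for the matching end offset

print(slice_capture(source, start, end))   # slice the bytes, then decode
print(source.decode("utf8")[start:end])    # slicing the decoded str drifts
```

Slicing the decoded string with the same offsets returns a shifted window as soon as the file contains a multi-byte character, so reading the file in binary mode is the safer default.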

Example where this kind of c++ is used: https://github.com/apple/turicreate/blob/master/src/external/boost/boost_1_68_0/libs/interprocess/test/segment_manager_test.cpp#L337

---------------------------------------------- Some more Notes ----------------------------

If I delete the typeid it works as intended, see:

bool including_namespace_identifier_and_primitive_type(int x, int y) {
     if(std::strcmp((unsigned int).name()))
        return false;
     // this is a comment that will be included IN THE result of the query!
     if(the.catched.condition.will.include.this_statement)
        return false;
     doSomething();
     return true;
}

This finds two separate statements:

(std::strcmp((unsigned int).name())) 
(the.catched.condition.will.include.this_statement)
@dnns92 dnns92 changed the title from 'Incorrect query matches on a condition including a namespace and a two types' to 'Incorrect query matches on a condition including "typeid"' Feb 10, 2021
@maxbrunsfeld
Contributor

Can you report the output of tree-sitter parse <the-file> for the code that you're matching? It sounds like there is just a syntax error happening, so that we have a tree that's different from expected.
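One quick way to probe the syntax-error hypothesis without the CLI is to count parentheses on the two `if` lines from the report. This is only a rough sketch: the helper name `paren_balance` is hypothetical, and it ignores parentheses inside string literals and comments.

```python
def paren_balance(line: str) -> int:
    """Return count of '(' minus count of ')'.

    0 means balanced; a negative value means extra closing parentheses.
    """
    return line.count("(") - line.count(")")


# The two condition lines as pasted in the issue text.
reported = "if(std::strcmp(typeid(unsigned int).name(), something_else)))"
without_typeid = "if(std::strcmp((unsigned int).name()))"

print(paren_balance(reported))
print(paren_balance(without_typeid))
```

For the lines as pasted, the counts come out as -1 and 0, which would suggest that the reproduction carries one extra closing parenthesis. The linked source file may differ from the pasted snippet, so this does not rule out a grammar problem with `typeid`.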

@dnns92
Author

dnns92 commented Feb 10, 2021

I only have the Python bindings installed.
I think I can provide you the corresponding info like this:

examples/weird_stuff.cpp

bool including_namespace_identifier_and_primitive_type(int x, int y) {
     if(std::strcmp(typeid(unsigned int).name(), something_else)))
        return false;
     // this is a comment that will be included IN THE result of the query!
     if(the.catched.condition.will.include.this_statement)
        return false;
     doSomething();
     return true;
}

python-script

from tree_sitter import Language, Parser

relative_path_to_parser = 'build/my-languages.so'
file = "examples/weird_stuff.cpp"
LANGUAGE = Language(relative_path_to_parser, "cpp")
query_statement = "(if_statement(condition_clause)@if_statement)"


#%% read file as bytes (the node offsets are byte offsets)
with open(file, "rb") as f:
    data = f.read()  # the with-block closes the file, no explicit close needed

#%% parse
parser = Parser()
parser.set_language(LANGUAGE)
tree = parser.parse(data)
query = LANGUAGE.query(query_statement)
# each finding is a (node, capture_name) pair
for finding in query.captures(tree.root_node):
    print(finding[0].sexp())

This returns:
(condition_clause initializer: (declaration type: (scoped_type_identifier namespace: (namespace_identifier) name: (type_identifier)) declarator: (parenthesized_declarator (ERROR (function_declarator declarator: (identifier) parameters: (parameter_list (parameter_declaration type: (sized_type_specifier type: (primitive_type))))) (function_declarator declarator: (identifier) parameters: (parameter_list))) (identifier)) (ERROR (false))) (comment) value: (call_expression function: (identifier) arguments: (argument_list (field_expression argument: (field_expression argument: (field_expression argument: (field_expression argument: (field_expression argument: (identifier) field: (field_identifier)) field: (field_identifier)) field: (field_identifier)) field: (field_identifier)) field: (field_identifier)))) (MISSING ")"))

Is this what you expected? I'm not very familiar with the API yet. Thank you very much in advance!
Cheers!
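The S-expression above can also be checked mechanically for error markers. A minimal sketch using only the standard library; the helper name `count_error_markers` is hypothetical, and the sample string is a shortened excerpt of the output above rather than the full tree:

```python
import re
from collections import Counter


def count_error_markers(sexp: str) -> Counter:
    """Count (ERROR ...) and (MISSING ...) nodes in an S-expression string."""
    return Counter(re.findall(r"\((ERROR|MISSING)\b", sexp))


# Shortened excerpt of the printed tree, used only as sample input.
sample = (
    '(condition_clause initializer: (declaration declarator: '
    '(parenthesized_declarator (ERROR (function_declarator)) (identifier)) '
    '(ERROR (false))) (comment) value: (call_expression) (MISSING ")"))'
)

markers = count_error_markers(sample)
print(markers["ERROR"], markers["MISSING"])
```

Any non-zero count means the parser recovered from a syntax error, so the captured `condition_clause` spans whatever the recovery swallowed instead of a single clean condition.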
