Macro prevents correct parsing of class #85

Open
theHamsta opened this issue Jun 27, 2020 · 15 comments

@theHamsta
Contributor

theHamsta commented Jun 27, 2020

An imported macro (I simplified the source file) prevents the following class from being parsed correctly.

#include <GLFW/glfw3.h>

PXR_NAMESPACE_USING_DIRECTIVE

class Scene
{

};

I know that in the presence of macros a parser has almost no chance of understanding C++, but I hope this failure case is useful for improving the parser.



translation_unit [3, 0] - [15, 0]
  preproc_include [3, 0] - [6, 0]
    path: system_lib_string [3, 9] - [3, 23]
  declaration [6, 0] - [11, 2]
    type: type_identifier [6, 0] - [6, 29]
    ERROR [8, 0] - [8, 5]
      identifier [8, 0] - [8, 5]
    declarator: init_declarator [8, 6] - [11, 1]
      declarator: identifier [8, 6] - [8, 11]
      value: initializer_list [9, 0] - [11, 1]
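
For context, a macro of this kind usually hides a complete statement, terminating semicolon included. Without preprocessor knowledge, the bare identifier on its own line therefore looks like the start of a declaration, which matches the `type: type_identifier` node in the tree above. The following is a minimal sketch of that situation. `DEMO_NAMESPACE_USING_DIRECTIVE` and `demo_ns` are hypothetical names chosen for illustration; the real definition behind `PXR_NAMESPACE_USING_DIRECTIVE` is not shown in this thread and may differ.

```cpp
// Hypothetical namespace standing in for a library namespace.
namespace demo_ns {
inline int answer() { return 42; }
}

// Hypothetical macro: it expands to a full using-directive,
// including the semicolon, so call sites carry no ';' of their own.
#define DEMO_NAMESPACE_USING_DIRECTIVE using namespace demo_ns;

// To the compiler this line is `using namespace demo_ns;`.
// To a parser that does not expand macros it is just an identifier
// followed by `class Scene { ... };`.
DEMO_NAMESPACE_USING_DIRECTIVE

class Scene
{
public:
    // Resolves to demo_ns::answer() thanks to the using-directive.
    int value() const { return answer(); }
};
```

The compiler accepts this because the preprocessor runs first; a grammar that works on the raw text has to guess.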

@Shatur

Shatur commented Jun 12, 2021

Similar behavior can be caused by the following code:

class EXPORT_API MyClass
{
    MyClass();
};

Here, EXPORT_API is a macro that exports class functions to DLLs on Windows and expands to nothing on Linux.
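
A minimal sketch of how such a macro is commonly defined, assuming the usual `_WIN32` check and `__declspec(dllexport)`. The real project's definition is not shown in this thread, and the class members below are added only so the example compiles on its own.

```cpp
// Assumed definition: export on Windows, nothing elsewhere.
#if defined(_WIN32)
#  define EXPORT_API __declspec(dllexport)
#else
#  define EXPORT_API
#endif

// After preprocessing on Linux this is plain `class MyClass`.
// A parser that sees the raw text gets two identifiers after `class`.
class EXPORT_API MyClass
{
public:
    MyClass() : value_(42) {}
    int value() const { return value_; }

private:
    int value_;
};
```

Because the macro sits between `class` and the class name, the raw token sequence is not a valid class declaration until the macro has been expanded.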

@ner0-m

ner0-m commented Aug 21, 2021

I'll add something as well:

#include "doctest/doctest.h" 
 
TEST_CASE_TEMPLATE("Some Test", T, float, double)
{
} 

Generates this:

translation_unit [0, 0] - [5, 0]
  preproc_include [0, 0] - [1, 0]
    path: string_literal [0, 9] - [0, 28]
  ERROR [2, 0] - [4, 1]
    identifier [2, 0] - [2, 18]
    string_literal [2, 19] - [2, 30]
    identifier [2, 32] - [2, 33]
    ERROR [2, 35] - [2, 49]
      primitive_type [2, 35] - [2, 40]
      primitive_type [2, 42] - [2, 48]
    initializer_list [3, 0] - [4, 1]

I'm not sure if that is in any way detectable. But I wanted to report it, as it breaks my syntax highlighting in some cases.

@MarcelRobitaille

I understand that it's impossible to detect macros in such a grammar since it's not hooked into the preprocessor. Would there be any way to let the end user manually define a list of strings that are always macros? My company only uses a handful, so the list would be easy to define, and it would be great to have them not break the rest of the file.
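
One way to picture the idea of a user-supplied macro list is a pre-pass that blanks the listed identifiers before the text reaches the parser. The sketch below is an illustration only: `blank_known_macros` is a hypothetical helper, not part of tree-sitter or tree-sitter-cpp. It overwrites each listed identifier with spaces so that byte offsets and line/column positions stay the same. It assumes object-like macros, so it would not help with function-like ones such as `TEST_CASE_TEMPLATE(...)`, and it does not skip string literals or comments.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper: replace every whole identifier found in `macros`
// with spaces of the same length, leaving all other text untouched.
std::string blank_known_macros(std::string source,
                               const std::unordered_set<std::string>& macros)
{
    auto is_ident_char = [](char c) {
        const unsigned char u = static_cast<unsigned char>(c);
        return std::isalnum(u) != 0 || u == '_';
    };
    auto is_digit = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    };

    std::size_t i = 0;
    while (i < source.size()) {
        if (!is_ident_char(source[i])) {
            ++i;
            continue;
        }
        // Consume a maximal run of identifier characters.
        const std::size_t start = i;
        while (i < source.size() && is_ident_char(source[i])) {
            ++i;
        }
        const std::size_t len = i - start;
        // Runs starting with a digit are numeric literals, not identifiers.
        if (!is_digit(source[start]) &&
            macros.count(source.substr(start, len)) != 0) {
            source.replace(start, len, len, ' ');
        }
    }
    return source;
}
```

With `EXPORT_API` in the set, `class EXPORT_API MyClass {};` becomes `class`, a run of spaces, then `MyClass {};`, with `MyClass` at the same offset as before. Identifiers that merely contain a listed name, such as `EXPORT_API_2`, are left alone.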

@aryx
Contributor

aryx commented Mar 6, 2024

@maxbrunsfeld would be nice indeed for C/C++ to provide a way to customize the parser to accept those macros.
I don't really know how to do that though with the way the parsers are written.

@MarcelRobitaille

Would it be possible to do this with an injection?

@deeedob

deeedob commented Apr 17, 2024

Hey, is there any update? This behavior is really annoying. I work in repos where a namespace macro is common, which breaks all the highlighting. It would already help if we could ignore those macros somehow. Is there any solution?

@MarcelRobitaille

@deeedob The best solution I found was to fork the project and add support for some hard-coded macros. Hopefully something better will come like #85 (comment)

@ImmanuelHaffner

Well, I ditched tree-sitter-based features for the most part and use LSP semantic tokens instead. It's probably slower, but also nicer. You can have different highlight groups for local variables vs. global variables vs. fields and such. All I had to do to make it work was use a color scheme with semantic token support.

I still have tree-sitter around for some other plugins, like lukas-reineke/indent-blankline.nvim, to get visual highlights for an entire scope or control-flow construct. Works OK so far.

@MarcelRobitaille

@ImmanuelHaffner What LSP do you use? I haven't found a good one for C++.

@aryx
Contributor

aryx commented Jun 6, 2024

@maxbrunsfeld @amaanq do you have any idea why the error recovery mechanism of tree-sitter does not skip those macros and at least parse the rest of the code correctly? I don't mind the macro itself not being parsed, but I do mind if the presence of a macro makes almost the whole file fail to parse.

@aryx
Contributor

aryx commented Jun 6, 2024

For example, on this simple input:

#include <foo.h>
FOOBAR(some_ident, "Some string)

#include <bar1.h>
#include <bar2.h>

namespace Foo
{
namespace Bar
{
  
Foo::Foo() {
  int x = 0 ;
  int y = 1;
  return x + y;
}
}
}

tree-sitter-cpp is not able to parse anything. It does not recover from the error.

@aryx
Contributor

aryx commented Jun 6, 2024

Weirdly, when I try it on the playground https://tree-sitter.github.io/tree-sitter/playground with this example, it actually recovers well from the error:
image

@amaanq which version of tree-sitter-cpp is running in the playground? The latest?

@aryx
Contributor

aryx commented Jun 6, 2024

Hmm, after a few tests, the issue in my example seems to be in ocaml-tree-sitter-semgrep, not tree-sitter itself. tree-sitter-c and tree-sitter-cpp, run with tree-sitter generate and tree-sitter parse, seem to recover correctly from such macros. Semgrep itself apparently does not.

@ImmanuelHaffner

ImmanuelHaffner commented Jun 24, 2024

@MarcelRobitaille simply clangd, with clangd_extensions

@zadirion

zadirion commented Jul 4, 2024

This is quite a pain for C++ when you have export macros in front of class declarations:

class EXPORT_API SomeClassName
{
};

For me, this ends up being interpreted by tree-sitter as a function instead of a class. This of course messes with tree-sitter-based text objects.
It would be good, as a user, to be able to manually specify what code a macro can expand to, including expanding to nothing.
I'd be perfectly fine with having a .treesitter file somewhere in the directory structure of my parsed file, similar to .clang-format files.
