Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird parsing results with nested template argument list #192

Open
ekilmer opened this issue Feb 24, 2023 · 4 comments
Open

Weird parsing results with nested template argument list #192

ekilmer opened this issue Feb 24, 2023 · 4 comments

Comments

@ekilmer
Copy link

ekilmer commented Feb 24, 2023

This example gives weird/inconsistent results leading me to believe there's an issue with parsing nested templates:

map<string, string> m;
map<string, vector<string>> m2;
map<string, vector<vector<string>>> v;

I expect to see all types highlighted, but the last line fails to highlight string in the first index to map; instead parsing it as an identifier when I expect type_identifier:

Screenshot 2023-02-24 at 9 42 46 AM

@jdrouhard
Copy link
Collaborator

jdrouhard commented Mar 2, 2023

@maxbrunsfeld - if you have a bit of time to spare to look at this, I'd really appreciate it. This is completely baffling to me:

map<string, vector<vector<string>>> v;

produces:

(translation_unit [0, 0] - [1, 0]
  (declaration [0, 0] - [0, 38]
    type: (template_type [0, 0] - [0, 35]
      name: (type_identifier [0, 0] - [0, 3])
      arguments: (template_argument_list [0, 3] - [0, 35]
        (identifier [0, 4] - [0, 10]) <-- this should be (type_descriptor (type_identifier))
        (type_descriptor [0, 12] - [0, 34]
          type: (template_type [0, 12] - [0, 34]
            name: (type_identifier [0, 12] - [0, 18])
            arguments: (template_argument_list [0, 18] - [0, 34]
              (type_descriptor [0, 19] - [0, 33]
                type: (template_type [0, 19] - [0, 33]
                  name: (type_identifier [0, 19] - [0, 25])
                  arguments: (template_argument_list [0, 25] - [0, 33]
                    (type_descriptor [0, 26] - [0, 32]
                      type: (type_identifier [0, 26] - [0, 32]))))))))))
    declarator: (identifier [0, 36] - [0, 37])))

But, putting the same exact code as a field declaration makes it work:

struct Foo {
    map<string, vector<vector<string>>> v;
};
(translation_unit [0, 0] - [3, 0]
  (struct_specifier [0, 0] - [2, 1]
    name: (type_identifier [0, 7] - [0, 10])
    body: (field_declaration_list [0, 11] - [2, 1]
      (field_declaration [1, 4] - [1, 42]
        type: (template_type [1, 4] - [1, 39]
          name: (type_identifier [1, 4] - [1, 7])
          arguments: (template_argument_list [1, 7] - [1, 39]
            (type_descriptor [1, 8] - [1, 14]  <-- now it's correct
              type: (type_identifier [1, 8] - [1, 14]))
            (type_descriptor [1, 16] - [1, 38]
              type: (template_type [1, 16] - [1, 38]
                name: (type_identifier [1, 16] - [1, 22])
                arguments: (template_argument_list [1, 22] - [1, 38]
                  (type_descriptor [1, 23] - [1, 37]
                    type: (template_type [1, 23] - [1, 37]
                      name: (type_identifier [1, 23] - [1, 29])
                      arguments: (template_argument_list [1, 29] - [1, 37]
                        (type_descriptor [1, 30] - [1, 36]
                          type: (type_identifier [1, 30] - [1, 36]))))))))))
        declarator: (field_identifier [1, 40] - [1, 41])))))

The only difference is the grandparent node of the template_argument_list (field_declaration vs declaration). Putting the code inside a function scope where it's still a normal declaration still produces the incorrect identifier.

Here's the template_type and template_argument_list rules:

template_type: $ => seq(
  field('name', $._type_identifier),
  field('arguments', $.template_argument_list)
),

template_argument_list: $ => seq(
  '<',
  commaSep(choice(
    prec.dynamic(3, $.type_descriptor),
    prec.dynamic(2, alias($.type_parameter_pack_expansion, $.parameter_pack_expansion)),
    prec.dynamic(1, $._expression)
  )),
  alias(token(prec(1, '>')), '>')
),

When the template_type is in a declaration, it's choosing $._expression (which ends up as identifier) instead of type_descriptor, which also matches and should have higher precedence for the conflict. I produced the graphviz dot debug output for both, but was having a hard time figuring out why the two cases are reducing differently so this is a bit above my ability.

@amaanq
Copy link
Member

amaanq commented Jul 25, 2023

setting _expression (well now _expression_not_binary as that's the one w/ conflicts) to have a dynamic precedence of -1 fixes this, but causes issues elsewhere. maybe removing identifiers from the expressions in template argument lists is an option

@lawmurray
Copy link

lawmurray commented Oct 27, 2023

I have a similar example, although it produces a hard error, not just a misclassified node:

optional<tuple<A<W>,B<X>,C<Y>,D<Z>>> f;

This produces:

(translation_unit [0, 0] - [2, 0]
  (expression_statement [0, 0] - [0, 39]
    (parameter_pack_expansion [0, 0] - [0, 38]
      pattern: (binary_expression [0, 0] - [0, 38]
        left: (template_function [0, 0] - [0, 35]
          name: (identifier [0, 0] - [0, 8])
          arguments: (template_argument_list [0, 8] - [0, 35]
            (template_function [0, 9] - [0, 24]
              name: (identifier [0, 9] - [0, 14])
              arguments: (template_argument_list [0, 14] - [0, 24]
                (type_descriptor [0, 15] - [0, 19]
                  type: (template_type [0, 15] - [0, 19]
                    name: (type_identifier [0, 15] - [0, 16])
                    arguments: (template_argument_list [0, 16] - [0, 19]
                      (type_descriptor [0, 17] - [0, 18]
                        type: (type_identifier [0, 17] - [0, 18])))))
                (binary_expression [0, 20] - [0, 23]
                  left: (identifier [0, 20] - [0, 21])
                  right: (identifier [0, 22] - [0, 23]))))
            (type_descriptor [0, 25] - [0, 29]
              type: (template_type [0, 25] - [0, 29]
                name: (type_identifier [0, 25] - [0, 26])
                arguments: (template_argument_list [0, 26] - [0, 29]
                  (type_descriptor [0, 27] - [0, 28]
                    type: (type_identifier [0, 27] - [0, 28])))))
            (template_function [0, 30] - [0, 34]
              name: (identifier [0, 30] - [0, 31])
              arguments: (template_argument_list [0, 31] - [0, 34]
                (type_descriptor [0, 32] - [0, 33]
                  type: (type_identifier [0, 32] - [0, 33]))))))
        right: (identifier [0, 37] - [0, 38])))))
test.hpp	0 ms	(MISSING "..." [0, 38] - [0, 38])

It appears to give precedence to a parameter_pack_expansion, then cannot find the ... at the end. The level of nesting (optional<tuple<...>> not just tuple<...>) and number of template arguments (optional<tuple<A<W>,B<X>,C<Y>,D<Z>>> not just optional<tuple<A<W>,B<X>,C<Y>>> is necessary to trigger the error.

The change that @amaanq mentions above (to set the dynamic precedence of _expression to -1 in the part of the code that @jdrouhard posted) removes the hard error, although now we're back to a misclassified node:

(translation_unit [0, 0] - [2, 0]
  (declaration [0, 0] - [0, 39]
    type: (template_type [0, 0] - [0, 36]
      name: (type_identifier [0, 0] - [0, 8])
      arguments: (template_argument_list [0, 8] - [0, 36]
        (type_descriptor [0, 9] - [0, 35]
          type: (template_type [0, 9] - [0, 35]
            name: (type_identifier [0, 9] - [0, 14])
            arguments: (template_argument_list [0, 14] - [0, 35]
              (template_function [0, 15] - [0, 19]  <<<<<<<<********* template_function, should be type_descriptor?
                name: (identifier [0, 15] - [0, 16])
                arguments: (template_argument_list [0, 16] - [0, 19]
                  (type_descriptor [0, 17] - [0, 18]
                    type: (type_identifier [0, 17] - [0, 18]))))
              (type_descriptor [0, 20] - [0, 24]
                type: (template_type [0, 20] - [0, 24]
                  name: (type_identifier [0, 20] - [0, 21])
                  arguments: (template_argument_list [0, 21] - [0, 24]
                    (type_descriptor [0, 22] - [0, 23]
                      type: (type_identifier [0, 22] - [0, 23])))))
              (type_descriptor [0, 25] - [0, 29]
                type: (template_type [0, 25] - [0, 29]
                  name: (type_identifier [0, 25] - [0, 26])
                  arguments: (template_argument_list [0, 26] - [0, 29]
                    (type_descriptor [0, 27] - [0, 28]
                      type: (type_identifier [0, 27] - [0, 28])))))
              (type_descriptor [0, 30] - [0, 34]
                type: (template_type [0, 30] - [0, 34]
                  name: (type_identifier [0, 30] - [0, 31])
                  arguments: (template_argument_list [0, 31] - [0, 34]
                    (type_descriptor [0, 32] - [0, 33]
                      type: (type_identifier [0, 32] - [0, 33]))))))))))
    declarator: (identifier [0, 37] - [0, 38])))

Well, this is not an incorrect parse, that could indeed be a template_function, but it seems inconsistent to prefer template_function for the first template argument, but type_descriptor for the others.

To have a shot in the dark, I wonder whether line 155 has something to do with the precedences resolving this way, but have not found an alternative that keeps other tests working:

    type_descriptor: (_, original) => prec.right(original),

@lawmurray
Copy link

A further update: setting the precedence of _expression to 0 rather than -1 produces the same output above (i.e. one of the arguments is template_function), but now tree-sitter test passes at least.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants