Arbitrary expressions on the LHS of assignments #220

Open
elbeno opened this issue Aug 6, 2023 · 2 comments

elbeno commented Aug 6, 2023

The grammar currently seems limited in how it handles assignment expressions, in particular in what can appear on the LHS. Arbitrary expressions on the LHS of assignments are a common technique in building DSLs, e.g. Boost.SML.
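To illustrate the technique, here is a minimal sketch of a transition-table style DSL in which the LHS of an assignment is a binary expression. All names (`State`, `Event`, `PartialTransition`, `Transition`, `demo_transition`) are hypothetical and are not taken from Boost.SML or any other library.

```cpp
#include <cassert>
#include <string>

// Hypothetical mini-DSL types, for illustration only.
struct State { std::string name; };
struct Event { std::string name; };

struct Transition {
    std::string src, event, dst;
};

// Result of `state + event`: a temporary whose operator= completes the transition.
struct PartialTransition {
    std::string src, event;

    // Not a copy assignment: it takes a State and returns a new Transition.
    Transition operator=(const State& dst) const {
        return Transition{src, event, dst.name};
    }
};

inline PartialTransition operator+(const State& s, const Event& e) {
    return PartialTransition{s.name, e.name};
}

inline Transition demo_transition() {
    const State idle{"idle"};
    const State running{"running"};
    const Event start{"start"};

    // C++ groups this as (idle + start) = running,
    // so the LHS of `=` is a binary expression, not an identifier.
    Transition t = idle + start = running;
    assert(t.src == "idle" && t.dst == "running");
    return t;
}
```

Calling `demo_transition()` from `main` yields a `Transition` whose fields come from both sides of the `=`, which is why a parser that only accepts a restricted LHS cannot represent this style of code.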

To take a simple example, a UDL on the LHS:

void f() { "hello"_s = 42; }

(The UDL is effectively a function call which might return e.g. a reference to a variable.) tree-sitter parse says:

(translation_unit [0, 0] - [1, 0]
  (function_definition [0, 0] - [0, 28]
    type: (primitive_type [0, 0] - [0, 4])
    declarator: (function_declarator [0, 5] - [0, 8]
      declarator: (identifier [0, 5] - [0, 6])
      parameters: (parameter_list [0, 6] - [0, 8]))
    body: (compound_statement [0, 9] - [0, 28]
      (ERROR [0, 11] - [0, 22]
        (user_defined_literal [0, 11] - [0, 20]
          (string_literal [0, 11] - [0, 18]
            (string_content [0, 12] - [0, 17]))
          (literal_suffix [0, 18] - [0, 20])))
      (expression_statement [0, 23] - [0, 26]
        (number_literal [0, 23] - [0, 25])))))
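For context, a sketch of how such a statement can be valid C++: the literal operator below returns an lvalue reference, so the UDL expression is assignable. The `slots()` registry and `demo_udl_assignment` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical registry of named integer slots (an assumption for this sketch).
inline std::map<std::string, int>& slots() {
    static std::map<std::string, int> table;
    return table;
}

// The literal operator returns int&, so "name"_s is an lvalue.
inline int& operator""_s(const char* str, std::size_t len) {
    return slots()[std::string(str, len)];
}

inline void demo_udl_assignment() {
    "hello"_s = 42;   // same shape as the statement in f() above
    assert("hello"_s == 42);

    "hello"_s += 1;   // compound assignment works on the same lvalue
    assert(slots().at("hello") == 43);
}
```

A compiler treats `"hello"_s` like any other function call returning a reference, so nothing about the assignment itself is unusual.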

Another example:

void f() { x + y = 5; }

(If this looks odd, consider a DSL with overloaded operators.) This produces a parse tree without error:

(translation_unit [0, 0] - [1, 0]
  (function_definition [0, 0] - [0, 23]
    type: (primitive_type [0, 0] - [0, 4])
    declarator: (function_declarator [0, 5] - [0, 8]
      declarator: (identifier [0, 5] - [0, 6])
      parameters: (parameter_list [0, 6] - [0, 8]))
    body: (compound_statement [0, 9] - [0, 23]
      (expression_statement [0, 11] - [0, 21]
        (binary_expression [0, 11] - [0, 20]
          left: (identifier [0, 11] - [0, 12])
          right: (assignment_expression [0, 15] - [0, 20]
            left: (identifier [0, 15] - [0, 16])
            right: (number_literal [0, 19] - [0, 20])))))))

But the tree is incorrect: operator= binds more tightly than operator+, as if the code read x + (y = 5). Ironically, that explicitly parenthesized expression produces an error:

void f() { x + (y = 5); }

(translation_unit [0, 0] - [1, 0]
  (function_definition [0, 0] - [0, 25]
    type: (primitive_type [0, 0] - [0, 4])
    declarator: (function_declarator [0, 5] - [0, 8]
      declarator: (identifier [0, 5] - [0, 6])
      parameters: (parameter_list [0, 6] - [0, 8]))
    body: (compound_statement [0, 9] - [0, 25]
      (expression_statement [0, 11] - [0, 23]
        (binary_expression [0, 11] - [0, 22]
          left: (identifier [0, 11] - [0, 12])
          right: (parenthesized_expression [0, 15] - [0, 22]
            (ERROR [0, 16] - [0, 19]
              (identifier [0, 16] - [0, 17]))
            (number_literal [0, 20] - [0, 21])))))))
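For comparison, the grouping a C++ compiler applies can be checked with a small sketch. The `Term` type below is hypothetical: its overloaded operators only record how the expression was grouped, so the two spellings can be compared as strings.

```cpp
#include <cassert>
#include <string>

// Hypothetical type whose operators record the grouping of an expression.
struct Term {
    std::string text;

    // Not a copy assignment: takes an int and returns a Term describing the grouping.
    Term operator=(int value) const {
        return Term{"(" + text + " = " + std::to_string(value) + ")"};
    }
};

inline Term operator+(const Term& a, const Term& b) {
    return Term{"(" + a.text + " + " + b.text + ")"};
}

inline std::string grouping_default() {
    const Term x{"x"};
    const Term y{"y"};
    Term result = x + y = 5;      // no parentheses, as in the issue
    return result.text;
}

inline std::string grouping_parenthesized() {
    const Term x{"x"};
    const Term y{"y"};
    Term result = x + (y = 5);    // explicit grouping
    return result.text;
}

inline void check_grouping() {
    // operator+ binds more tightly than operator=.
    assert(grouping_default() == "((x + y) = 5)");
    assert(grouping_parenthesized() == "(x + (y = 5))");
}
```

Under this sketch the unparenthesized form corresponds to an assignment_expression whose left operand is a binary_expression, which is the reverse of the nesting shown in the tree-sitter output.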

I'm not sure what should be done about this in general: perhaps some surgery on the LHS of assignment? I see that parenthesized expressions are allowed there by the current grammar, so this parses:

void f() { ("hello"_s) = 42; }

(translation_unit [0, 0] - [1, 0]
  (function_definition [0, 0] - [0, 30]
    type: (primitive_type [0, 0] - [0, 4])
    declarator: (function_declarator [0, 5] - [0, 8]
      declarator: (identifier [0, 5] - [0, 6])
      parameters: (parameter_list [0, 6] - [0, 8]))
    body: (compound_statement [0, 9] - [0, 30]
      (expression_statement [0, 11] - [0, 28]
        (assignment_expression [0, 11] - [0, 27]
          left: (parenthesized_expression [0, 11] - [0, 22]
            (user_defined_literal [0, 12] - [0, 21]
              (string_literal [0, 12] - [0, 19]
                (string_content [0, 13] - [0, 18]))
              (literal_suffix [0, 19] - [0, 21])))
          right: (number_literal [0, 25] - [0, 27]))))))

amaanq commented Aug 11, 2023

The first case should be fixed, I agree. As for binary expressions, that's just asking for a mess of conflicts. So, would just fixing the first case be good enough?


elbeno commented Aug 11, 2023

Long term, IMO no. But it's a start.

amaanq mentioned this issue Aug 12, 2023