Improve error recovery for unclosed `use` tree lists. #4680

nelhage · 2020-06-01T00:46:31Z

Previously, an unclosed `use name::{…}` would result in the
entire rest of the file being parsed as part of the `use_tree_list`,
generating errors on virtually every token in the file. This spew of
errors is unhelpful, can slow down editors, and in the case of emacs
lsp-mode + flycheck (my environment), will cause the editor to stop
displaying errors entirely under the assumption that the checker must
be broken.

We solve this by bailing out of the `use_tree_list` as soon as we
encounter a token that isn't a `,` or `}` between or after items.

This has the downside that we now do a slightly worse job of
recovering from syntax errors _inside_ of a balanced `use_tree_list`,
for instance, in `use std::{thread, , mem}`. Previously, we would
report one syntax error and then correctly emit a `use_tree` for `mem`
and close the list; now we'll bail early and emit errors until the
semicolon, at which point we resync.

This represents a fundamental tradeoff (as best I can tell) in doing
error recovery without lookahead: when we encounter that first parse
error, we have to guess either that we should continue the
`use_tree_list` or that we should bail. I believe that bailing is
superior because it avoids the unbounded desync, and because it vastly
improves the common case of adding a new `use` clause from scratch
near the top of an existing file.
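
To make the tradeoff concrete, here is a minimal toy sketch of the two guesses. It is not the rust-analyzer parser: `Tok`, `Strategy`, `Outcome`, and `parse_list` are hypothetical names invented for illustration, and the input is the token stream that follows the opening `{`.

```rust
// Toy model only; none of these types exist in rust-analyzer.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Ident(&'static str),
    Comma,
    RBrace,
    Semi,
    Kw(&'static str),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    /// Old guess: stay inside the list after an error.
    Continue,
    /// New guess: leave the list at the first unexpected token.
    Bail,
}

#[derive(Debug, PartialEq)]
struct Outcome {
    items: Vec<&'static str>,
    errors: usize,
    /// Index of the first token left for the caller.
    next: usize,
}

/// Parses list items from the tokens that follow an opening `{`.
fn parse_list(toks: &[Tok], strategy: Strategy) -> Outcome {
    let mut out = Outcome { items: Vec::new(), errors: 0, next: 0 };
    while let Some(&tok) = toks.get(out.next) {
        match tok {
            Tok::RBrace => {
                out.next += 1;
                return out;
            }
            Tok::Ident(name) => {
                out.items.push(name);
                out.next += 1;
                // After an item, only `,` or `}` may follow.
                match toks.get(out.next) {
                    Some(Tok::Comma) => out.next += 1,
                    Some(Tok::RBrace) => {} // closed on the next iteration
                    _ => {
                        out.errors += 1;
                        if strategy == Strategy::Bail {
                            return out;
                        }
                    }
                }
            }
            _ => {
                // Not an item and not `}`.
                out.errors += 1;
                if strategy == Strategy::Bail {
                    return out;
                }
                out.next += 1; // Continue: consume the bad token, stay in the list
            }
        }
    }
    out
}
```

On an unclosed list followed by `struct S; struct T;`, the `Continue` strategy in this sketch runs to the end of the input and reports an error on most tokens, while `Bail` reports one error and hands control back right after the last item. On the balanced `thread, , mem}` input, `Continue` reports one error and still collects `mem`, whereas `Bail` stops at the stray comma and leaves `, mem }` for the caller to report.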

We could potentially improve the other case slightly by discarding
tokens until we find something in `ITEM_RECOVERY_SET`, which would
reduce the number of errors we spew, but that didn't seem to be a
pre-existing idiom in the code base.
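
As a rough illustration of that alternative, the sketch below skips tokens until it reaches one that can start a new item. Everything here is hypothetical: `RECOVERY_KINDS` is a stand-in list chosen for the example, not the contents of the real `ITEM_RECOVERY_SET`, and tokens are modelled as plain strings.

```rust
/// Hypothetical stand-in for a recovery set: token texts that can start a new item.
/// The real `ITEM_RECOVERY_SET` is not reproduced here.
const RECOVERY_KINDS: &[&str] = &["fn", "struct", "enum", "use", "mod"];

/// Skips tokens starting at `pos` until one is in the recovery set (or input ends).
/// Returns the position where normal item parsing could resume; in a real parser
/// the skipped span would be wrapped in a single error node.
fn skip_to_recovery(toks: &[&str], mut pos: usize) -> usize {
    while pos < toks.len() && !RECOVERY_KINDS.contains(&toks[pos]) {
        pos += 1;
    }
    pos
}
```

With the tokens of `thread, , mem}; fn main` and a bail at the stray comma (index 2), this resumes at `fn`, so the four skipped tokens could be reported as one error instead of four.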

nelhage

Added some inline notes on the diff. None of them seemed appropriate for comments in the code, but I'm happy to add them if you think they'd be useful there!

nelhage · 2020-06-01T00:47:31Z

crates/ra_parser/src/grammar/items/use_item.rs

 /// Note that this is called both by `use_item` and `use_tree_list`,
 /// so handles both `some::path::{inner::path}` and `inner::path` in
 /// `use some::path::{inner::path};`
-fn use_tree(p: &mut Parser, top_level: bool) {


This substantially reverts #1866. The new behavior also fixes the infinite loop and results in better error recovery on the provided test case.

nelhage · 2020-06-01T00:48:16Z

crates/ra_syntax/src/parsing/reparsing.rs

            "n next(",
            9,
        );
-        do_check(r"use a::b::{foo,<|>,bar<|>};", "baz", 10);


This test is no longer eligible for block-based reparsing because the new error-recovery behavior no longer results in a single parse tree covering the `{…}` block.

nelhage · 2020-06-01T00:49:05Z

crates/ra_syntax/test_data/parser/err/0036_partial_use.rast

-        ERROR@35..36
-          SEMICOLON@35..36 ";"
+    SEMICOLON@22..23 ";"
+  WHITESPACE@23..24 "\n"


This is an example of the new behavior being superior on a pre-existing test case; we emit many fewer spurious errors, and manage to recover in time to capture the use tree on line 2 properly.

nelhage · 2020-06-01T00:49:38Z

crates/ra_syntax/test_data/parser/err/0044_bad_use_tree_list.rs

@@ -0,0 +1,3 @@
+use std::{thread:, mem};


This test case demonstrates the case in which the new behavior is slightly worse: it emits more spurious errors and drops the `std::mem` use.

nelhage · 2020-06-01T00:50:05Z

crates/ra_syntax/test_data/parser/ok/0066_use_items_trailing_comma.rs

@@ -0,0 +1 @@
+use ra_syntax::{TextSize, TextRange,};


While testing this, I accidentally broke the behavior of a use tree with a trailing comma. The bug was caught in the IDE tests, but there was no pre-existing parser test, so I added one.

matklad · 2020-06-09T14:48:36Z

Sadly, I still don't have a lot of time to look into this, but let me quickly write down some notes, without looking into the code.

- We try to maintain an invariant that, for any parse tree, `{}` pairs form the right parse sequence. That means that `{` and the corresponding `}` are always the first and last child of the same parent.
- The reason we want this invariant is to enable a simple heuristic for incremental reparsing -- as long as the user typed anything which has `{}` balanced, it's always safe to reparse only the parent `{}` block.
- We however have trouble enforcing this invariant -- we have another pair of delimiters, which is even stronger than `{}` -- invisible parentheses (`L_DOLLAR` & `R_DOLLAR`) from macro expansion. We def hit bugs where dollars and `{}` conflicted with each other, but I don't remember the details now.
- I am actually not sure that, for incremental reparsing, the said invariant is required/sufficient.

So, to unblock progress here we need:

- to figure out what invariants we actually want
- document them
- add required asserts for checking them
- ideally, add fuzzy/property tests for them
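
As one possible starting point for such an assert, here is a toy check of the brace invariant described above. The `Node` type and `braces_well_nested` function are hypothetical and far simpler than the real syntax tree (no trivia, no error nodes, no `L_DOLLAR`/`R_DOLLAR`); the sketch only assumes that a node is either a token or a list of children.

```rust
/// Toy syntax tree, invented for this sketch.
enum Node {
    Token(&'static str),
    Tree(Vec<Node>),
}

/// Returns true if every `{` is the first child of its parent, every `}` is
/// the last child of its parent, and the two always appear together.
fn braces_well_nested(node: &Node) -> bool {
    match node {
        Node::Token(_) => true,
        Node::Tree(children) => {
            let is = |n: &Node, s: &str| matches!(n, Node::Token(t) if *t == s);
            let n = children.len();
            // `{` may only be the first child, `}` may only be the last child.
            let positions_ok = children.iter().enumerate().all(|(i, c)| {
                (!is(c, "{") || i == 0) && (!is(c, "}") || i + 1 == n)
            });
            // An opening brace needs its closing partner in the same parent.
            let paired = n == 0 || is(&children[0], "{") == is(&children[n - 1], "}");
            positions_ok && paired && children.iter().all(braces_well_nested)
        }
    }
}
```

A property test could generate token streams, parse them, and assert that this predicate holds on every resulting tree; whether this is the right invariant for incremental reparsing is exactly the open question raised above.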

matklad · 2021-08-30T11:25:25Z

Urgs, sorry for still not getting back to this. This is super-important, but sadly I don't seem able to carve out the time to clean up the parser. I'll close this PR, but I definitely won't forget about it, and will come back once I start fixing the parser :(

matklad mentioned this pull request Jun 23, 2020

Enforce Syntax Trees Invariantrs #5006

Open

matklad mentioned this pull request Oct 14, 2020

'the parser seems stuck' when expanding macros #6100

Closed

matklad mentioned this pull request Jan 10, 2021

Fixed expr meta var after path colons in mbe #7211

Merged

matklad mentioned this pull request Apr 6, 2021

Incorrect completion for clippy lints in #[allow()] #8369

Closed

matklad closed this Aug 30, 2021

lnicola mentioned this pull request Jan 2, 2024

better error recovery for USE_TREE_LIST parsing #16227

Closed

Veykril mentioned this pull request Jan 11, 2024

fix: add error recovery for use_tree_list parsing #16349

Merged

Veykril mentioned this pull request Jul 6, 2025

fix: Always bump in the parser in err_and_bump() #20180

Merged
