Description
Updated description
First implemented in #43230 the compiler can now tokenize a few AST nodes when necessary losslessly from the original token stream. Currently, however, this is somewhat buggy:
- It only happens for items in certain situations (aka no inner attributes). The consequence of this is that more and more attributed items will have invalid span information when sent to procedural macros.
- We don't invalidate the cache when later changing the item. The consequence of this is that procedural macros will receive a buggy view of what the AST node actually represents.
The "real bug" here is that we haven't actually implemented converting an AST to a token tree. The initial implementation in #43230 was effectively just a heuristic, and not even a great one! As a result we still need the ability basically to losslessly tokenize an AST node back to its original set of tokens.
Some bugs that arise from this are:
- Usage of
#[cfg]
modifies the AST but doesn't invalidate the cache -- Macros 2.0: #[cfg_attr] makes .to_string() and TokenStream disagree #48644 - Modules with attributes disagree on what their internal tokens are -- Inconsistency between Display and IntoIterator for a TokenStream containing a module #47627
macro_rules!
and procedural macros don't play well together -- Inconsistent tokenization of procedural macros with macro_rules #49846Invoking a macro with brackets loses span information -- Procedural macros stringify inputs with macro invocations too aggressively #50840
Original Description
There's an associated FIXME in the code right now, and to fix this we'll need to implement tokenization of an AST node. Right now the thinking of how to implement this is to save all TokenStream
instances adjacent to an AST node, and use that instead of converting back into a token stream
cc @dtolnay, @nrc, @jseyfried