Description
Over in:
@compiler-errors describes this general problem:
The breakage specifically represents an inherent limitation to the "macro follow-set" formulation which is supposed to make us more resilient against breakages due to extensions to the grammar like this.
Given two macro matcher arms:
($ty:ty) => ...
($($tt:tt)*) => ...
And given tokens like:
&pin mut [...more tokens may follow...]
On nightly today, `&pin` gets parsed as a type. However, we run out of matchers but still have tokens left (the `mut` token is next), so we fall through to the next arm. Since it's written like `($tt:tt)*`, everything is allowed, and we match the second arm successfully...
I think that's weird, because if this second arm were written like `$ty:ty mut`, that would be illegal, since `mut` is not in the follow-set of the `:ty` matcher. Thus, we can use `:tt` matchers to observe whether the compiler actually parses things not in our grammar that should otherwise be protected against, which seems pretty gross.
And @Noratrieb proposes a general solution:
I believe a solution to this would be the following new logic:
- after the end of a macro matcher arm has been reached
- and there are still input tokens remaining
- and if the last part of the matcher is a metavar
- ensure that the first remaining token is in the follow set of this metavar
  - if it is, move on to the next arm
  - if it is not, emit an error
What this semantically does is strengthen the "commit to fully matching metavars or error" behavior such that it extends past the end. I don't know how many macros rely on this, but it seems like emitting an FCW (instead of error) on such macro invocations would find all these cases and ensure that the follow-set logic is actually robust past the end. But imo this shouldn't block this PR (which should probably just ship as-is) and can be done separately.
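The proposed end-of-arm check can be modelled in a few lines. This is only a sketch under stated assumptions, not rustc's implementation: `Outcome`, `Fragment`, `in_follow_set`, and `check_end_of_arm` are hypothetical names, tokens are represented as plain strings, and the follow-set for `ty` is the list given in the Rust Reference with the `block` fragment case left out.

```rust
/// Result of the hypothetical check that runs once a matcher arm is exhausted.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// No input tokens remain: the arm matched.
    Matched,
    /// Tokens remain and the next one is allowed to follow: try the next arm.
    TryNextArm,
    /// Tokens remain and the next one is outside the follow-set.
    /// (The proposal would start with an FCW rather than a hard error.)
    Error,
}

/// Fragment kinds modelled here (assumption: only two, for illustration).
#[derive(Clone, Copy)]
enum Fragment {
    Ty,
    Tt,
}

/// Simplified follow-set lookup; `tt` has no follow restrictions.
fn in_follow_set(frag: Fragment, token: &str) -> bool {
    match frag {
        Fragment::Ty => matches!(
            token,
            "=>" | "," | "=" | "|" | ";" | ":" | ">" | ">>" | "[" | "{" | "as" | "where"
        ),
        Fragment::Tt => true,
    }
}

/// `last_metavar` is `Some` only if the matcher ends in a metavariable.
fn check_end_of_arm(last_metavar: Option<Fragment>, remaining: &[&str]) -> Outcome {
    match (remaining.first(), last_metavar) {
        (None, _) => Outcome::Matched,
        (Some(tok), Some(frag)) if !in_follow_set(frag, tok) => Outcome::Error,
        _ => Outcome::TryNextArm,
    }
}
```

Under this model, `check_end_of_arm(Some(Fragment::Ty), &["mut"])` yields `Outcome::Error`, which is the `&pin mut` situation, while a leftover `,` after a `ty` metavar, or any leftover token after a `tt` metavar, still falls through to the next arm.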
This issue is to track the proposal for this FCW.
cc @Noratrieb @compiler-errors @eholk @rust-lang/lang