Consider whether repeated macro matchers need separator disambiguation for futureproofing #57603

Open
@alercah

The following macro is valid, and can be used:

macro_rules! ex {
    ($($i:expr),* , $($j:ty),*) => { };
}

fn main() {
    ex!(a, dyn Copy);
}

However, if it's invoked in an ambiguous way, such as with ex!(a, b), then it gives an error (no lookahead is performed to determine an unambiguous parse; examples can be constructed in which lookahead would not resolve the ambiguity anyway).

Thus, while the original example compiles now, were dyn Copy to become a legal expression at some point in the future, it would become ambiguous. The issue here is that the separator , is re-used. In line with other futureproofing efforts like rust-lang/rfcs#550 and #56575, this should possibly be forbidden. The follow-set rule is insufficient to detect these cases as it only exists to allow us to promise in advance where a match is guaranteed to end; it's not sufficient to disambiguate between two potential parses starting from the same point.
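For contrast, here is a minimal sketch of a matcher that does not re-use the separator. The macro name ex_semi, the choice of ; as the dividing token, and the body that returns the stringified fragments are all assumptions made for illustration; nothing here is proposed by this issue. Because ; can begin neither an expression nor a type, the parser never has to choose between continuing the first repetition and starting the second.

```rust
// Hypothetical variant of `ex!`: the two repetitions are divided by `;`
// instead of re-using `,`. `;` is in the follow set of `expr`, so the
// matcher is accepted by the existing follow-set rule.
macro_rules! ex_semi {
    ($($i:expr),* ; $($j:ty),*) => {{
        // Return the matched fragments as strings so the split is observable.
        let exprs: Vec<&'static str> = vec![$(stringify!($i)),*];
        let types: Vec<&'static str> = vec![$(stringify!($j)),*];
        (exprs, types)
    }};
}

/// Invokes the sketch with two expressions and two types.
fn split_example() -> (Vec<&'static str>, Vec<&'static str>) {
    // `b` could be an expression or a type, but the `;` decides which list
    // it belongs to, so this invocation is not ambiguous.
    ex_semi!(a, b; u8, u16)
}
```

With this shape, an invocation like ex_semi!(a, b; u8, u16) stays unambiguous even if more tokens become legal at the start of an expression later.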

My first attempt at rules to forbid this covers the situation where we have ... $(tt ...) SEP? OP uu ..., with uu ... possibly empty. In these rules, FIRST* is a variant on FIRST from RFC 550: it represents all terminal tokens that could begin a valid match. Thus, FIRST*($t:ty) would be any can_begin_type token, plus any token we worry might be legally allowed to begin a type someday, and so on.

  1. If SEP is present, we must have SEP ∉ FIRST*(uu ...), insisting that the separator token in the repetition will never be ambiguous.
  2. When OP is not +, we must have FIRST*(tt ...) and FIRST*(uu ...) be disjoint, insisting that the question of whether we're repeating at all will not be ambiguous.
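
The two rules above can be sketched as a small check over token sets. This is only an illustration: the Tok, Op, Repetition and Violation types, the check function, and the contents of the example FIRST* sets are hypothetical and far smaller than anything the compiler would use. The FIRST* sets are supplied by hand rather than computed.

```rust
use std::collections::HashSet;

/// Toy token kinds (an assumption; a real check would use compiler tokens).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Tok { Comma, Semi, Ident, Literal, Dyn }

/// Kleene operator of the repetition.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op { Star, Plus, Question }

/// Hypothetical summary of `$(tt ...) SEP? OP uu ...`.
struct Repetition {
    first_tt: HashSet<Tok>, // FIRST*(tt ...)
    sep: Option<Tok>,       // SEP, if present
    op: Op,                 // OP
    first_uu: HashSet<Tok>, // FIRST*(uu ...)
}

#[derive(Debug, PartialEq, Eq)]
enum Violation { SeparatorReused, RepeatAmbiguous }

fn check(rep: &Repetition) -> Vec<Violation> {
    let mut out = Vec::new();
    // Rule 1: the separator must not be able to begin what follows.
    if let Some(sep) = rep.sep {
        if rep.first_uu.contains(&sep) {
            out.push(Violation::SeparatorReused);
        }
    }
    // Rule 2: unless OP is `+`, the two FIRST* sets must be disjoint.
    if rep.op != Op::Plus && !rep.first_tt.is_disjoint(&rep.first_uu) {
        out.push(Violation::RepeatAmbiguous);
    }
    out
}

fn set(toks: &[Tok]) -> HashSet<Tok> {
    toks.iter().copied().collect()
}

/// `$($i:expr),* , $($j:ty),*`: what follows the repetition begins with `,`.
fn original_ex() -> Repetition {
    Repetition {
        // `Dyn` stands in for a token we worry may begin an expression someday.
        first_tt: set(&[Tok::Ident, Tok::Literal, Tok::Dyn]),
        sep: Some(Tok::Comma),
        op: Op::Star,
        first_uu: set(&[Tok::Comma]),
    }
}

/// Same matcher with a hypothetical `;` divider instead of the second `,`.
fn semi_variant() -> Repetition {
    Repetition { first_uu: set(&[Tok::Semi]), ..original_ex() }
}
```

Under these toy sets, check(&original_ex()) reports only the separator violation from rule 1, and check(&semi_variant()) reports nothing.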

For unseparated Kleene repeats, the second rule above, combined with the follow-set rule, is sufficient to disambiguate.

I have no idea how frequently these potential ambiguities arise in practice. It might be a lot, it might be a little. I expect that the first rule, for separators, might be safe to add because most macros that would risk running into them cannot be invoked except with an empty repetition due to existing ambiguity (the first macro can be invoked as ex!(,a), for instance, with an empty repetition for $i eliminating the ambiguity) and therefore could likely mostly be removed. The second rule, however, might be more commonly violated.
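
As a sketch of the empty-repetition case mentioned above, the following uses the same matcher as the first macro but gives it a body that returns the stringified fragments, so the split can be observed. The name ex_listed and the body are assumptions added for illustration.

```rust
// Same matcher as `ex!` from the description; only the body differs.
macro_rules! ex_listed {
    ($($i:expr),* , $($j:ty),*) => {{
        let exprs: Vec<&'static str> = vec![$(stringify!($i)),*];
        let types: Vec<&'static str> = vec![$(stringify!($j)),*];
        (exprs, types)
    }};
}

/// A leading `,` cannot begin an expression, so the `$i` repetition is
/// empty and everything after the comma is matched as types.
fn empty_first_repetition() -> (Vec<&'static str>, Vec<&'static str>) {
    ex_listed!(, a)
}

// By contrast, `ex_listed!(a, b)` is rejected as ambiguous, as described
// above, because `b` could continue `$i` or start `$j`.
```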

cc @estebank @alexreg

Labels: A-macros (Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..))