Macro future proofing #550

Merged
merged 13 commits into from
Jan 19, 2015
Minor fixes, adjustments to FOLLOW sets
emberian committed Jan 3, 2015
commit 68ecb347d2c2a736478cb0367f9996a844570f36
32 changes: 15 additions & 17 deletions text/0000-macro-future-proofing.md
@@ -2,8 +2,6 @@
 - RFC PR: (leave this empty)
 - Rust Issue: (leave this empty)
 
-**NOTE**: Draft, not finalized.
-
 # Key Terminology
 
 - `macro`: anything invokable as `foo!(...)` in source code.
@@ -87,9 +85,9 @@ allowed tokens for the given NT's fragment specifier, and is defined below.
 2. For each token `T` in `M`:
 1. If `T` is not an NT, continue.
 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If
-`T'` is `EOF`, replace `T'` with `F`. If `T'` is in the set
-`FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, or `T'` is any identifier,
-continue. Else, reject.
+`T'` is `EOF` or a close delimiter of a token tree, replace `T'` with
+`F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT,
+or `T'` is any identifier, continue. Else, reject.
 3. Else, `T` is a complex NT.
 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on
 the contents with `F` set to `EOF`. If it accepts, continue, else,
@@ -105,21 +103,16 @@ emitted and compilation should not complete.
 The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`,
 `expr`, `ty`, `ident`, `path`, `meta`, and `tt`.
 
-- `FOLLOW(item)` = `{}`
-- `FOLLOW(block)` = `FOLLOW(expr)`
 - `FOLLOW(stmt)` = `FOLLOW(expr)`
 - `FOLLOW(pat)` = `{FatArrow, Comma, Pipe}`
-- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, Lit}` (where
-`Lit` is any literal, string or numeric)
-- `FOLLOW(ty)` = `{Comma, Eq, Gt, Lt, RArrow, FatArrow, OpenBrace, OpenParen,
-CloseBrace, CloseParen}`
+- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, CloseBracket}`
+- `FOLLOW(ty)` = `{Comma, CloseBrace, CloseParen, CloseBracket}`
+- `FOLLOW(block)` = any token
 - `FOLLOW(ident)` = any token
-- `FOLLOW(path)` = any token
-- `FOLLOW(meta)` = any token
 - `FOLLOW(tt)` = any token
-
-**Note**: the `FOLLOW` sets as given are based on every MBE in the Rust
-distribution, but should probably be tuned before the RFC is accepted.
+- `FOLLOW(item)` = up for discussion
+- `FOLLOW(path)` = up for discussion
+- `FOLLOW(meta)` = up for discussion
 
 # Drawbacks
 
@@ -146,4 +139,9 @@ reasonable freedom.
 
 # Unresolved questions
 
-Are the given `FOLLOW` sets adequate?
+1. What should the FOLLOW sets for `item`, `path`, and `meta` be?
+2. Should the `FOLLOW` set for `ty` be extended? In practice, `RArrow`,

Member Author

Forgot one, of course: =

+`Colon`, `as`, and `in` are also used. (See next item)
+2. What, if any, identifiers should be allowed in the FOLLOW sets? The author
+is concerned that allowing arbitrary identifiers would limit the future use
+of "contextual keywords".
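
For readers following the diff, the matcher check can be sketched roughly as below. This is a minimal model and not the compiler's implementation: the `Tok` enum, `in_follow`, and `check_matcher` are hypothetical names, the matcher is flattened (no nested token trees), and the `FOLLOW` sets are the ones on the added side of this commit. Two assumptions are labeled in the comments: `item`, `path`, and `meta` are treated conservatively because the commit leaves them "up for discussion", and a failing repetition body is rejected because the diff context is cut off at that point.

```rust
// Hypothetical, simplified token model for a macro matcher.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Tok {
    Comma,
    FatArrow,
    Pipe,
    Plus,
    Semi,
    CloseBrace,
    CloseParen,
    CloseBracket,
    Ident(&'static str),
    // Simple NT such as `$e:expr`; the payload is the fragment specifier.
    Nt(&'static str),
    // Complex NT `$(...)*` or `$(...)+`, holding the inner matcher.
    Repeat(Vec<Tok>),
    Eof,
}

// FOLLOW sets as listed on the added side of this commit.
fn in_follow(spec: &str, tok: &Tok) -> bool {
    use Tok::*;
    match spec {
        "pat" => matches!(tok, FatArrow | Comma | Pipe),
        "expr" | "stmt" => {
            matches!(tok, Comma | FatArrow | CloseBrace | CloseParen | CloseBracket)
        }
        "ty" => matches!(tok, Comma | CloseBrace | CloseParen | CloseBracket),
        "block" | "ident" | "tt" => true,
        // ASSUMPTION: `item`, `path`, `meta` are "up for discussion" in this
        // commit; this sketch conservatively puts nothing in their sets.
        _ => false,
    }
}

fn is_close_delim(tok: &Tok) -> bool {
    matches!(tok, Tok::CloseBrace | Tok::CloseParen | Tok::CloseBracket)
}

// Returns true if matcher `m` is accepted, where `f` is the token that
// follows the whole matcher.
fn check_matcher(m: &[Tok], f: &Tok) -> bool {
    let eof = Tok::Eof;
    for (i, t) in m.iter().enumerate() {
        match t {
            Tok::Nt(spec) => {
                // Look ahead; running off the end counts as EOF.
                let mut next = m.get(i + 1).unwrap_or(&eof);
                // EOF or a close delimiter is replaced with `F`.
                if matches!(next, Tok::Eof) || is_close_delim(next) {
                    next = f;
                }
                let ok = in_follow(spec, next)
                    || matches!(next, Tok::Eof | Tok::Nt(_) | Tok::Repeat(_) | Tok::Ident(_));
                if !ok {
                    return false;
                }
            }
            Tok::Repeat(inner) => {
                // ASSUMPTION: the diff context ends at "continue, else,";
                // rejecting here is this sketch's guess at the missing text.
                if !check_matcher(inner, &Tok::Eof) {
                    return false;
                }
            }
            _ => {} // Not an NT: continue.
        }
    }
    true
}
```

Under this model, `[Nt("expr"), FatArrow, Nt("expr")]` is accepted and `[Nt("expr"), Plus, Nt("expr")]` is rejected. `[Nt("pat"), Ident("if")]` is also accepted, purely because of the "any identifier" clause, which is the concern raised in unresolved question 2.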
Member

Could you give some intuition of what the effects are of including or excluding tokens in the FOLLOW sets? I.e., does it mean the new token can be used as a delimiter token? What does it mean for future proofing? Does it restrict or extend what we can do in the future? How?

I have a feeling we should think about semi-colons, but I'm not sure how. Should they be in the follow sets for either expr or stmt? Is it true that an item must always end with a } or a ;? If so, does that mean we should take anything for FOLLOW for item? (I feel I only have about a 50% grasp of the concepts here, so forgive my possibly stupid questions)

Member Author

If a token is in the FOLLOW set of a nt, we can never change the language in such a way that parsing that nt would consume that token. This gives us a rigid boundary around which we can change the language and not break macros.

I think the FOLLOW for item could be everything. I think the FOLLOW for meta could be anything, but I'm not sure how that interacts with future plans for letting attributes contain arbitrary token trees. I really don't know about the futures of path/ty. path really ought to be removed, since there isn't really a single "path" that makes sense anymore.
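
As a concrete illustration of that boundary, here is a small sketch. It assumes a current rustc, which is an assumption about today's compiler and not something this draft guarantees; `pair!` and `demo` are hypothetical names. `FatArrow` (`=>`) is in `FOLLOW(expr)` in the draft above, so an `expr` fragment may be followed directly by it, and the language can then never grow an expression form that consumes `=>`.

```rust
// Hypothetical macro for illustration: `=>` is in FOLLOW(expr), so the
// matcher below is legal and `$a` stops parsing right before `=>`.
macro_rules! pair {
    ($a:expr => $b:expr) => {
        ($a, $b)
    };
}

// By contrast, a matcher such as `($a:expr + $b:expr)` is refused when the
// macro is defined, because `+` is not in FOLLOW(expr): expression syntax
// is free to consume a `+`, so it cannot serve as a boundary.

fn demo() -> (i32, i32) {
    // `1 + 1` is parsed as one `expr`; `=>` ends it.
    pair!(1 + 1 => 3)
}
```

Calling `demo()` yields `(2, 3)`: the whole of `1 + 1` binds to `$a` because nothing in expression syntax can swallow the `=>`.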

Copy link

Consider foo @ pat, an example of $_:pat. We might want to extend the pattern syntax so that pattern guards are allowed in patterns: foo @ pat if condition[1]. Now, let's assume this syntax is going to be added in 1.1. The input foo @ pat if something will continue to be accepted by the matcher ( $bar:pat if something ) for some time, but it will be rejected as soon as 1.1 comes. The culprit is the if in ( $bar:pat if something ) and hence if should be excluded from FOLLOW(pat).

Looks like it's implied that each FOLLOW set contains all identifiers:

If (...) T' is any identifier, continue. Else, reject.

Further, alternation (|) in patterns has been proposed in [1]. I think FOLLOW(pat) shouldn't include Pipe either.
Edit: in both cases, the syntax that pat parses could be fixed.

[1] RFC: Extend pattern syntax #99, postponed
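
The matcher from this comment can be written out as a runnable sketch. The names `guard_like!` and `demo_guard` are hypothetical, and the claim that this compiles assumes a current rustc, where `if` is permitted after a `pat` fragment (the standard `matches!` macro relies on this).

```rust
// Hypothetical macro: a `pat` fragment followed by the literal tokens
// `if something`, as in the comment's `( $bar:pat if something )`.
macro_rules! guard_like {
    ($bar:pat if something) => {
        "matched"
    };
}

fn demo_guard() -> &'static str {
    // The pattern parser takes `foo @ 1` and stops before `if`. If pattern
    // syntax ever consumed a trailing `if ...`, `$bar` would swallow those
    // tokens and this invocation would stop matching.
    guard_like!(foo @ 1 if something)
}
```

This is the hazard in miniature: the invocation only keeps working as long as pattern syntax never extends past the `if`.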

Member Author

I'm not sure how we ought to handle identifiers.

Member Author

For `ty`, I have in the implementation `as`, `,`, `->`, `:`, `=`, and `>`. Any token for `meta` and `item`. Still not sure about `path`.