Hygiene opt-out (escaping) for declarative macros 2.0 #2498

Closed
wants to merge 4 commits into from
221 changes: 221 additions & 0 deletions text/0000-macro-hygiene-optout.md
@@ -0,0 +1,221 @@
- Feature Name: macro_hygiene_optout
- Start Date: 2018-07-05
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

This feature introduces the ability to "opt-out" of the usual macro hygiene rules within definitions of [declarative macros][decl-macro], for designated identifiers or occurrences of identifiers. In other words, the feature will enable one to annotate occurrences of identifiers with macro call-site hygiene rather than the default definition-site hygiene.
Contributor

Could be good to mention here that "declarative macros" does not refer to macro_rules! (it is apparent if you click the link, but in the interest of not having to do so...)

Author

Fair point. I originally had this, but somehow removed it.


# Motivation
[motivation]: #motivation

The use of [hygienic macros] in Rust is justified by much prior research and experience, and solves several common issues that programmers would otherwise encounter with macros due to the nature of syntactical substitution. The principal deficit of this approach is that it requires that names/identifiers of any items generated by a macro be *explicitly passed to* the macro as arguments. This forces the logic for name selection to remain entirely external to the macro; even where that is not a problem, passing every identifier to be exported into a macro can quickly become unwieldy for macros that generate many identifiers.
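
To make that cost concrete, here is a minimal sketch of the explicit-argument style. It uses stable `macro_rules!` rather than a macros 2.0 `macro`; item names are not hygienic under `macro_rules!`, so passing them is a convention here rather than a necessity. The names `define_limits`, `limits`, `MIN`, `MAX` and `limits_span` are hypothetical and chosen only for illustration.

```rust
// Every name the macro generates is supplied by the caller as an argument.
macro_rules! define_limits {
    ($module:ident, $min:ident, $max:ident) => {
        pub mod $module {
            pub const $min: u32 = 0;
            pub const $max: u32 = 123;
        }
    };
}

// The call site chooses all three names; the macro does not pick any itself.
define_limits!(limits, MIN, MAX);

/// Reads the generated constants back through the caller-chosen names.
pub fn limits_span() -> u32 {
    limits::MAX - limits::MIN
}
```

With three generated names this is tolerable; a macro that generates dozens of items would need dozens of arguments in this style.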
Contributor

justified by much prior research and experience

A link would be good for curious readers :)

Author

I think the "hygienic macros" links offers good justification, no?

Contributor

Truthfully I expected more papers and citations given "much prior research", but I suppose it's enough :)

Author

Hah, okay, I'll add one or two!


# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Escaping of hygiene for identifiers within macros allows one to define identifiers with syntax contexts (**hygiene**) corresponding to the place the macro is invoked (the **call-site**) rather than the place it is defined (**definition-site**). It also enables one to use/reference existing identifiers from the call-site from within macro definitions, though this is not the true aim of the feature, but rather a side-effect, and will be discussed later.
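
As a point of comparison, the default (definition-site) behaviour can already be observed on stable Rust, because `macro_rules!` is hygienic for local variables. The sketch below is only an illustration of that default, not of the proposed feature; `add_hidden_x` and `hygiene_demo` are hypothetical names.

```rust
// `x` declared inside the macro body carries definition-site hygiene.
macro_rules! add_hidden_x {
    ($e:expr) => {{
        let x = 100; // invisible to `$e`, which was written at the call site
        $e + x
    }};
}

pub fn hygiene_demo() -> i32 {
    let x = 1;
    // `x * 2` refers to the call-site `x` (1); the macro's own `x` (100)
    // is a distinct binding even though the names are spelled the same.
    add_hidden_x!(x * 2)
}
```

Hygiene escaping as proposed here would let a macro author deliberately opt an identifier out of this separation.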
Contributor

Could be more clear: "Place" => "location in the source code"

Author

Fair point.


Note that for the purposes of this RFC, an **identifier** can roughly be considered to be a textual name (e.g. `foo_bar`) of any sort (for a variable, function, trait, etc.) or a lifetime (e.g. `'a`).
Contributor

So what is the relation of this RFC to #2151?

Author

None. I might add a sentence to make that clear.

Member

Currently all lifetime parameters are unhygienic, not sure if we will fix that for macros 2.0 or not.

Author

Yeah. Hopefully we will!

Contributor

Lifetimes are already hygienic in macro macros and with Span::def_site() in proc macros.


To escape an identifier in code, one simply prefixes an identifier with the [sigil] `#`. This changes the syntax context (hygiene) of the identifier from the usual definition-site to the call-site.
Contributor

The quote! macro uses #... Have you considered conflicts if and when quote is redefined as a 2.0 macro?

Member

🚲 I wonder if backslash ("escaping") can be valid

pub mod \foo {
    const \BAR: u32 = 123;
}

Author

@Centril No, I'm not sure. I wonder why it doesn't use $? Grr. Maybe someone can clarify for me whether it would conflict.

Author

Added an "unresolved question" about this, incidentally.


## Guide: Example A
[guide-example-a]: #guide-example-a

```rust
#![feature(decl_macro)]
#![feature(macro_hygiene_optout)]

macro m() {
    pub mod #foo {
        pub const #BAR: u32 = 123;
    }
}

fn main() {
    m!(); // `foo` and `foo::BAR` both behave as if they were defined directly here.
    assert_eq!(123, foo::BAR);
}
```

## Guide: Example B
[guide-example-b]: #guide-example-b

```rust
#![feature(decl_macro)]
#![feature(macro_hygiene_optout)]

macro m($mod_name:ident) {
    pub mod $mod_name {
        pub const #BAR: u32 = 123;
    }
}

fn main() {
    m!(foo); // `foo` and `foo::BAR` both behave as if they were defined directly here.
    assert_eq!(123, foo::BAR);
}
```

## Guide: Example C
[guide-example-c]: #guide-example-c

```rust
#![feature(decl_macro)]
#![feature(macro_hygiene_optout)]

macro m($mod_name:ident) {
    pub mod $mod_name {
        pub const BAR: u32 = 123;
    }
}

fn main() {
    m!(foo);
    let _ = foo::BAR;
    //~^ ERROR cannot find value `BAR` in module `foo`
}
```

## Guide: Example D
[guide-example-d]: #guide-example-d

```rust
#![feature(decl_macro)]
#![feature(macro_hygiene_optout)]

macro m() {
    pub mod #foo {
        pub const BAR: u32 = 123;
    }
}

fn main() {
    m!();
    let _ = foo::BAR;
    //~^ ERROR cannot find value `BAR` in module `foo`
}
```

## Meta-variables
[meta-variables]: #meta-variables

Hygiene escaping of meta-variables (i.e. `#$foo` and `$#foo`) does not have immediately obvious semantics or usefulness, so is explicitly disallowed for the present, and yields error messages.
Member

The obvious semantics to me is that the resulting identifier takes the name from the metavariable and the hygiene context from the call site.

Author

Yes, I really meant the former isn't obviously useful, while the latter isn't obviously useful either, nor does it have obvious semantics.


## Usage Notes
[usage-notes]: #usage-notes

While the motivation of this feature stems from defining or "exporting" new identifiers from macros to their call-site, where it is appropriate for the macro itself to choose/compute the name, it is clear from the above semantics that this feature allows for other potential use cases. Most notably, a macro can use or "import" an identifier from its call-site. This, however, is *not* recommended, since this purpose is already fulfilled well by macro parameters. On the other hand, it is not explicitly disallowed, for two reasons:

- Defining an identifier with call-site hygiene within that macro and then using it is a perfectly reasonable scenario.
- Macro expansion is performed at the syntactical (token stream) level, before parsing, so definitions and uses cannot be easily distinguished.
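
For reference, the parameter-based way of "importing" a call-site identifier looks roughly like the following on stable `macro_rules!`. This is a sketch of the recommended style only; `bump`, `hits` and `count_twice` are hypothetical names.

```rust
// The call-site variable is named explicitly in the invocation, so a reader
// of the call can see which binding the macro touches.
macro_rules! bump {
    ($counter:ident) => {
        $counter += 1;
    };
}

pub fn count_twice() -> u32 {
    let mut hits = 0;
    bump!(hits);
    bump!(hits);
    hits
}
```

An escaped `#hits` inside the macro body would reach the same variable without it appearing in the invocation, which is why the parameter form is considered self-documenting.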

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The macro parser routine first parses the macro definition into a token stream (as before), but now also tags tokens and meta-variables with an enum value representing the kind of hygiene (definition-site or call-site). This is only enabled for new-style `macro` macros (i.e. *decl_macro* or macros 2.0); for `macro_rules!` macros, the call-site sigil `#` is not handled specially, and gives rise to an error. The sigil is always treated as a separate token outside of macros, on the LHS of macro rules, and when not followed by an identifier on the RHS.
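
The tagging step can be sketched as a toy function over string tokens. This is not the compiler's implementation: the `Hygiene` enum, `is_ident` and `tag_tokens` are hypothetical names, tokens are plain `&str` values, and only the right-hand side of a macro body is modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hygiene {
    DefSite,
    CallSite,
}

// Simplified identifier check: a letter or `_`, then letters, digits or `_`.
fn is_ident(tok: &str) -> bool {
    let mut chars = tok.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {
            chars.all(|c| c.is_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Tags each token of a macro body. `#` directly followed by an identifier
/// is folded into that identifier, which is tagged `CallSite`; a `#` in any
/// other position stays a separate, ordinary token.
pub fn tag_tokens(tokens: &[&str]) -> Vec<(String, Hygiene)> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        if tokens[i] == "#" && i + 1 < tokens.len() && is_ident(tokens[i + 1]) {
            out.push((tokens[i + 1].to_string(), Hygiene::CallSite));
            i += 2;
        } else {
            out.push((tokens[i].to_string(), Hygiene::DefSite));
            i += 1;
        }
    }
    out
}
```

Feeding it the tokens of `pub mod #foo { const BAR }` tags `foo` as call-site and leaves `BAR` at the default.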

When the macro is invoked (expanded), each token tree is transcribed according to the following rules, depending on its hygiene tag.

- *definition-site*: a normal mark is applied for the current expansion
- *call-site*: a transparent mark is applied for the current expansion and the syntax context for every identifier in the token tree is changed to the syntax context of the call site.
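
One way to picture the two rules is a deliberately simplified model in which an identifier is a name plus a list of expansion marks, and name resolution ignores transparent marks. This is an assumption made for illustration, not a description of rustc's data structures; `Mark`, `Ident`, `transcribe` and `same_binding` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hygiene {
    DefSite,
    CallSite,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Mark {
    pub expansion: u32,
    pub transparent: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ident {
    pub name: String,
    pub marks: Vec<Mark>,
}

impl Ident {
    /// An identifier written directly by the user, outside any expansion.
    pub fn at_call_site(name: &str) -> Self {
        Ident { name: name.to_string(), marks: Vec::new() }
    }

    // Only opaque (normal) marks take part in resolution in this model.
    fn opaque_marks(&self) -> Vec<u32> {
        self.marks
            .iter()
            .filter(|m| !m.transparent)
            .map(|m| m.expansion)
            .collect()
    }

    /// Two identifiers can name the same binding if their names match and
    /// they agree on all opaque marks.
    pub fn same_binding(&self, other: &Ident) -> bool {
        self.name == other.name && self.opaque_marks() == other.opaque_marks()
    }
}

/// Transcribes one identifier for expansion number `expansion`:
/// a normal mark for definition-site, a transparent mark for call-site.
pub fn transcribe(name: &str, tag: Hygiene, expansion: u32) -> Ident {
    Ident {
        name: name.to_string(),
        marks: vec![Mark { expansion, transparent: tag == Hygiene::CallSite }],
    }
}
```

In this model an escaped `BAR` matches a `BAR` written at the call site, while an unescaped one does not.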
Contributor

and the syntax context for every identifier in the token tree is changed to the syntax context of the call site

What is this part about?
When a macro is expanded, an identifier gets an opaque mark added by default (Span::def_site() in proc macro API) or a transparent mark if opt-out is in place (Span::call_site() in proc macro API); that is all that happens.

Author

Ah, I was slightly confused about how your transparent mark worked. I'll clarify that.

Author

Let me know if it's better now.


## Reference: Example A
[reference-example-a]: #reference-example-a

In [example A][guide-example-a], the identifiers `foo` (the name of the module) and `BAR` (the name of the constant within the module) are hygiene-escaped, giving them the syntax context of the call site. Thus, `foo::BAR` resolves fine, since `foo` has the same syntax context as the body of the `main` function.

## Reference: Example B
[reference-example-b]: #reference-example-b

In [example B][guide-example-b], the module is named using the identifier passed into the macro, which as a macro argument has the syntax context of the call site. Furthermore, the constant `BAR` within the module is hygiene-escaped, so likewise has the syntax context of the call site. Thus, `foo::BAR` resolves fine, since `foo` has the same syntax context as the body of the `main` function.

## Reference: Example C
[reference-example-c]: #reference-example-c

In [example C][guide-example-c], the situation is similar to [example B][reference-example-b], except that the constant `BAR` is not hygiene-escaped, and thus retains the default definition-site syntax context. Thus, when one tries to access `foo::BAR` within the `main` function, `foo` resolves fine, but the constant `BAR` within it is not visible due to hygiene rules, since it does not have the syntax context of the `main` function (or any parent context).

## Reference: Example D
[reference-example-d]: #reference-example-d

In [example D][guide-example-d], the situation is almost identical to [example C][reference-example-c], except that the name of the module is defined within the macro as `foo`, and hygiene-escaped, so that it has the call-site syntax context.
Contributor

Typo here? Should say "In [example D][guide-example-d]"?

Author

yep!


```rust
#![feature(decl_macro)]
#![feature(macro_hygiene_optout)]

macro m() {
    pub mod #foo {
        pub const BAR: u32 = 123;
    }
}

fn main() {
    m!();
    let _ = foo::BAR;
    //~^ ERROR cannot find value `BAR` in module `foo`
}
```

# Drawbacks
[drawbacks]: #drawbacks

- Introducing a new sigil such as `#` can be seen as increasing the syntactical complexity of the language, and potentially obfuscating code slightly.
- The ability to mark some occurrences of an identifier with call-site hygiene and leave others with default definition-site hygiene is perhaps more fine-grained than necessary.
- It is not immediately obvious from a macro definition which (occurrences of) identifiers take their syntax context from the call site. One has to read through the whole definition to figure it out.
- The syntax permits marking identifiers with call-site hygiene purely for "use" or "import" scenarios (as opposed to "defining" or "exporting" scenarios). Parameters are intended for this purpose, and accomplish the task much better, since they self-document uses of identifiers. However, this ability may actually be desirable more than problematic, as mentioned in the [usage notes][usage-notes].

# Rationale and alternatives
[alternatives]: #alternatives

The design in this RFC was chosen because of its simple syntax and semantics, and the fact it offers a good way to get experience with hygiene opt-out in general, due to its fine-grainedness.

The main alternative considered was having an `escapes` attribute for macros and not using a sigil.

```rust
#[escapes(S, T)]
macro m() {
    struct S; // Defines `S` at the call-site.
    T // Resolves at the call-site.
}
```

The above would then be equivalent to the following, using the sigil syntax.

```rust
macro m() {
    struct #S; // Defines `S` at the call-site.
    #T // Resolves at the call-site.
}
```

The obvious benefit of this is that it is manifest which identifiers (`S` and `T` in the above example) are hygiene-escaped. A downside, which may or may not be significant, is that these identifiers are then *always* escaped within the macro definition, and thus can never be used with definition-site hygiene.
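
The difference in granularity can be shown with a toy model in which each occurrence of an identifier receives a hygiene tag. The functions `tag_by_attribute` and `tag_by_sigil` are hypothetical and only model the two designs; neither corresponds to an existing compiler routine.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hygiene {
    DefSite,
    CallSite,
}

/// Attribute style: every occurrence of a name listed in `escapes`
/// is escaped, with no per-occurrence choice.
pub fn tag_by_attribute(idents: &[&str], escapes: &[&str]) -> Vec<Hygiene> {
    idents
        .iter()
        .map(|name| {
            if escapes.contains(name) {
                Hygiene::CallSite
            } else {
                Hygiene::DefSite
            }
        })
        .collect()
}

/// Sigil style: each occurrence carries its own flag, so the same name
/// can be escaped in one place and hygienic in another.
pub fn tag_by_sigil(idents: &[(&str, bool)]) -> Vec<Hygiene> {
    idents
        .iter()
        .map(|&(_, escaped)| {
            if escaped {
                Hygiene::CallSite
            } else {
                Hygiene::DefSite
            }
        })
        .collect()
}
```

For the occurrences `S`, `x`, `S`, the attribute model with `escapes(S, T)` necessarily escapes both uses of `S`, whereas the sigil model can escape only the first.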

Going beyond a single `escapes` attribute, one can also imagine having two separate attributes: `defines`, for defining (exporting) identifiers, and `uses`, for using (importing) identifiers. The main issue here is the complexity of the semantics and implementation; indeed, it is not even clear whether one could clearly demarcate cases of definition and use at the syntactical level. As implied by the [usage notes][usage-notes], however, the `uses` attribute would largely overlap with the purpose of macro parameters.

In the end, the approach taken by this RFC was chosen due to the fact it has the most prior art, including an [existing working implementation][pr-47992]. It is also the most flexible in that it allows different hygiene to be applied to different *occurrences* of the same identifier. This will allow us to learn more about the use of hygiene opt-out in practice, while the feature is unstable.

# Prior art
[prior-art]: #prior-art

Extended discussion on this subject was carried out in a [pull request][pr-47992] for this feature, which was closed due to the decision that an RFC such as this one be accepted first. [Alternatives][pr-47992-alternatives] were originally evaluated there, with discussion initiated by @jseyfried, and [continued][pr-47992-alternatives-eval] by @petrochenkov.
Member

I'd expect some discussion of how this works in other languages here. In particular, Scheme has a rich system for doing this sort of thing.

Author

Hmm. I'd like to avoid learning Scheme properly for this... maybe I can dig up a decent explanation somewhere?


Further back, the initial sigil syntax was mentioned in [this comment][pr-40848-comment], and some discussion occurred in the [declarative macros 2.0 tracking issue][decl-macro].

# Unresolved questions
[unresolved]: #unresolved-questions

- Do we want to somehow disallow pure importing of identifiers within macros aside from via parameters, as mentioned in the [drawbacks] section?
- Do we also want to implement the attribute-based approach as an alternative or in addition to the sigil-based approach?

[sigil]: https://en.wikipedia.org/wiki/Sigil_(computer_programming)
[hygienic macros]: https://doc.rust-lang.org/1.7.0/book/macros.html#hygiene

[decl-macro]: https://github.com/rust-lang/rust/issues/39412
[pr-40848-comment]: https://github.com/rust-lang/rust/pull/40847#issuecomment-291186518
[pr-47992]: https://github.com/rust-lang/rust/pull/47992
[pr-47992-alternatives]: https://github.com/rust-lang/rust/pull/47992#issuecomment-364729651
[pr-47992-alternatives-eval]: https://github.com/rust-lang/rust/pull/47992#issuecomment-370268136