Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weโ€™ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Accept semicolons as item-like #2479

Closed
wants to merge 12 commits into from

Conversation

Centril
Copy link
Contributor

@Centril Centril commented Jun 16, 2018

๐Ÿ–ผ๏ธ Rendered

๐Ÿ“ Summary

  • The semicolon (;) is now legal, but not recommended, in areas where items are allowed, permitting a user to write struct Foo {};, among other things.
  • To retain a recommended and uniform style, the tool rustfmt will remove any extraneous ;.
  • Furthermore, the compiler will fire a warn-by-default lint when extraneous ; are encountered, whether they be inside or outside an fn body.

๐Ÿ’– Thanks

To @ExpHP who co-authored this RFC with me.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jun 16, 2018
@clarfonthey
Copy link
Contributor

Another alternative is to allow the parser to accept ; as an item, while still making having one be a hard error. This happens with a number of different syntax errors so that the compiler can report all errors at once, rather than stopping immediately.

@Centril
Copy link
Contributor Author

Centril commented Jun 17, 2018

@clarcharr I think that the proposal in the RFC is better, but I've made a note of that alternative in the RFC. Thank you for mentioning it.

@petrochenkov
Copy link
Contributor

petrochenkov commented Jun 17, 2018

IIRC, the situation with ; in the parser was kinda mess, especially when macros are involved, we still have some compatibility warnings in this area.

There's one thing that's pretty common (for C reasons) and must work:

if condition {
    ; // Do nothing
}

, and it already works, but I'm not sure how exactly it's treated by parser and especially macros, because it's not treated as a statement.
Perhaps we need to treat it as a statement consistently (e.g. accept ; as a stmt in macros), but I'm not sure right now what complications may arise with macros.

Regarding semicolon freely floating among items in modules, I think usual error recovery (report an error, skip the semicolon, continue parsing) would be enough.
The motivation about leftover semicolons looks especially wrong to me, I want such mistakes to be caught immediately by the compiler and not left in the code.

@Centril
Copy link
Contributor Author

Centril commented Jun 17, 2018

@petrochenkov

The motivation about leftover semicolons looks especially wrong to me, I want such mistakes to be caught immediately by the compiler and not left in the code.

Why? To me, a left-over semicolon does not usually signify a logic bug that needs to be caught immediately, but rather a trivial style mistake that can easily be stripped with rustfmt.

A third alternative is to allow it in the language but fire a warn-by-default lint; that satisfies your "needs to be caught immediately" condition.

Also, I think this RFC is in line with the general policy on language grammar as outlined by @withoutboats.

@mark-i-m
Copy link
Member

I think the RFC is well written, but I tend to favor "Make struct F {}; a non-fatal hard error". I simply don't come across this problem that often, and I feel like it will lead to widespread bad style, despite the proposed rustfmt change.

My feelings are not very strong though...

@Centril
Copy link
Contributor Author

Centril commented Jun 17, 2018

@mark-i-m

I think the RFC is well written

Thanks :)

I feel like it will lead to widespread bad style, despite the proposed rustfmt change.

Would a warn-by-default lint, as a more aggressive mechanism, satisfy your concern?

EDIT: Since #1925 was merged, I haven't seen any widespread use of that syntax, widely considered bad style, and rustfmt is/will also format(ting) that away. rustfmt is atm proposed as a mechanism because bad style is usually not linted, just formatted away.

@mark-i-m
Copy link
Member

Would a warn-by-default lint, as a more aggressive mechanism, satisfy your concern?

I think so.

EDIT: Since #1925 was merged, I haven't seen any widespread use of that syntax, widely considered bad style, and rustfmt is/will also format(ting) that away. rustfmt is atm proposed as a mechanism because bad style is usually not linted, just formatted away.

In this case, I think the risk is much higher because the semicolon in class/struct/enum Foo {...}; is required c/c++ syntax.

@Ixrec
Copy link
Contributor

Ixrec commented Jun 17, 2018

@mark-i-m I thought "a non-fatal hard error" was a contradiction; what did you mean by that? A deny-by-default lint? A hard error that the compiler "understands" well enough that it can still produce warnings/errors/etc for all the code after the superfluous semicolon?


Personally, the RFC as-written is almost what I'd want, but I'd also add a rustc warn-by-default lint against superfluous semicolons. Today, Rust technically has pretty consistent semicolon rules, which only appear inconsistent in practice because there's nothing loudly discouraging superfluous semicolons. I can see the argument that rustfmt is sufficient, but to me this feels very similar to the existing rustc lints against unusued variables, unused mut, dead code, etc.

@Centril
Copy link
Contributor Author

Centril commented Jun 17, 2018

@mark-i-m

I think so.

Great; I'll add it to the list of alternatives for now in a bit. I wouldn't mind changing to it as the main proposal. Let's see what @ExpHP says.

In this case, I think the risk is much higher because the semicolon in class/struct/enum Foo {...}; is required c/c++ syntax.

Fair enough, it does seem like a reasonable inference. I have two points regarding this:

  1. Would it be so bad? I mean, struct Foo { .. }; is a bit ugly in my subjective taste, but I can't say it is less legible than struct Foo { .. } without the semicolon at the end. I certainly wouldn't lose any sleep over such syntax being committed into VCS or be irked at all, even tho I wouldn't write like that personally.
  2. Being required C/C++ syntax almost sounds like an argument for just dealing with this with rustfmt, thereby reducing speedbumps for C/C++ programmers (a group of people we are inclined to bribe with nice things?).

@Ixrec

but to me this feels very similar to the existing rustc lints against unusued variables, unused mut, dead code, etc.

That's a nice rationale :)

@Ixrec
Copy link
Contributor

Ixrec commented Jun 17, 2018

Being required C/C++ syntax almost sounds like an argument for just dealing with this with rustfmt, thereby reducing speedbumps for C/C++ programmers (a group of people we are inclined to bribe with nice things?).

Personally, I constantly misremember the semicolon rules in both C++ and Javascript, despite working on one or the other every single day. So to me, "other languages are different" is a good argument for making it not a hard error (because it does reduce speedbumps), but not an argument for making it rustfmt-only (because I make these mistakes out of ignorance or indifference; not because I have an aesthetic preference for any one language's semicolon rules). Though I have no idea how true that is of everyone else.

@ExpHP
Copy link

ExpHP commented Jun 17, 2018

Right now I'm bothered by what @petrochenkov said: (emphasis mine)

There's one thing that's pretty common (for C reasons) and must work:

if condition {
    ; // Do nothing
}

, and it already works, but I'm not sure how exactly it's treated by parser and especially macros, because it's not treated as a statement.

Yikes! And indeed, this is visibly reflected in the macro_rules matchers. I assumed that stmt matches ;, but it does not:

   Compiling playground v0.0.1 (file:///playground)
error: expected a statement
  --> src/main.rs:14:7
   |
14 | i3! { ; }
   |       ^

Perhaps we should not formulate this as "making ; an item"...?

@ExpHP
Copy link

ExpHP commented Jun 17, 2018

@Centril @mark-i-m I feel that a warn-by-default lint does adequately address the motivations, and would not mind making it the main proposal.

@mark-i-m
Copy link
Member

I thought "a non-fatal hard error" was a contradiction; what did you mean by that? A deny-by-default lint?

As I understand it, there are two ways that part of the compiler can report an error. A fatal error immediately ends compilation, whereas a non-fatal error allows the compilation to proceed a bit further so we can gather more errors.

I think the the warn by default lint is a better solution, though. It seems more in line with things like warning for superfluous parens on if conditions.

Would it be so bad?

It is a personal taste thing for me too. I think the warn by default lint removes the papercuts for c/c++ programmers while encouraging good style.

@withoutboats
Copy link
Contributor

withoutboats commented Jun 17, 2018

Does this make this a valid Rust program?

;;;;;;;;;;;;;
fn main() { }

Thats a bit more extreme that our grammar flexibility usually goes.

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

There's precedent for allowing optional ; after items both in how we allow ; optionally after control flow blocks & in how (I believe) we decided to allow , optionally after block match arms.

@scottmcm
Copy link
Member

I'm very good at accidentally putting extra semicolons after braced structs, so I'm sympathetic here, but it's hard for me to judge what, if anything, should be done. I like the discussion so far.

Personally, I've been pretty happy since the error changed from "unexpected token" to

error: expected item, found `;`
 --> src/lib.rs:3:22
  |
3 | struct Foo { x: u32 };
  |                      ^ help: consider removing this semicolon

@mark-i-m
Copy link
Member

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

I was actually thinking about this too. I think @withoutboats has a good point. In addition too their point, though, I think making ; a whole item would make life harder for macro authors who would now need to check for and handle another type of item.

By making ; an optional suffix onto any of the existing item types, I think we can avoid both problems.

@mark-i-m
Copy link
Member

;;;;;;;;;;;;;
fn main() { }

Oh man! This foils my plans to propose ;;;;;;;;;;;;; as shorthand for fn main ๐Ÿ˜ˆ

@mark-i-m
Copy link
Member

One other thing: This seems like a less likely error, but I think the following mistake becomes possible all of a sudden:

#[my_attr]; // oops
fn foo() { ... }

@est31
Copy link
Member

est31 commented Jun 18, 2018

In if clauses, we do lint on extraneous () in the condition:

if (true) { //~ WARNING unnecessary parentheses around `if` condition
}

The situation between the two is very similar. Cpp requires ; after struct declaration as well as () around the if condition.

In fact I as a programmer coming from Cpp got so annoyed by that warning that I allowed it. So even though I'm mildly against this RFC now, I guess I'd have liked back when I was fresh from Cpp.

Given the precedent, I think that a warn by default lint as @Centril suggested would be a good idea.
As for superfluous commata, I think you should still be allowed to type them, for consistency. E.g. here they seem useful:

match v {
    Enum::Foo => panic!(),
    Enum::Baz => {
        let k = boo();
        k.foo();
    }, // ~ USEFUL this comma is good ๐Ÿ˜Ž
    Enum::Bar(b) => bar(b),
}

@comex
Copy link

comex commented Jun 18, 2018

;;;;;;;;;;;;;
fn main() { }

Thats a bit more extreme that our grammar flexibility usually goes.

Not that extreme, comparatively speaking. This is perfectly valid in C and C++:

;;;;;;;;;;;;;
int main() { }

(Since I was curious: Extraneous semicolons are also valid in Java, JS (of course), Haskell, and Ruby, but not in C#, Swift, or Python. In Go they are allowed only within functions.)

@Centril
Copy link
Contributor Author

Centril commented Jun 18, 2018

@withoutboats

Does this make this a valid Rust program?

;;;;;;;;;;;;;
fn main() { }

Thats a bit more extreme that our grammar flexibility usually goes.

Yep, it does. But it also seems extremely ๐Ÿ˜† unlikely for someone to actually write that ;)
Let's not forget that something like the snippet above is permitted if placed inside a function (because ; is interpreted as () then..)

fn main() {
    ;;;;;;;;;;;;; // <-- interestingly, there is no lint here... but rustfmt will strip it
    fn foo() { }
}

So... still that extreme?

Why not allow an optional ; after all items (except those that require ;)? Why go all the way to making ; its own item?

It seems like the simpler alternative to make ; its own item. But if we at the conclusion of this RFC decide to allow ; after all items (except ...) I will be equally happy.

@mark-i-m

One other thing: This seems like a less likely error, but I think the following mistake becomes possible all of a sudden:

#[my_attr]; // oops
fn foo() { ... }

Yes good point; I think it would be caught by a warn-by-default lint well tho :)

@est31

Given the precedent, I think that a warn by default lint as @Centril suggested would be a good idea.

Yes, I think so. Let's switch over to a warn by default lint as the main proposal (in addition to rustfmt formatting which solves the lint for the user instantly).

So... what should the name of the lint be? I see two categories of options here:

  1. A new lint name, like: redundant_semicolon -- the main benefit here is that you can allow or deny it separately without allowing something else.
  2. Add it to some existing lint -- not sure which one... ideas?

@petrochenkov
Copy link
Contributor

I still don't see "positive"/"constructive" motivation for this change, as opposed to "let's accept some more syntax because we can".
Trailing separators are "positively" useful - they support uniform code generation so the last element is not treated specially, optional semicolons only make life harder for macros as cases like rust-lang/rust#34660 show.
Trailing separators minimize diffs when new elements are added, these semicolons have no such effect.

In addition to these semicolons we have a ton of cases in the parser when we can recover from a missing or an extraneous token and continue compilation. Should we turn them into warnings as well?
Maybe instead of doing recovery in the compiler we should adopt the bitwise negation operator ~ into the proper language as a synonym for ! because someone could copypaste code from C?
I don't think so, recovery is enough. And we shouldn't move these semicolons from recovery into proper language either, they are not special.

@Ixrec
Copy link
Contributor

Ixrec commented Jun 18, 2018

I am primarily interested in getting a warn-by-default lint on the redundant semicolons that we already allow today. Allowing them in more places seemed like a minor win for consistency and arguably removing a speedbump when refactoring, but I'm fine if macro compatibility concerns override that.

@Centril
Copy link
Contributor Author

Centril commented Jun 18, 2018

I still don't see "positive"/"constructive" motivation for this change, as opposed to "let's accept some more syntax because we can".

The motivation is clearly not to accept more syntax just because we can. The positive motivation is to avoid disturbing the writing flow of some users due to frequently made trivial mistakes. You might not agree with it it, but it is a positive motivation.

optional semicolons only make life harder for macros as cases like rust-lang/rust#34660 show.

What specific cases are you worried about wrt. making ; an item?

In addition to these semicolons we have a ton of cases in the parser when we can recover from a missing or an extraneous token and continue compilation. Should we turn them into warnings as well?

Are they frequently made trivial mistakes with precedent from other places?

Maybe instead of doing recovery in the compiler we should adopt the bitwise negation operator ~ into the proper language as a synonym for ! because someone could copypaste code from C?

I'll let whoever wants to do that make the case for it, but I won't. Redundant ; are already allowed inside fn bodies and tuple / unit structs already allow them. The precedent from C/C++ wrt. structs is also much more common than ~. I think there's much more added precedent in this case.

I don't think so, recovery is enough.

Not to me. By emitting a warning instead of a hard error, the code can be compiled and tests can be run, so it helps retain flow a lot more in my opinion.

@petrochenkov
Copy link
Contributor

petrochenkov commented Jun 18, 2018

@Centril

The positive motivation is to avoid disturbing the writing flow of some users due to frequently made trivial mistakes.

If this is a problem that needs solution, then we can have a compiler mode similar to gcc's -fpermissive that treats *all* "sufficiently recoverable" errors as warnings and continues compilation until final lib/exe is produced, rather than target a single special case of mistakes.

What specific cases are you worried about wrt. making ; an item?

I don't remember right away, need to investigate a bit.
The primary source of complications is that both macro invocation (m!(...);) and macro output may have semicolons, but ; is not technically a part of macro invocation syntax, and it doesn't belong to the macro output either, so some rules need to be established with regards to what inner or outer semicolons are attached to what and when they are ignored or not ignored.

@est31
Copy link
Member

est31 commented Jun 18, 2018

Regarding the redundant semicolons lint, I think we should avoid linting in this case:

fn foo() -> Foo {
    return Foo;
}

The ; can technically be removed, but many (most?) people still write it.

@Centril
Copy link
Contributor Author

Centril commented Jan 12, 2019

@rfcbot cancel
@rfcbot postpone

Let's postpone this for now, try better parser recovery, and see how things turn out.

@rfcbot
Copy link
Collaborator

rfcbot commented Jan 12, 2019

@Centril proposal cancelled.

@rfcbot
Copy link
Collaborator

rfcbot commented Jan 12, 2019

Team member @Centril has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-postpone This RFC is in PFCP or FCP with a disposition to postpone it. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Jan 12, 2019
@graydon
Copy link

graydon commented Jan 12, 2019

Oppose. The rationale is weak and the alternative of "do nothing" is sufficient. Also every change to the semicolon rules has always had unanticipated effects. They are fragile and a bad idea to perturb.

@Manishearth
Copy link
Member

Manishearth commented Jan 12, 2019

Fwiw from the POV of parser recovery I've always kinda wanted an interactive rustc mode where it makes assumptions about your code and continues to compile in the background, letting you validate the assumptions in a prompt (which also basically applies the suggestion to your code, rustfix-style)

Rust is rife with cases where it refuses to compile something where there's one very likely option (that it tells you about!), I think most of my time writing rust is spent dealing with these after I write a large block of code. Item semicolons and turbofish are cases that newcomers tend to hit, but I still hit cases like import errors and block semicolons.

It would be nice if rustc could go "hey, did you mean to do X here?", and silently continue with compilation in the background, assuming X. If you accept the prompt, it edits your code. If you reject it, it errors out. This could apply for obvious parse bugs, but it could also apply for missing imports and a subset of type errors.

@Centril
Copy link
Contributor Author

Centril commented Jan 12, 2019

@Manishearth That is pretty interesting, perhaps it could be coupled with a good REPL (which feels needed for other reasons...)?

@alercah
Copy link
Contributor

alercah commented Jan 12, 2019

I really love this idea, as someone who uses primarily vim right now for Rust so I don't get good realtime feedback IDE-style.

@clarfonthey
Copy link
Contributor

Sounds to me like running rustfix more interactively, and combining suggestions that aren't necessarily right into a sort of rustguess.

Centril added a commit to Centril/rust that referenced this pull request Jan 14, 2019
Centril added a commit to Centril/rust that referenced this pull request Jan 14, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Feb 5, 2019

๐Ÿ”” This is now entering its final comment period, as per the review above. ๐Ÿ””

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Feb 5, 2019
@ErichDonGubler
Copy link

Has anybody verified the question the first question still listed to to-resolve?

Does this create parsing ambiguities anywhere?

I understand if we're not sure at this point and that we need might want to do something like a crater run to supplement theory with empirical data, just curious if anybody knew the answer yet.

@Centril
Copy link
Contributor Author

Centril commented Feb 13, 2019

@ErichDonGubler I haven't looked it in a while, but it's unlikely that it would create any ambiguities.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Feb 15, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Feb 15, 2019

The final comment period, with a disposition to postpone, as per the review above, is now complete.

By the power vested in me by Rust, I hereby postpone this RFC.

@rfcbot rfcbot added postponed RFCs that have been postponed and may be revisited at a later time. and removed disposition-postpone This RFC is in PFCP or FCP with a disposition to postpone it. labels Feb 15, 2019
@rfcbot rfcbot closed this Feb 15, 2019
@petrochenkov
Copy link
Contributor

Something like this may become necessary if we want to complete the transition of macro expansion to the token stream model.

Consider these two macros.

macro item() { struct S {} }
macro expr() { 0 }

Tokens produced by these macros are struct S {} and 0 respectively.

In the tokens stream model macro expansion replaces macro invocations with tokens produced by them.
But what is a macro invocation exactly?
Does it include the possible semicolon after foo!(...)?

// Version 1: Include semicolon.

item!();

fn f() { expr!(); }

=> expansion =>

// OK
struct S {}

// Not OK, breaking change.
// A statement turned into an expression resulting in type error.
// Not fixable.
fn f() { 0 }
// Version 2: Do not include semicolon.

item!();

fn f() { expr!(); }

=> expansion =>

// Not OK, breaking change.
// An extra semicolon at module level.
// Fixable by something like this RFC.
struct S {};

// OK
fn f() { 0; }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-syntax Syntax related proposals & ideas finished-final-comment-period The final comment period is finished for this RFC. postponed RFCs that have been postponed and may be revisited at a later time. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.