RFC: label-break-value #2046

Merged
merged 7 commits into from
Feb 27, 2018
Changes from 4 commits
119 changes: 119 additions & 0 deletions text/0000-label-break-value.md
@@ -0,0 +1,119 @@
- Feature Name: label_break_value
- Start Date: 2017-06-26
- RFC PR: [#2046](https://github.com/rust-lang/rfcs/pull/2046)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Allow a `break` of labelled blocks with no loop, which can carry a value.

# Motivation
[motivation]: #motivation

In its simplest form, this allows you to terminate a block early, the same way that `return` allows you to terminate a function early.

```
'block: {
    do_thing();
    if condition_not_met() {
        break 'block;
    }
    do_next_thing();
    if condition_not_met() {
        break 'block;
    }
    do_last_thing();
}
```
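As a minimal runnable sketch of the early-termination pattern above: the function `run_steps` and its two boolean parameters are hypothetical stand-ins for `do_thing()` and `condition_not_met()`, and the step names are recorded in a log so the control flow is observable. It assumes a toolchain where labelled block breaks are available (stable since Rust 1.65).

```rust
// Hypothetical helper, not part of the RFC: runs up to three steps and
// records which ones executed, so the effect of `break 'block` is visible.
fn run_steps(first_ok: bool, second_ok: bool) -> Vec<&'static str> {
    let mut log = Vec::new();
    'block: {
        log.push("thing");
        if !first_ok {
            // Leaves the labelled block; the remaining steps are skipped.
            break 'block;
        }
        log.push("next_thing");
        if !second_ok {
            break 'block;
        }
        log.push("last_thing");
    }
    // Execution resumes here after any `break 'block`.
    log.push("after_block");
    log
}
```

Unlike `return`, the `break` only leaves the block, so code following the block in the same function still runs.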
In the same manner as `return` and the labelled loop breaks in [RFC 1624](https://github.com/rust-lang/rfcs/blob/master/text/1624-loop-break-value.md), this `break` can carry a value:
```
let result = 'block: {
    if foo() { break 'block 1; }
    if bar() { break 'block 2; }
    3
};
```
RFC 1624 opted not to allow values to be returned from `for` or `while` loops, since no good option could be found for the syntax, and it was hard to do it in a natural way. This proposal gives us a natural way to handle such loops with no changes to their syntax:
```
let result = 'block: {
    for &v in container.iter() {
        if v > 0 { break 'block v; }
    }
    0
};
```
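The same search can be wrapped in a function to make it self-contained. This is only a sketch: the name `first_positive_or_zero` and the `&[i32]` parameter are assumptions chosen for illustration, not something the RFC prescribes.

```rust
// Hypothetical wrapper around the example above: returns the first
// positive element, or 0 when the loop finishes without finding one.
fn first_positive_or_zero(container: &[i32]) -> i32 {
    'block: {
        for &v in container.iter() {
            if v > 0 {
                // Exits both the `for` loop and the labelled block,
                // making `v` the value of the block expression.
                break 'block v;
            }
        }
        // Fallback value when no element matched.
        0
    }
}
```

The `for` loop itself still evaluates to `()`; the value is carried by the enclosing labelled block.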
This extension handles searches that span more than a single loop in the same way:
```
let result = 'block: {
    for &v in first_container.iter() {
        if v > 0 { break 'block v; }
    }
    for &v in second_container.iter() {
        if v < 0 { break 'block v; }
    }
    0
};
```
Implementing this without a labelled break is much less clear:
@withoutboats (Contributor), Jul 20, 2017

Can't this also be written?

first_container.iter().find(|v| v > 0)
    .or_else(|| second_container.iter().find(|v| v < 0 ))
    .unwrap_or(0)

@joshtriplett (Member), Jul 20, 2017

Or, for that matter

first_container.iter().chain(second_container.iter()).find(|v| v > 0).unwrap_or(0)

Contributor

They use different find functions, actually

Contributor Author

withoutboats: That's a lot neater than the iterator based solution I proposed.

joshtriplett: the find condition is different for the two iterators.

Contributor

And if you don't like or_else chains:

let x = if let Some(x) = first_container.iter().find(|v| v > 0) { x }
    else if let Some(x) = second_container.iter().find(|v| v < 0) { x }
    else { 0 };

Member

However, all these closure-based solutions have trouble if there is a return or ? also inside the loop.

Member

@withoutboats @ciphergoth 🤦‍, sorry, I missed that.

```
let mut result = None;
for &v in first_container.iter() {
    if v > 0 {
        result = Some(v);
        break;
    }
}
if result.is_none() {
    for &v in second_container.iter() {
        if v < 0 {
            result = Some(v);
            break;
        }
    }
}
let result = result.unwrap_or(0);
```
```
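The review discussion raises two points that a compilable sketch may help to compare: the search can be written with iterator adaptors, but closure-based forms get awkward once the loop body contains `?` or `return`. The functions `search_iter` and `search_parsed`, and the choice of parsing `&str` inputs to create a fallible step, are assumptions made for illustration only.

```rust
use std::num::ParseIntError;

// Iterator-based version in the spirit of the review comments.
// `copied()` turns `&i32` items into `i32` so the comparisons type-check.
fn search_iter(first: &[i32], second: &[i32]) -> i32 {
    first
        .iter()
        .copied()
        .find(|&v| v > 0)
        .or_else(|| second.iter().copied().find(|&v| v < 0))
        .unwrap_or(0)
}

// Labelled-block version with a fallible step inside each loop.
// `?` returns from `search_parsed` itself, which would not be the case
// if the loop body lived inside a closure passed to `find`.
fn search_parsed(first: &[&str], second: &[&str]) -> Result<i32, ParseIntError> {
    let result = 'block: {
        for s in first {
            let v: i32 = s.parse()?;
            if v > 0 {
                break 'block v;
            }
        }
        for s in second {
            let v: i32 = s.parse()?;
            if v < 0 {
                break 'block v;
            }
        }
        0
    };
    Ok(result)
}
```

Both functions implement the same search order: the first positive value of the first input, otherwise the first negative value of the second, otherwise 0.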

# Detailed design
[design]: #detailed-design
```
'BLOCK_LABEL: { EXPR }
```
would simply be syntactic sugar for
```
'BLOCK_LABEL: loop { break { EXPR } }
```
except that unlabelled `break`s or `continue`s which would bind to the implicit `loop` are forbidden inside the *EXPR*.
Member

At first I was worried about forbidding unlabelled breaks or continues, since it would mean that adding a fresh block label to existing code could cause that code to stop compiling, like so:

Old code that compiles:

loop {
    {  
        let raii_thing = Destro::new();
        if gi_joe() { break; }
    }
    println!("no more Destro here!");
}

New code that would not compile:

loop {
    'fresh_unused_label_for_expository_text_describing_raii: {
        let raii_thing = Destro::new();
        if gi_joe() { break; }
        //~^ ERROR: unlabelled break within labelled block with no intervening loop syntax.
    }
    println!("no more Destro here!");
}

However, after a bit of thought (and reading over the alternatives section that explicitly explains why this RFC is choosing this conservative approach), I decided that forbidding unlabelled breaks/continues is acceptable as long as the rustc error messages suggest to the user that they explicitly label their break target in such a case.

E.g., I want the compiler, when given the above code that fails to compile, to suggest to the programmer that they write something analogous to the following:

'escape_hatch: loop {
    'fresh_unused_label_for_expository_text_describing_raii: {
        let raii_thing = Destro::new();
        if gi_joe() { break 'escape_hatch; }
    }
    println!("no more Destro here!");
}

I suspect a high quality error message in such cases will be required for the success of this feature, so I'd go so far as to say that the RFC itself should mention it.

Contributor Author

Thanks! I've had a go at adding some language, let me know what you think. Note that I'm offline from Sunday 27th to Tuesday 5th.


This is perhaps not a conceptually simpler thing, but it has the advantage that all of the wrinkles are already well understood as a result of the work that went into RFC 1624. If *EXPR* contains explicit `break` statements as well as the implicit one, the compiler must be able to infer a single concrete type from the expressions in all of these `break` statements, including the whole of *EXPR*; this concrete type will be the type of the expression that the labelled block represents.
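To make the desugaring concrete, here is a sketch that writes the same logic twice: once as a labelled block and once as the `loop { break { EXPR } }` form it is described as sugar for. The function names and the string results are hypothetical. In both versions every `break` value and the tail expression must agree on one type, here `&'static str`.

```rust
// Labelled-block form: the explicit `break` values and the tail
// expression "positive" all have type `&'static str`.
fn with_labelled_block(n: i32) -> &'static str {
    'label: {
        if n < 0 {
            break 'label "negative";
        }
        if n == 0 {
            break 'label "zero";
        }
        "positive"
    }
}

// Hand-written equivalent using the desugaring from the text.
// The loop body runs at most once because it always ends in `break`.
fn with_loop_desugaring(n: i32) -> &'static str {
    'label: loop {
        break {
            if n < 0 {
                break 'label "negative";
            }
            if n == 0 {
                break 'label "zero";
            }
            "positive"
        };
    }
}
```

Replacing one of the `break` values with, say, an integer would make both versions fail to type-check, which is the single-concrete-type requirement described above.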

# How We Teach This
[how-we-teach-this]: #how-we-teach-this

This can be taught alongside loop-based examples of labelled breaks.

# Drawbacks
[drawbacks]: #drawbacks

The proposal adds new syntax to blocks, requiring updates to parsers and possibly syntax highlighters.

# Alternatives
[alternatives]: #alternatives

Everything that can be done with this feature can be done without it. However, in my own code, I often find myself breaking something out into a function simply in order to return early, and the accompanying verbosity of passing parameters and return values with full type signatures is a real cost.

Another alternative would be to revisit one of the proposals to add syntax to `for` and `while`.

We have three options for handling an unlabelled `break` or `continue` inside a labelled block:

- compile error on both `break` and `continue`
- bind `break` to the labelled block, compile error on `continue`
- bind `break` and `continue` through the labelled block to a containing `loop`/`while`/`for`

This RFC chooses the first option since it's the most conservative, in that it would be possible to switch to a different behaviour later without breaking working programs. The second is the simplest, but makes a large difference between labelled and unlabelled blocks, and means that a label might be used even when it's never referred to. The third is consistent with unlabelled blocks and with Java, but seems like a rich potential source of confusion.
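Under the first option, code inside a labelled block that wants to leave or restart an enclosing loop has to name that loop. The following sketch shows what that looks like; the function `sum_until_sentinel`, the label names, and the capping rule are all hypothetical.

```rust
// Hypothetical example: doubles each positive value (capped at 200),
// skips zeros, and stops at the first negative value.
fn sum_until_sentinel(values: &[i32]) -> i32 {
    let mut total = 0;
    'outer: for &v in values {
        let doubled = 'block: {
            if v < 0 {
                // A bare `break` here is rejected under option 1, so the
                // enclosing loop is named explicitly.
                break 'outer;
            }
            if v == 0 {
                // Likewise, `continue` must name the loop it targets.
                continue 'outer;
            }
            if v > 100 {
                // Leaves only the block, yielding the capped value.
                break 'block 200;
            }
            v * 2
        };
        total += doubled;
    }
    total
}
```

Because every jump names its target, the reader never has to decide whether an unlabelled `break` refers to the block or to the loop.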

# Unresolved questions
[unresolved]: #unresolved-questions

None outstanding that I know about.