Skip to content
This repository was archived by the owner on Nov 9, 2025. It is now read-only.

Conversation

@ammkrn
Copy link
Contributor

@ammkrn ammkrn commented Jun 19, 2021

It's not uncommon for EBNF (or just patterns using | for alternation)
to allow a leading pipe. Ungram currently does not allow this; the
current behavior is to fail and return a generic parse error.

This change allows the use of a leading pipe for rules that have
more than one production/RHS. For example, this would now be valid:

A ::=
  | 'tok1'
  | 'tok2'

@ammkrn
Copy link
Contributor Author

ammkrn commented Jun 19, 2021

A side effect is that

AssocTypeArg =
  NameRef ( | ':' TypeBoundList | '=' Type)

would also be allowed. I think this is fairly benign, but preventing is an easy fix if this is undesirable.

@matklad
Copy link
Contributor

matklad commented Jun 19, 2021

I'd rather not do this -- one of the (informal) design values for ungrammar is to be tight rather than loose. For example, we use a very tight definition for whitespace and idents:

fn is_whitespace(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n')
}
fn is_ident_char(c: char) -> bool {
    matches!(c, 'a'..='z' | 'A'..='Z' | '_')
}

So, having "optional" elements doesn't sit well with the current design. It's not that there's something objectively wrong with allowing leading | -- it's fine! It's just that, overall, the current philosophy is that there should be only one way to do it.

@bjorn3
Copy link

bjorn3 commented Jun 19, 2021

Can we require the leading pipe then?

@matklad
Copy link
Contributor

matklad commented Jun 19, 2021

That creates another false choice: a and | a mean the same thing. More practically, leading | was allowed in Rust at some point, but I've almost never seen people actually taking advantage of that

@ammkrn
Copy link
Contributor Author

ammkrn commented Jun 19, 2021

I'll settle for a more detailed error message:

fn rule(p: &mut Parser) -> Result<Rule> {
    if let Some(lexer::Token { kind: TokenKind::Pipe, loc }) = p.peek() {
        bail!(
            *loc, 
            "The first element in a sequence of productions or alternatives \
            must not have a leading pipe (`|`)"
        );
    }
   ...
}

@matklad
Copy link
Contributor

matklad commented Jun 19, 2021

That would be lovely!

@ammkrn ammkrn closed this Jun 19, 2021
@ammkrn ammkrn deleted the leading_pipe branch June 19, 2021 16:41
bors bot added a commit that referenced this pull request Jun 19, 2021
35: More specific error for leading pipes r=matklad a=ammkrn

Re: #34; return a more detailed error message when counting an improperly placed leading pipe.

Co-authored-by: ammkrn <ammkrn@tuta.io>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants