Make a spec-looking reference implementation of outer grammar. #4965

Open
@phadej

  • The grammar is currently written only as a source comment:

    -- $grammar
    --
    -- @
    -- CabalStyleFile ::= SecElems
    --
    -- SecElems ::= SecElem* '\n'?
    -- SecElem ::= '\n' SecElemLayout | SecElemBraces
    -- SecElemLayout ::= FieldLayout | FieldBraces | SectionLayout | SectionBraces
    -- SecElemBraces ::= FieldInline | FieldBraces | SectionBraces
    -- FieldLayout ::= name ':' line? ('\n' line)*
    -- FieldBraces ::= name ':' '\n'? '{' content '}'
    -- FieldInline ::= name ':' content
    -- SectionLayout ::= name arg* SecElems
    -- SectionBraces ::= name arg* '\n'? '{' SecElems '}'
    -- @
    --
    -- and the same thing but left factored...
    --
    -- @
    -- SecElems ::= SecElem*
    -- SecElem ::= '\n' name SecElemLayout
    -- | name SecElemBraces
    -- SecElemLayout ::= ':' FieldLayoutOrBraces
    -- | arg* SectionLayoutOrBraces
    -- FieldLayoutOrBraces ::= '\n'? '{' content '}'
    -- | line? ('\n' line)*
    -- SectionLayoutOrBraces ::= '\n'? '{' SecElems '\n'? '}'
    -- | SecElems
    -- SecElemBraces ::= ':' FieldInlineOrBraces
    -- | arg* '\n'? '{' SecElems '\n'? '}'
    -- FieldInlineOrBraces ::= '\n'? '{' content '}'
    -- | content
    -- @
    --
    -- Note how we have several productions with the sequence:
    --
    -- > '\n'? '{'
    --
    -- That is, an optional newline (and indent) followed by a @{@ token.
    -- In the @SectionLayoutOrBraces@ case you can see that this makes it
    -- not fully left factored (because @SecElems@ can start with a @\n@).
    -- Fully left factoring here would be ugly, and though we could use a
    -- lookahead of two tokens to resolve the alternatives, we can't
    -- conveniently use Parsec's 'try' here to get a lookahead of only two.
    -- So instead we deal with this case in the lexer by making a line
    -- where the first non-space is @{@ lex as just the @{@ token, without
    -- the usual indent token. Then in the parser we can resolve everything
    -- with just one token of lookahead and so without using 'try'.

  • This is better done as a separate package in this repo (e.g. cabal-trifecta or cabal-megaparsec), because there are technical limitations to having it as a test suite in the Cabal package.
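
To make the lexer note in the grammar comment concrete, here is a minimal sketch of the "a line starting with `{` lexes as just `{`" rule. It uses only `base`. The `Token` type and the functions `lexLine`, `lexWords`, and `lexFile` are hypothetical names invented for this illustration; they are not Cabal's actual lexer, and the tokenisation of names and content is deliberately simplified.

```haskell
import Data.Char (isSpace)

-- Hypothetical token type for this sketch (not Cabal's real lexer tokens).
data Token
  = Indent Int   -- start of a line, carrying its indentation width
  | OpenBrace
  | CloseBrace
  | Colon
  | Word String  -- a name, an arg, or a piece of field content
  deriving (Eq, Show)

-- Lex a single line. A line whose first non-space character is '{'
-- produces only the brace token, without the usual Indent token.
lexLine :: String -> [Token]
lexLine l =
  case rest of
    []        -> []                 -- blank line: no tokens at all
    ('{' : _) -> lexWords rest      -- special case from the grammar note
    _         -> Indent (length spaces) : lexWords rest
  where
    (spaces, rest) = span isSpace l

-- Split the remainder of a line into punctuation and words.
lexWords :: String -> [Token]
lexWords s =
  case dropWhile isSpace s of
    []         -> []
    ('{' : cs) -> OpenBrace : lexWords cs
    ('}' : cs) -> CloseBrace : lexWords cs
    (':' : cs) -> Colon : lexWords cs
    cs         ->
      let (w, cs') = break (\c -> isSpace c || c `elem` "{}:") cs
      in Word w : lexWords cs'

-- Lex a whole file line by line.
lexFile :: String -> [Token]
lexFile = concatMap lexLine . lines
```

With this sketch, `lexFile "library\n  {"` yields `[Indent 0, Word "library", OpenBrace]`: the brace on its own line arrives with no `Indent` in front of it, so a parser for `'\n'? '{'` can decide between the braces and layout alternatives with a single token of lookahead.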

cc @sboosali
