Correctly parse ado without any statements in it #46

MonoidMusician · 2023-02-22T00:28:48Z

The issue was that the parser wasn't backtracking enough to abort valueParser and fall back to pure [], so instead I parse the layout tokens up front so we can commit to the right path early. Perhaps it could be more elegant.

Tested on the examples in the issue and on our work codebase. Do you want me to add a test to test/Main.purs?

natefaubion · 2023-02-22T01:16:56Z

src/PureScript/CST/Parser.purs

  where
-  values = (go =<< valueParser) <|> pure []


This seems like it should be equivalent to optional? If we had an empty layout, then valueParser should immediately fail if given a TokLayoutEnd, shouldn't it? It's not obvious to me what "backtracking far enough" means in that case, or that this isn't some other internal bug. Do we know where backtracking is halting?

MonoidMusician · 2023-02-22

Okay, I was a little hasty in ascribing it to backtracking. Putting try in there does not help.

What's really messing it up is the indentation-based parser recovery when we call it in parseAdo (this is the only usage of layout). As an example, this change manages to parse the empty ado in:

--- a/src/PureScript/CST/Parser.purs
+++ b/src/PureScript/CST/Parser.purs
@@ -673,7 +673,7 @@ parseDo = do
 parseAdo :: Parser (Recovered Expr)
 parseAdo = do
   keyword <- tokQualifiedKeyword "ado"
-  statements <- layout (recoverDoStatement parseDoStatement)
+  statements <- layout parseDoStatement
   in_ <- tokKeyword "in"
   result <- parseExpr
   pure $ ExprAdo { keyword, statements, in: in_, result }

natefaubion · 2023-02-23 (edited)

It makes sense to me that there is something odd with recovery, but I'm not sure what. In the case of ado in, I would expect recoverIndent to immediately hit a TokLayoutEnd where col == indent, resulting in empty tokens, yielding Nothing, which would propagate the error.

purescript-language-cst-parser/src/PureScript/CST/Parser.purs

Lines 1155 to 1164 in 5afff30

    let
      Tuple tokens newStream = recoverTokensWhile
        ( \tok indent -> case tok.value of
            TokLayoutEnd col -> col > indent
            TokLayoutSep col -> col > indent
            _ -> true
        )
        stream
    if Array.null tokens then
      Nothing

purescript-language-cst-parser/src/PureScript/CST/Parser/Monad.purs

Lines 149 to 151 in 5afff30

    case k error state1.stream of
      Nothing ->
        runFn2 resume (state2 { consumed = state1.consumed }) error

Maybe something is wrong with the state threading in recover. The logic of the original code seems OK to me, which makes me think this is an internal bug and not related to the specific formulation of this parser.
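
To make the expectation for the empty ado in case concrete, here is a minimal standalone model of the recovery decision. It is only a sketch under stated assumptions: Tok, shouldSkip and recoverIndentModel are hypothetical names, the token type is heavily simplified, and Array.span stands in for recoverTokensWhile, whose real signature is not shown in this thread.

```purescript
module Main where

import Prelude

import Data.Array as Array
import Data.Maybe (Maybe(..))

-- Hypothetical, simplified token type. The real parser's tokens carry
-- source ranges and many more constructors.
data Tok
  = LayoutEnd Int
  | LayoutSep Int
  | Other String

derive instance eqTok :: Eq Tok

-- Same shape as the predicate quoted above: keep skipping tokens until a
-- layout end or separator at or below the current indentation shows up.
shouldSkip :: Int -> Tok -> Boolean
shouldSkip indent = case _ of
  LayoutEnd col -> col > indent
  LayoutSep col -> col > indent
  Other _ -> true

-- Model of the recovery decision (an assumption, not the library's code):
-- Nothing means "no tokens were skipped, so propagate the original error";
-- Just carries the skipped tokens and the remaining stream.
recoverIndentModel
  :: Int
  -> Array Tok
  -> Maybe { skipped :: Array Tok, rest :: Array Tok }
recoverIndentModel indent stream =
  let
    { init, rest } = Array.span (shouldSkip indent) stream
  in
    if Array.null init then Nothing
    else Just { skipped: init, rest }
```

In this model, recoverIndentModel 2 [ LayoutEnd 2, Other "in" ] evaluates to Nothing, because the first token is a layout end at the current indentation and nothing gets skipped. That is the behaviour expected for an empty ado in; if the real parser does something else at this point, the difference would have to come from how state is threaded around the recovery rather than from the predicate itself.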

MonoidMusician · 2023-02-23

Okay, I'll push a test for this case and dig deeper.

natefaubion · 2023-02-22T01:17:55Z

I think that this should have a regression test, yes. We test against publicly available code, but this was triggered by an edge case that doesn't exist in any public code.


natefaubion · 2023-03-05T16:19:52Z

Thanks!


Correctly parse ado without any statements in it

a517079

natefaubion reviewed Feb 22, 2023


MonoidMusician added 2 commits February 24, 2023 11:57

Tests for ado/in, with empty cases and recovery

cfb0311

Inline tweaked layout helper with comments explaining why

707e8de

natefaubion reviewed Mar 5, 2023


Update src/PureScript/CST/Parser.purs

0141b02

natefaubion merged commit a7ca658 into natefaubion:main Mar 5, 2023

MonoidMusician deleted the muchadoaboutnothing branch March 5, 2023 18:14

natefaubion added a commit that referenced this pull request May 26, 2025

Cleanup ado parsing

9794747

With the fixes to indentation recovery in #63, the workaround for empty ado parsing is no longer necessary. This effectively reverts #46.

natefaubion mentioned this pull request May 26, 2025

Cleanup ado parsing #64

Merged

natefaubion added a commit that referenced this pull request May 26, 2025

Cleanup ado parsing

91beb67

With the fixes to indentation recovery in #63, the workaround for empty ado parsing is no longer necessary. This effectively reverts #46.
