Commit 3fe86d3

AST: Rearrange do to sit inside call/macrocall (#322)
`do` syntax is represented in `Expr` with the `do` outside the call. This
makes some sense syntactically (do appears as "an operator" after the
function call).

However, semantically this nesting is awkward because the lambda represented
by the do block is passed to the call. The same problem occurs for the macro
form `@f(x) do \n body end`, where the macro expander needs a special rule to
expand nestings of the form `Expr(:do, Expr(:macrocall ...), ...)`,
rearranging the expressions which are passed to this macro call rather than
passing the expressions up the tree.

In this PR, we change the parsing of

    @f(x, y) do a, b\n body\n end
    f(x, y) do a, b\n body\n end

to tack the `do` onto the end of the call argument list:

    (macrocall @f x y (do (tuple a b) body))
    (call f x y (do (tuple a b) body))

This achieves the following desirable properties:

1. Content of `do` is nested inside the call, which improves the match between AST and semantics
2. A macro can be passed the syntax as-is, rather than the macro expander rearranging syntax before passing it to the macro
3. In the future, a macro can detect when it's being passed do syntax rather than lambda syntax
4. The `do` head is used uniformly for both call and macrocall
5. We preserve the source ordering properties we need for the green tree.
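For reference, the `Expr` nesting described above can be inspected with Julia's built-in `Meta.parse`. This is a minimal sketch using only Base; the variable names are illustrative and not part of the commit.

```julia
# Parse a `do` block attached to an ordinary call.
ex = Meta.parse("f(x) do y\n body\nend")

# In `Expr`, the `do` head sits outside the call: the call is the first
# child and the implied closure (`->`) is the second.
@assert ex.head === :do
call_ex, lambda_ex = ex.args
@assert call_ex == Expr(:call, :f, :x)
@assert lambda_ex.head === :->
@assert lambda_ex.args[1] == Expr(:tuple, :y)

# The closure body is a block; strip line number nodes before comparing.
body = Base.remove_linenums!(deepcopy(lambda_ex.args[2]))
@assert body == Expr(:block, :body)

# The macro form has the same shape, with a `macrocall` as the first child.
mex = Meta.parse("@f(x) do y\n body\nend")
@assert mex.head === :do
@assert mex.args[1].head === :macrocall
```

In both cases the call that receives the closure is a child of `do`, which is the nesting this commit inverts in the JuliaSyntax tree while leaving the `Expr` form unchanged.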
1 parent 296cd5e commit 3fe86d3

5 files changed, with 93 additions and 30 deletions.

docs/src/reference.md

Lines changed: 28 additions & 14 deletions
@@ -43,7 +43,7 @@ the source text more closely.
 * The right hand side of `x where {T}` retains the `K"braces"` node around the `T` to distinguish it from `x where T`.
 * Ternary syntax is not immediately lowered to an `if` node: `a ? b : c` parses as `(? a b c)` rather than `Expr(:if, :a, :b, :c)` (#85)
 * `global const` and `const global` are not normalized by the parser. This is done in `Expr` conversion (#130)
-* The AST for `do` is flatter and not lowered to a lambda by the parser: `f(x) do y ; body end` is parsed as `(do (call f x) (tuple y) (block body))` (#98)
+* [`do` syntax](#Do-blocks) is nested as the last child of the call which the `do` lambda will be passed to (#98, #322)
 * `@.` is not lowered to `@__dot__` inside the parser (#146)
 * Docstrings use the `K"doc"` kind, and are not lowered to `Core.@doc` until later (#217)
 * Juxtaposition uses the `K"juxtapose"` kind rather than lowering immediately to `*` (#220)
@@ -78,7 +78,6 @@ class of tokenization errors and lets the parser deal with them.
 * We use flags rather than child nodes to represent the difference between `struct` and `mutable struct`, `module` and `baremodule` (#220)
 * Multiple iterations within the header of a `for`, as in `for a=as, b=bs body end` are represented with a `cartesian_iterator` head rather than a `block`, as these lists of iterators are neither semantically nor syntactically a sequence of statements. Unlike other uses of `block` (see also generators).
 
-
 ## More detail on tree differences
 
 ### Generators
@@ -196,23 +195,38 @@ The same goes for command strings which are always wrapped in `K"cmdstring"`
 regardless of whether they have multiple pieces (due to triple-quoted
 dedenting) or otherwise.
 
-### No desugaring of the closure in do blocks
+### Do blocks
 
-The reference parser represents `do` syntax with a closure for the second
-argument. That is,
+`do` syntax is represented in the `Expr` AST with the `do` outside the call.
+This makes some sense syntactically (do appears as "an operator" after the
+function call).
 
-```julia
-f(x) do y
-    body
-end
-```
+However semantically this nesting is awkward because the lambda represented by
+the do block is passed to the call. This same problem occurs for the macro form
+`@f(x) do \n body end` where the macro expander needs a special rule to expand
+nestings of the form `Expr(:do, Expr(:macrocall ...), ...)`, rearranging the
+expressions which are passed to this macro call rather than passing the
+expressions up the tree.
+
+The implied closure is also lowered to a nested `Expr(:->)` expression, though
+it is somewhat premature to do this during parsing.
+
+To resolve these problems we parse
+
+    @f(x, y) do a, b\n body\n end
+    f(x, y) do a, b\n body\n end
 
-becomes `(do (call f x) (-> (tuple y) (block body)))` in the reference parser.
+by tacking the `do` onto the end of the call argument list:
 
-However, the nested closure with `->` head is implied here rather than present
-in the surface syntax, which suggests this is a premature desugaring step.
-Instead we emit the flatter structure `(do (call f x) (tuple y) (block body))`.
+    (macrocall @f x y (do (tuple a b) body))
+    (call f x y (do (tuple a b) body))
 
+This achieves the following desirable properties
+1. Content of `do` is nested inside the call which improves the match between AST and semantics
+2. Macro can be passed the syntax as-is rather than the macro expander rearranging syntax before passing it to the macro
+3. In the future, a macro can detect when it's being passed do syntax rather than lambda syntax
+4. `do` head is used uniformly for both call and macrocall
+5. We preserve the source ordering properties we need for the green tree.
 
 ## Tree structure reference
 

src/expr.jl

Lines changed: 21 additions & 2 deletions
@@ -184,6 +184,16 @@ function _fixup_Expr_children!(head, loc, args)
     return args
 end
 
+# Remove the `do` block from the final position in a function/macro call arg list
+function _extract_do_lambda!(args)
+    if length(args) > 1 && Meta.isexpr(args[end], :do_lambda)
+        do_ex = pop!(args)::Expr
+        return Expr(:->, do_ex.args...)
+    else
+        return nothing
+    end
+end
+
 # Convert internal node of the JuliaSyntax parse tree to an Expr
 function _internal_node_to_Expr(source, srcrange, head, childranges, childheads, args)
     k = kind(head)
@@ -217,8 +227,12 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             end
         end
     elseif k == K"macrocall"
+        do_lambda = _extract_do_lambda!(args)
         _reorder_parameters!(args, 2)
         insert!(args, 2, loc)
+        if do_lambda isa Expr
+            return Expr(:do, Expr(headsym, args...), do_lambda)
+        end
     elseif k == K"block" || (k == K"toplevel" && !has_flags(head, TOPLEVEL_SEMICOLONS_FLAG))
         if isempty(args)
             push!(args, loc)
@@ -247,6 +261,7 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             popfirst!(args)
             headsym = Symbol("'")
         end
+        do_lambda = _extract_do_lambda!(args)
         # Move parameters blocks to args[2]
         _reorder_parameters!(args, 2)
         if headsym === :dotcall
@@ -259,6 +274,9 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
                 args[1] = Symbol(".", args[1])
             end
         end
+        if do_lambda isa Expr
+            return Expr(:do, Expr(headsym, args...), do_lambda)
+        end
     elseif k == K"."
         if length(args) == 2
             a2 = args[2]
@@ -402,8 +420,9 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             # as inert QuoteNode rather than in `Expr(:quote)` quasiquote
             return QuoteNode(a1)
         end
-    elseif k == K"do" && length(args) == 3
-        return Expr(:do, args[1], Expr(:->, args[2], args[3]))
+    elseif k == K"do"
+        # Temporary head which is picked up by _extract_do_lambda
+        headsym = :do_lambda
     elseif k == K"let"
         a1 = args[1]
         if @isexpr(a1, :block)
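
The conversion step in `src/expr.jl` can be tried in isolation on plain `Expr` values. The sketch below copies the logic of `_extract_do_lambda!` from the diff; the wrapper `to_legacy_expr` is a hypothetical helper introduced here for illustration and does not exist in JuliaSyntax.

```julia
# Same logic as `_extract_do_lambda!` in the diff: pop a trailing
# `:do_lambda` node from the argument list and turn it into an `->` expression.
function extract_do_lambda!(args)
    if length(args) > 1 && Meta.isexpr(args[end], :do_lambda)
        do_ex = pop!(args)::Expr
        return Expr(:->, do_ex.args...)
    else
        return nothing
    end
end

# Hypothetical helper (not in the commit): rebuild the `Expr` nesting with
# `do` on the outside, as the `K"call"` and `K"macrocall"` branches do.
function to_legacy_expr(headsym::Symbol, args::Vector{Any})
    do_lambda = extract_do_lambda!(args)
    call = Expr(headsym, args...)
    return do_lambda isa Expr ? Expr(:do, call, do_lambda) : call
end

# Children of `(call f x (do (tuple y) (block body)))` after the `do` node
# has been given the temporary `:do_lambda` head.
args = Any[:f, :x, Expr(:do_lambda, Expr(:tuple, :y), Expr(:block, :body))]
ex = to_legacy_expr(:call, args)

# A call without a trailing `do` passes through unchanged.
plain = to_legacy_expr(:call, Any[:f, :x])
```

Here `ex` has the shape `Expr(:do, Expr(:call, :f, :x), Expr(:->, ...))`, matching what the tests in `test/expr.jl` expect, while `plain` is just `Expr(:call, :f, :x)`.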

src/parser.jl

Lines changed: 9 additions & 8 deletions
@@ -1510,12 +1510,12 @@ function parse_call_chain(ps::ParseState, mark, is_macrocall=false)
             bump_disallowed_space(ps)
             bump(ps, TRIVIA_FLAG)
             parse_call_arglist(ps, K")")
-            emit(ps, mark, is_macrocall ? K"macrocall" : K"call",
-                 is_macrocall ? PARENS_FLAG : EMPTY_FLAGS)
             if peek(ps) == K"do"
-                # f(x) do y body end ==> (do (call f x) (tuple y) (block body))
-                parse_do(ps, mark)
+                # f(x) do y body end ==> (call f x (do (tuple y) (block body)))
+                parse_do(ps)
             end
+            emit(ps, mark, is_macrocall ? K"macrocall" : K"call",
+                 is_macrocall ? PARENS_FLAG : EMPTY_FLAGS)
             if is_macrocall
                 # @x(a, b) ==> (macrocall-p @x a b)
                 # A.@x(y) ==> (macrocall-p (. A @x) y)
@@ -2266,18 +2266,19 @@ function parse_catch(ps::ParseState)
 end
 
 # flisp: parse-do
-function parse_do(ps::ParseState, mark)
+function parse_do(ps::ParseState)
+    mark = position(ps)
     bump(ps, TRIVIA_FLAG) # do
     ps = normal_context(ps)
     m = position(ps)
     if peek(ps) in KSet"NewlineWs ;"
-        # f() do\nend ==> (do (call f) (tuple) (block))
-        # f() do ; body end ==> (do (call f) (tuple) (block body))
+        # f() do\nend ==> (call f (do (tuple) (block)))
+        # f() do ; body end ==> (call f (do (tuple) (block body)))
         # this trivia needs to go into the tuple due to the way position()
         # works.
         bump(ps, TRIVIA_FLAG)
     else
-        # f() do x, y\n body end ==> (do (call f) (tuple x y) (block body))
+        # f() do x, y\n body end ==> (call f (do (tuple x y) (block body)))
         parse_comma_separated(ps, parse_range)
     end
     emit(ps, m, K"tuple")

test/expr.jl

Lines changed: 30 additions & 2 deletions
@@ -296,11 +296,39 @@
 
     @testset "do block conversion" begin
         @test parsestmt("f(x) do y\n body end") ==
-            Expr(:do, Expr(:call, :f, :x),
+            Expr(:do,
+                 Expr(:call, :f, :x),
                  Expr(:->, Expr(:tuple, :y),
                       Expr(:block,
                            LineNumberNode(2),
                            :body)))
+
+        @test parsestmt("@f(x) do y body end") ==
+            Expr(:do,
+                 Expr(:macrocall, Symbol("@f"), LineNumberNode(1), :x),
+                 Expr(:->, Expr(:tuple, :y),
+                      Expr(:block,
+                           LineNumberNode(1),
+                           :body)))
+
+        @test parsestmt("f(x; a=1) do y body end") ==
+            Expr(:do,
+                 Expr(:call, :f, Expr(:parameters, Expr(:kw, :a, 1)), :x),
+                 Expr(:->, Expr(:tuple, :y),
+                      Expr(:block,
+                           LineNumberNode(1),
+                           :body)))
+
+        # Test calls with do inside them
+        @test parsestmt("g(f(x) do y\n body end)") ==
+            Expr(:call,
+                 :g,
+                 Expr(:do,
+                      Expr(:call, :f, :x),
+                      Expr(:->, Expr(:tuple, :y),
+                           Expr(:block,
+                                LineNumberNode(2),
+                                :body))))
     end
 
     @testset "= to Expr(:kw) conversion" begin
@@ -708,7 +736,7 @@
         @test parsestmt("(x", ignore_errors=true) ==
             Expr(:block, :x, Expr(:error))
         @test parsestmt("x do", ignore_errors=true) ==
-            Expr(:block, :x, Expr(:error, Expr(:do)))
+            Expr(:block, :x, Expr(:error, Expr(:do_lambda)))
     end
 
     @testset "import" begin

test/parser.jl

Lines changed: 5 additions & 4 deletions
@@ -355,10 +355,11 @@ tests = [
         "A.@x(y)" => "(macrocall-p (. A @x) y)"
         "A.@x(y).z" => "(. (macrocall-p (. A @x) y) z)"
         # do
-        "f() do\nend" => "(do (call f) (tuple) (block))"
-        "f() do ; body end" => "(do (call f) (tuple) (block body))"
-        "f() do x, y\n body end" => "(do (call f) (tuple x y) (block body))"
-        "f(x) do y body end" => "(do (call f x) (tuple y) (block body))"
+        "f() do\nend" => "(call f (do (tuple) (block)))"
+        "f() do ; body end" => "(call f (do (tuple) (block body)))"
+        "f() do x, y\n body end" => "(call f (do (tuple x y) (block body)))"
+        "f(x) do y body end" => "(call f x (do (tuple y) (block body)))"
+        "@f(x) do y body end" => "(macrocall-p @f x (do (tuple y) (block body)))"
 
         # square brackets
         "@S[a,b]" => "(macrocall @S (vect a b))"
