Commit f31234f

AST: Rearrange do to sit inside call/macrocall
`do` syntax is represented in `Expr` with the `do` outside the call. This makes some sense syntactically (`do` appears as "an operator" after the function call).

However, semantically this nesting is awkward because the lambda represented by the do block is passed to the call. The same problem occurs for the macro form `@f(x) do \n body end`, where the macro expander needs a special rule to expand nestings of the form `Expr(:do, Expr(:macrocall ...), ...)`, rearranging the expressions which are passed to this macro call rather than passing the expressions up the tree.

In this PR, we change the parsing of

    @f(x, y) do a, b\n body\n end
    f(x, y) do a, b\n body\n end

to tack the `do` onto the end of the call argument list:

    (macrocall @f x y (do (tuple a b) body))
    (call f x y (do (tuple a b) body))

This achieves the following desirable properties:

1. Content of `do` is nested inside the call, which improves the match between AST and semantics
2. Macro can be passed the syntax as-is rather than the macro expander rearranging syntax before passing it to the macro
3. In the future, a macro can detect when it's being passed do syntax rather than lambda syntax
4. `do` head is used uniformly for both call and macrocall
5. We preserve the source ordering properties we need for the green tree.
1 parent 6da0fc4 commit f31234f
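
For reference, the `Expr` nesting that the commit message describes (with `do` outside the call) can be inspected using Julia's built-in `Meta.parse`. This is a minimal sketch assuming a stock Julia session; the helper `strip_lines` is a name introduced here for illustration and is not part of JuliaSyntax or Base.

```julia
# Parse a function call and a macro call that each use `do` syntax.
call_ex  = Meta.parse("f(x) do y\n body\nend")
macro_ex = Meta.parse("@f(x) do y\n body\nend")

# Hypothetical helper: drop LineNumberNodes so the structure is easier to compare.
strip_lines(ex) = ex
strip_lines(ex::Expr) =
    Expr(ex.head, (strip_lines(a) for a in ex.args if !(a isa LineNumberNode))...)

# In both cases `do` is the outermost head, wrapping the call and an implied `->`.
call_shape  = strip_lines(call_ex)
macro_shape = strip_lines(macro_ex)
```

In both results the lambda sits next to the call rather than inside its argument list, which is the mismatch between AST and semantics that the new parse tree layout is meant to address.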

5 files changed: +93 -30 lines changed

docs/src/reference.md

Lines changed: 28 additions & 14 deletions
@@ -43,7 +43,7 @@ the source text more closely.
 * The right hand side of `x where {T}` retains the `K"braces"` node around the `T` to distinguish it from `x where T`.
 * Ternary syntax is not immediately lowered to an `if` node: `a ? b : c` parses as `(? a b c)` rather than `Expr(:if, :a, :b, :c)` (#85)
 * `global const` and `const global` are not normalized by the parser. This is done in `Expr` conversion (#130)
-* The AST for `do` is flatter and not lowered to a lambda by the parser: `f(x) do y ; body end` is parsed as `(do (call f x) (tuple y) (block body))` (#98)
+* [`do` syntax](#Do-blocks) is nested as the last child of the call which the `do` lambda will be passed to (#98, #322)
 * `@.` is not lowered to `@__dot__` inside the parser (#146)
 * Docstrings use the `K"doc"` kind, and are not lowered to `Core.@doc` until later (#217)
 * Juxtaposition uses the `K"juxtapose"` kind rather than lowering immediately to `*` (#220)
@@ -77,7 +77,6 @@ class of tokenization errors and lets the parser deal with them.
 * We use flags rather than child nodes to represent the difference between `struct` and `mutable struct`, `module` and `baremodule` (#220)
 * Multiple iterations within the header of a `for`, as in `for a=as, b=bs body end` are represented with a `cartesian_iterator` head rather than a `block`, as these lists of iterators are neither semantically nor syntactically a sequence of statements. Unlike other uses of `block` (see also generators).
 
-
 ## More detail on tree differences
 
 ### Generators
@@ -195,23 +194,38 @@ The same goes for command strings which are always wrapped in `K"cmdstring"`
 regardless of whether they have multiple pieces (due to triple-quoted
 dedenting) or otherwise.
 
-### No desugaring of the closure in do blocks
+### Do blocks
 
-The reference parser represents `do` syntax with a closure for the second
-argument. That is,
+`do` syntax is represented in the `Expr` AST with the `do` outside the call.
+This makes some sense syntactically (do appears as "an operator" after the
+function call).
 
-```julia
-f(x) do y
-    body
-end
-```
+However semantically this nesting is awkward because the lambda represented by
+the do block is passed to the call. This same problem occurs for the macro form
+`@f(x) do \n body end` where the macro expander needs a special rule to expand
+nestings of the form `Expr(:do, Expr(:macrocall ...), ...)`, rearranging the
+expression which are passed to this macro call rather than passing the
+expressions up the tree.
+
+The implied closure is also lowered to a nested `Expr(:->)` expression, though
+it this somewhat premature to do this during parsing.
+
+To resolve these problems we parse
+
+    @f(x, y) do a, b\n body\n end
+    f(x, y) do a, b\n body\n end
 
-becomes `(do (call f x) (-> (tuple y) (block body)))` in the reference parser.
+by tacking the `do` onto the end of the call argument list:
 
-However, the nested closure with `->` head is implied here rather than present
-in the surface syntax, which suggests this is a premature desugaring step.
-Instead we emit the flatter structure `(do (call f x) (tuple y) (block body))`.
+    (macrocall @f x y (do (tuple a b) body))
+    (call f x y (do (tuple a b) body))
 
+This achieves the following desirable properties
+1. Content of `do` is nested inside the call which improves the match between AST and semantics
+2. Macro can be passed the syntax as-is rather than the macro expander rearranging syntax before passing it to the macro
+3. In the future, a macro can detect when it's being passed do syntax rather than lambda syntax
+4. `do` head is used uniformly for both call and macrocall
+5. We preserve the source ordering properties we need for the green tree.
 
 ## Tree structure reference
 
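
The "special rule" mentioned in the new docs text, which rearranges `Expr(:do, Expr(:macrocall ...), ...)` before the macro is invoked, can be pictured with a small pure function over `Expr`. This is only a sketch: `hoist_do_lambda` is a hypothetical name, and the placement of the lambda as the first macro argument is an assumption about current expander behaviour that this commit does not itself confirm.

```julia
# Sketch of the rearrangement for Expr(:do, Expr(:macrocall, ...), lambda).
# Assumption: the lambda becomes the first macro argument, directly after the
# macro name and the LineNumberNode.
function hoist_do_lambda(ex::Expr)
    if ex.head === :do && length(ex.args) == 2 && Meta.isexpr(ex.args[1], :macrocall)
        mc, lambda = ex.args
        return Expr(:macrocall, mc.args[1], mc.args[2], lambda, mc.args[3:end]...)
    end
    return ex  # anything else is passed through unchanged
end

# Build the `Expr` form of `@f(x) do y body end` by hand and rearrange it.
loc    = LineNumberNode(1)
lambda = Expr(:->, Expr(:tuple, :y), Expr(:block, :body))
do_ex  = Expr(:do, Expr(:macrocall, Symbol("@f"), loc, :x), lambda)
hoisted = hoist_do_lambda(do_ex)
```

With the `do` nested inside the `macrocall` node in the parse tree, a rule of this kind no longer needs to reach across two levels of nesting.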
src/expr.jl

Lines changed: 21 additions & 2 deletions
@@ -184,6 +184,16 @@ function _fixup_Expr_children!(head, loc, args)
     return args
 end
 
+# Remove the `do` block from the final position in a function/macro call arg list
+function _extract_do_lambda!(args)
+    if length(args) > 1 && Meta.isexpr(args[end], :do_lambda)
+        do_ex = pop!(args)::Expr
+        return Expr(:->, do_ex.args...)
+    else
+        return nothing
+    end
+end
+
 # Convert internal node of the JuliaSyntax parse tree to an Expr
 function _internal_node_to_Expr(source, srcrange, head, childranges, childheads, args)
     k = kind(head)
@@ -217,8 +227,12 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             end
         end
     elseif k == K"macrocall"
+        do_lambda = _extract_do_lambda!(args)
         _reorder_parameters!(args, 2)
         insert!(args, 2, loc)
+        if do_lambda isa Expr
+            return Expr(:do, Expr(headsym, args...), do_lambda)
+        end
     elseif k == K"block" || (k == K"toplevel" && !has_flags(head, TOPLEVEL_SEMICOLONS_FLAG))
         if isempty(args)
             push!(args, loc)
@@ -247,6 +261,7 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             popfirst!(args)
             headsym = Symbol("'")
         end
+        do_lambda = _extract_do_lambda!(args)
         # Move parameters blocks to args[2]
         _reorder_parameters!(args, 2)
         if headsym === :dotcall
@@ -259,6 +274,9 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
                 args[1] = Symbol(".", args[1])
             end
         end
+        if do_lambda isa Expr
+            return Expr(:do, Expr(headsym, args...), do_lambda)
+        end
     elseif k == K"." && length(args) == 1 && is_operator(childheads[1])
         # Hack: Here we preserve the head of the operator to determine whether
         # we need to coalesce it with the dot into a single symbol later on.
@@ -395,8 +413,9 @@ function _internal_node_to_Expr(source, srcrange, head, childranges, childheads,
             # as inert QuoteNode rather than in `Expr(:quote)` quasiquote
             return QuoteNode(a1)
         end
-    elseif k == K"do" && length(args) == 3
-        return Expr(:do, args[1], Expr(:->, args[2], args[3]))
+    elseif k == K"do"
+        # Temporary head which is picked up by _extract_do_lambda
+        headsym = :do_lambda
     elseif k == K"let"
         a1 = args[1]
         if @isexpr(a1, :block)
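
The conversion in `src/expr.jl` works in two steps: the `K"do"` child is first given the temporary head `:do_lambda`, and the enclosing call then pops it off its argument list and rebuilds the `Expr(:do, call, lambda)` nesting. The following is a standalone sketch of that second step on a plain argument vector. `extract_do_lambda!` is a simplified copy written for illustration, not the function from the diff, and the hand-built `args` vector stands in for what the real conversion would produce.

```julia
# Standalone copy of the idea behind `_extract_do_lambda!`, operating on a plain
# argument vector. `:do_lambda` is the temporary head used in the diff above.
function extract_do_lambda!(args::Vector{Any})
    if length(args) > 1 && Meta.isexpr(args[end], :do_lambda)
        do_ex = pop!(args)::Expr
        return Expr(:->, do_ex.args...)
    end
    return nothing
end

# Children of `(call f x (do (tuple y) (block body)))` after conversion to Expr.
args = Any[:f, :x, Expr(:do_lambda, Expr(:tuple, :y), Expr(:block, :body))]
lambda = extract_do_lambda!(args)

# Rebuild the `Expr` nesting with `do` on the outside.
result = lambda === nothing ? Expr(:call, args...) :
                              Expr(:do, Expr(:call, args...), lambda)
```

Note that the extraction mutates `args`, so in the diff it runs before `_reorder_parameters!` and the remaining arguments are processed as for an ordinary call.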

src/parser.jl

Lines changed: 9 additions & 8 deletions
@@ -1510,12 +1510,12 @@ function parse_call_chain(ps::ParseState, mark, is_macrocall=false)
             bump_disallowed_space(ps)
             bump(ps, TRIVIA_FLAG)
             parse_call_arglist(ps, K")")
-            emit(ps, mark, is_macrocall ? K"macrocall" : K"call",
-                 is_macrocall ? PARENS_FLAG : EMPTY_FLAGS)
             if peek(ps) == K"do"
-                # f(x) do y body end ==> (do (call f x) (tuple y) (block body))
-                parse_do(ps, mark)
+                # f(x) do y body end ==> (call f x (do (tuple y) (block body)))
+                parse_do(ps)
             end
+            emit(ps, mark, is_macrocall ? K"macrocall" : K"call",
+                 is_macrocall ? PARENS_FLAG : EMPTY_FLAGS)
             if is_macrocall
                 # @x(a, b) ==> (macrocall-p @x a b)
                 # A.@x(y) ==> (macrocall-p (. A (quote @x)) y)
@@ -2274,18 +2274,19 @@ function parse_catch(ps::ParseState)
 end
 
 # flisp: parse-do
-function parse_do(ps::ParseState, mark)
+function parse_do(ps::ParseState)
+    mark = position(ps)
     bump(ps, TRIVIA_FLAG) # do
     ps = normal_context(ps)
     m = position(ps)
     if peek(ps) in KSet"NewlineWs ;"
-        # f() do\nend ==> (do (call f) (tuple) (block))
-        # f() do ; body end ==> (do (call f) (tuple) (block body))
+        # f() do\nend ==> (call f (do (tuple) (block)))
+        # f() do ; body end ==> (call f (do (tuple) (block body)))
         # this trivia needs to go into the tuple due to the way position()
         # works.
         bump(ps, TRIVIA_FLAG)
     else
-        # f() do x, y\n body end ==> (do (call f) (tuple x y) (block body))
+        # f() do x, y\n body end ==> (call f (do (tuple x y) (block body)))
         parse_comma_separated(ps, parse_range)
     end
     emit(ps, m, K"tuple")

test/expr.jl

Lines changed: 30 additions & 2 deletions
@@ -299,11 +299,39 @@
 
     @testset "do block conversion" begin
         @test parsestmt("f(x) do y\n body end") ==
-            Expr(:do, Expr(:call, :f, :x),
+            Expr(:do,
+                 Expr(:call, :f, :x),
                  Expr(:->, Expr(:tuple, :y),
                       Expr(:block,
                            LineNumberNode(2),
                            :body)))
+
+        @test parsestmt("@f(x) do y body end") ==
+            Expr(:do,
+                 Expr(:macrocall, Symbol("@f"), LineNumberNode(1), :x),
+                 Expr(:->, Expr(:tuple, :y),
+                      Expr(:block,
+                           LineNumberNode(1),
+                           :body)))
+
+        @test parsestmt("f(x; a=1) do y body end") ==
+            Expr(:do,
+                 Expr(:call, :f, Expr(:parameters, Expr(:kw, :a, 1)), :x),
+                 Expr(:->, Expr(:tuple, :y),
+                      Expr(:block,
+                           LineNumberNode(1),
+                           :body)))
+
+        # Test calls with do inside them
+        @test parsestmt("g(f(x) do y\n body end)") ==
+            Expr(:call,
+                 :g,
+                 Expr(:do,
+                      Expr(:call, :f, :x),
+                      Expr(:->, Expr(:tuple, :y),
+                           Expr(:block,
+                                LineNumberNode(2),
+                                :body))))
     end
 
     @testset "= to Expr(:kw) conversion" begin
@@ -701,7 +729,7 @@
         @test parsestmt("(x", ignore_errors=true) ==
             Expr(:block, :x, Expr(:error))
         @test parsestmt("x do", ignore_errors=true) ==
-            Expr(:block, :x, Expr(:error, Expr(:do)))
+            Expr(:block, :x, Expr(:error, Expr(:do_lambda)))
     end
 
     @testset "import" begin

test/parser.jl

Lines changed: 5 additions & 4 deletions
@@ -355,10 +355,11 @@ tests = [
         "A.@x(y)" => "(macrocall-p (. A (quote @x)) y)"
         "A.@x(y).z" => "(. (macrocall-p (. A (quote @x)) y) (quote z))"
         # do
-        "f() do\nend" => "(do (call f) (tuple) (block))"
-        "f() do ; body end" => "(do (call f) (tuple) (block body))"
-        "f() do x, y\n body end" => "(do (call f) (tuple x y) (block body))"
-        "f(x) do y body end" => "(do (call f x) (tuple y) (block body))"
+        "f() do\nend" => "(call f (do (tuple) (block)))"
+        "f() do ; body end" => "(call f (do (tuple) (block body)))"
+        "f() do x, y\n body end" => "(call f (do (tuple x y) (block body)))"
+        "f(x) do y body end" => "(call f x (do (tuple y) (block body)))"
+        "@f(x) do y body end" => "(macrocall-p @f x (do (tuple y) (block body)))"
 
         # square brackets
         "@S[a,b]" => "(macrocall @S (vect a b))"
