This repository was archived by the owner on Nov 20, 2020. It is now read-only.

Commit 0b562ba

Author: hanjos
scanner: removed scanner.lua and related files, putting its functionality in parser.lua; updated documentation and examples
1 parent d947b36 commit 0b562ba
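
For code written against the removed scanner module, the renames visible in this diff can be applied mechanically. The following is a minimal sketch, not part of Leg: `migrate` and `renames` are hypothetical names, and only `scanner.ID` becoming `parser.IDENTIFIER` is confirmed as a changed identifier here. Other names that appear in the diff (`COMMENT`, `ANY`, `SPACE`, `NUMBER`, `comment2text`) keep their spelling; anything not shown in the diff is an assumption.

```lua
-- Hypothetical helper (not part of Leg): rewrites `scanner.X` references
-- to `parser.X`, using only the renames visible in this commit's diff.
local renames = {
  ID = 'IDENTIFIER',  -- scanner.ID became parser.IDENTIFIER
}

local function migrate(source)
  -- %f[%w_] is a frontier: it keeps names such as `myscanner.ID` untouched
  local result = source:gsub('%f[%w_]scanner%.([%a_][%w_]*)', function (name)
    return 'parser.' .. (renames[name] or name)
  end)
  return result
end

print(migrate('patt = (patt + Stat + scanner.ANY)^0'))
--> patt = (patt + Stat + parser.ANY)^0
print(migrate('ID = scanner.ID'))
--> ID = parser.IDENTIFIER
```

The `require 'leg.scanner'` line itself is not rewritten by this sketch and would need to be changed to `require 'leg.parser'` by hand.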

18 files changed: 1047 additions and 1315 deletions

Makefile

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
#
22
# Makefile for Leg
3-
# $Id: Makefile,v 1.7 2007/11/26 18:41:51 hanjos Exp $
3+
# $Id: Makefile,v 1.8 2007/12/07 14:23:56 hanjos Exp $
44
#
55

66
# ===== LUA PATHS =================
@@ -10,7 +10,7 @@ LUA_LIB = /usr/local/share/lua/5.1
1010
# ===== PROJECT INFO ==============
1111
# project info
1212
NAME = leg
13-
VERSION = 0.1.2
13+
VERSION = 0.2
1414

1515
# project directories
1616
DOC_DIR = doc
@@ -31,7 +31,7 @@ install:
3131
# copying the source files to LUA_LIB
3232
mkdir -p $(LUA_LIB)/$(NAME)
3333
rm -f $(LUA_LIB)/$(NAME)/*.lua
34-
cp src/*.lua $(LUA_LIB)/$(NAME)
34+
cp -r src/* $(LUA_LIB)/$(NAME)
3535

3636
clean:
3737
# removing the source files and package

doc/grammar.html

Lines changed: 96 additions & 26 deletions
Original file line number | Diff line number | Diff line change
@@ -19,8 +19,8 @@
1919
<div id="product_description"> LPeg grammar manipulation </div>
2020
<br/>
2121

22-
<div id="product_description"> <small><em> <b>Version:</b> 0.1.2 </em></small></div>
23-
<div id="product_description"> <small><em><b>Generated:</b> November 26, 2007 </em></small> </div>
22+
<div id="product_description"> <small><em> <b>Version:</b> 0.2 </em></small></div>
23+
<div id="product_description"> <small><em><b>Generated:</b> December 07, 2007 </em></small> </div>
2424

2525
</div> <!-- id="product" -->
2626

@@ -31,7 +31,6 @@ <h1><a href="index.html">Modules</a></h1>
3131
<ul>
3232
<li><a href="grammar.html">grammar</a></li>
3333
<li><a href="parser.html">parser</a></li>
34-
<li><a href="scanner.html">scanner</a></li>
3534
</ul>
3635
<br/>
3736
<h1>grammar</h1>
@@ -86,6 +85,8 @@ <h1>grammar</h1>
8685

8786
<li><code><a href="#function_anyOf"> anyOf </a></code></li>
8887

88+
<li><code><a href="#function_anywhere"> anywhere </a></code></li>
89+
8990
<li><code><a href="#function_apply"> apply </a></code></li>
9091

9192
<li><code><a href="#function_C"> C </a></code></li>
@@ -98,8 +99,12 @@ <h1>grammar</h1>
9899

99100
<li><code><a href="#function_listOf"> listOf </a></code></li>
100101

102+
<li><code><a href="#function_oneOf"> oneOf </a></code></li>
103+
101104
<li><code><a href="#function_pipe"> pipe </a></code></li>
102105

106+
<li><code><a href="#function_pmatch"> pmatch </a></code></li>
107+
103108
</ul>
104109
</li>
105110

@@ -198,21 +203,28 @@ <h2><a name="section_Example"></a>Example</h2>
198203
<p/>
199204
In the code above only <code>sum</code> and <code>something:other</code> should be documented, as <code>f1</code> isn't properly (by our standards) declared and <code>aux</code> is not in the outermost scope.
200205
<p/>
201-
By combining <a href="http://www.inf.puc-rio.br/~roberto/lpeg.html">LPeg</a> and the modules <a href="scanner.html">scanner</a>, <a href="parser.html">parser</a> and <a href="grammar.html">grammar</a>, this specific problem can be solved as follows:
206+
By combining <a href="http://www.inf.puc-rio.br/~roberto/lpeg.html">LPeg</a> and the modules <a href="parser.html">parser</a> and <a href="grammar.html">grammar</a>, this specific problem can be solved as follows:
202207
<p/>
203208
<pre class="example">
209+
<font color="#808080"> -- ye olde imports</font>
210+
<font color="#0000FF"><b>local</b></font> parser, grammar = <b><font color ="#800040">require</font></b> <font color="#008000">'leg.parser'</font>, <b><font color ="#800040">require</font></b> <font color="#008000">'leg.grammar'</font>
211+
<font color="#0000FF"><b>local</b></font> lpeg = <b><font color ="#800040">require</font></b> <font color="#008000">'lpeg'</font>
212+
213+
<font color="#808080"> -- a little aliasing never hurt anyone</font>
214+
<font color="#0000FF"><b>local</b></font> P, V = lpeg.P, lpeg.V
215+
204216
<font color="#808080"> -- change only the initial rule and make no captures</font>
205-
patt = <b>grammar.apply</b>(parser.rules, scanner.COMMENT^-<font color="#B00000"><i>1</i></font> * <b>lpeg.V</b><font color="#008000">'GlobalFunction'</font>, <font color="#0000FF"><b>nil</b></font>)
217+
patt = <b>grammar.apply</b>(parser.rules, parser.COMMENT^-<font color="#B00000"><i>1</i></font> * <b>V</b><font color="#008000">'GlobalFunction'</font>, nil)
206218

207219
<font color="#808080"> -- transform the new grammar into a LPeg pattern</font>
208-
patt = <b>lpeg.P</b>(patt)
220+
patt = <b>P</b>(patt)
209221

210222
<font color="#808080"> -- making a pattern that matches any Lua statement, also without captures</font>
211-
Stat = <b>lpeg.P</b>( <b>grammar.apply</b>(parser.rules, <b>lpeg.V</b><font color="#008000">'Stat'</font>, <font color="#0000FF"><b>nil</b></font>) )
223+
Stat = <b>P</b>( <b>grammar.apply</b>(parser.rules, <b>V</b><font color="#008000">'Stat'</font>, nil) )
212224

213225
<font color="#808080"> -- a pattern which matches function declarations and skips statements in</font>
214226
<font color="#808080"> -- inner scopes or undesired tokens</font>
215-
patt = (patt + Stat + scanner.ANY)^<font color="#B00000"><i>0</i></font>
227+
patt = (patt + Stat + parser.ANY)^<font color="#B00000"><i>0</i></font>
216228

217229
<font color="#808080"> -- matching a string</font>
218230
<b>patt:match</b>(subject)
@@ -226,8 +238,8 @@ <h2><a name="section_Example"></a>Example</h2>
226238
FuncBody = <font color="#008000">'('</font> * (ParList + EPSILON) * <font color="#008000">')'</font> * Block * <font color="#008000">'end'</font>
227239
ParList = NameList * (<font color="#008000">','</font> * <font color="#008000">'...'</font>)^-<font color="#B00000"><i>1</i></font>
228240
NameList = ID * (<font color="#008000">','</font> * ID)^<font color="#B00000"><i>0</i></font>
229-
ID = scanner.ID
230-
EPSILON = <b>lpeg.P</b>(<font color="#0000FF"><b>true</b></font>)
241+
ID = parser.IDENTIFIER
242+
EPSILON = <b>P</b>(<font color="#0000FF"><b>true</b></font>)
231243
</pre>
232244
<p/>
233245
It may seem that <code>ParList + EPSILON</code> could be substituted for <code>ParList^-1</code> (optionally match <code>ParList</code>), but then no captures would be made for empty parameter lists, and <code>GlobalFunction</code> would get all strings matched by <code>FuncBody</code>. The <code>EPSILON</code> rule acts in this manner as a placeholder in the argument list, avoiding any argument list processing in the capture function.
@@ -249,23 +261,23 @@ <h2><a name="section_Example"></a>Example</h2>
249261

250262
FuncName = grammar.C, <font color="#808080"> -- capture the raw text</font>
251263
ParList = grammar.C, <font color="#808080"> -- capture the raw text</font>
252-
COMMENT = scanner.comment2text, <font color="#808080"> -- remove the comment trappings</font>
264+
COMMENT = parser.comment2text, <font color="#808080"> -- remove the comment trappings</font>
253265
}
254266

255267
<font color="#808080"> -- spacing rule</font>
256-
<font color="#0000FF"><b>local</b></font> S = scanner.SPACE ^ <font color="#B00000"><i>0</i></font>
268+
<font color="#0000FF"><b>local</b></font> S = parser.SPACE ^ <font color="#B00000"><i>0</i></font>
257269

258270
<font color="#808080"> -- rules table</font>
259271
rules = {
260-
[<font color="#B00000"><i>1</i></font>] = ((<b>lpeg.V</b><font color="#008000">'COMMENT'</font> *S) ^ <font color="#B00000"><i>0</i></font>) *S* <b>lpeg.V</b><font color="#008000">'GlobalFunction'</font>,
261-
COMMENT = scanner.COMMENT,
272+
[<font color="#B00000"><i>1</i></font>] = ((<b>V</b><font color="#008000">'COMMENT'</font> *S) ^ <font color="#B00000"><i>0</i></font>) *S* <b>V</b><font color="#008000">'GlobalFunction'</font>,
273+
COMMENT = parser.COMMENT,
262274
}
263275

264276
<font color="#808080"> -- building the new grammar and adding the captures</font>
265-
patt = <b>lpeg.P</b>( <b>grammar.apply</b>(parser.rules, rules, captures) )
277+
patt = <b>P</b>( <b>grammar.apply</b>(parser.rules, rules, captures) )
266278

267279
<font color="#808080"> -- a pattern that matches a sequence of patts and concatenates the results</font>
268-
patt = (patt + Stat + scanner.ANY)^<font color="#B00000"><i>0</i></font> / <font color="#0000FF"><b>function</b></font>(...)
280+
patt = (patt + Stat + parser.ANY)^<font color="#B00000"><i>0</i></font> / <font color="#0000FF"><b>function</b></font>(...)
269281
<font color="#0000FF"><b>return</b></font> <font color = "#800040"><b>table.concat</b></font>({...}, <font color="#008000">'\n\n'</font>) <font color="#808080"> -- some line breaks for easier reading</font>
270282
<font color="#0000FF"><b>end</b></font>
271283

@@ -284,7 +296,6 @@ <h2><a name="section_Example"></a>Example</h2>
284296
</pre>
285297
<p/>
286298
<p/>
287-
<p/>
288299

289300

290301

@@ -295,9 +306,15 @@ <h1><a name="functions"></a> Functions </h1>
295306
<table border="0" width="95%">
296307

297308
<tr>
298-
<!-- <td> <pre class="example"><big><strong><a href="#functions_anyOf">anyOf</a></strong></big> (list)</pre> </td> -->
299-
<td> <code><big><strong> <a href="#function_anyOf">anyOf</a></strong></big> (list) </code> </td>
300-
<td> Returns a pattern which matches any of the patterns received. </td>
309+
<!-- <td> <pre class="example"><big><strong><a href="#functions_anyOf">anyOf</a></strong></big> (t)</pre> </td> -->
310+
<td> <code><big><strong> <a href="#function_anyOf">anyOf</a></strong></big> (t) </code> </td>
311+
<td> Returns a pattern which matches any of the patterns in <code>t</code>. </td>
312+
</tr>
313+
314+
<tr>
315+
<!-- <td> <pre class="example"><big><strong><a href="#functions_anywhere">anywhere</a></strong></big> (patt)</pre> </td> -->
316+
<td> <code><big><strong> <a href="#function_anywhere">anywhere</a></strong></big> (patt) </code> </td>
317+
<td> Returns a pattern which searches for the pattern <code>patt</code> anywhere in a string. </td>
301318
</tr>
302319

303320
<tr>
@@ -336,41 +353,69 @@ <h1><a name="functions"></a> Functions </h1>
336353
<td> Returns a pattern which matches a list of <code>patt</code>s, separated by <code>sep</code>. </td>
337354
</tr>
338355

356+
<tr>
357+
<!-- <td> <pre class="example"><big><strong><a href="#functions_oneOf">oneOf</a></strong></big> (list)</pre> </td> -->
358+
<td> <code><big><strong> <a href="#function_oneOf">oneOf</a></strong></big> (list) </code> </td>
359+
<td> Returns a pattern which matches any of the patterns in <code>list</code>. </td>
360+
</tr>
361+
339362
<tr>
340363
<!-- <td> <pre class="example"><big><strong><a href="#functions_pipe">pipe</a></strong></big> (dest, orig)</pre> </td> -->
341364
<td> <code><big><strong> <a href="#function_pipe">pipe</a></strong></big> (dest, orig) </code> </td>
342365
<td> <a href="#section_Piping">Pipes</a> the captures in <code>orig</code> to the ones in <code>dest</code>. </td>
343366
</tr>
344367

368+
<tr>
369+
<!-- <td> <pre class="example"><big><strong><a href="#functions_pmatch">pmatch</a></strong></big> (patt)</pre> </td> -->
370+
<td> <code><big><strong> <a href="#function_pmatch">pmatch</a></strong></big> (patt) </code> </td>
371+
<td> Returns a pattern which simply fails to match if an error is thrown during the matching. </td>
372+
</tr>
373+
345374
</table>
346375

347376

348377
<p/><a name="function_anyOf"></a>
349-
<hr/><code><big>anyOf (list)</big></code>
350-
<ul>Returns a pattern which matches any of the patterns received.
378+
<hr/><code><big>anyOf (t)</big></code>
379+
<ul>Returns a pattern which matches any of the patterns in <code>t</code>.
380+
<p/>
381+
The iterator <code>pairs</code> is used to traverse <code>t</code>, so no particular traversal order
382+
is guaranteed. Use <a href="#function_oneOf">oneOf</a> to ensure sequential matching
383+
attempts.
351384
<p/>
352385
<strong>Example:</strong>
353386
<pre class="example">
354-
<font color="#0000FF"><b>local</b></font> g, s, m = <b><font color ="#800040">require</font></b> <font color="#008000">'leg.grammar'</font>, <b><font color ="#800040">require</font></b> <font color="#008000">'leg.scanner'</font>, <b><font color ="#800040">require</font></b> <font color="#008000">'lpeg'</font>
387+
<font color="#0000FF"><b>local</b></font> g, p, m = <b><font color ="#800040">require</font></b> <font color="#008000">'leg.grammar'</font>, <b><font color ="#800040">require</font></b> <font color="#008000">'leg.parser'</font>, <b><font color ="#800040">require</font></b> <font color="#008000">'lpeg'</font>
355388

356389
<font color="#808080"> -- match numbers or operators, capture the numbers</font>
357-
<b><font color ="#800040">print</font></b>( (<b>g.anyOf</b> { <font color="#008000">'+'</font>, <font color="#008000">'-'</font>, <font color="#008000">'*'</font>, <font color="#008000">'/'</font>, <b>m.C</b>(s.NUMBER) }):match <font color="#008000">'34.5@23 * 56 / 45 - 45'</font> )
390+
<b><font color ="#800040">print</font></b>( (<b>g.anyOf</b> { <font color="#008000">'+'</font>, <font color="#008000">'-'</font>, <font color="#008000">'*'</font>, <font color="#008000">'/'</font>, <b>m.C</b>(p.NUMBER) }):match <font color="#008000">'34.5@23 * 56 / 45 - 45'</font> )
358391
<font color="#808080"> --> prints 34.5</font>
359392
</pre>
360393
<p/>
361394
<strong>Parameters:</strong><ul>
362-
<li><code>list</code>: a list of zero or more LPeg patterns or values which can be fed to <a href="http://www.inf.puc-rio.br/~roberto/lpeg.html#lpeg">lpeg.P</a>.</li></ul>
395+
<li><code>t</code>: a table with LPeg patterns as values. The keys are ignored.</li></ul>
363396
<p/>
364397
<strong>Returns:</strong><ul>
365398
<li>a pattern which matches any of the patterns received.</li></ul></ul>
366399

400+
<p/><a name="function_anywhere"></a>
401+
<hr/><code><big>anywhere (patt)</big></code>
402+
<ul>Returns a pattern which searches for the pattern <code>patt</code> anywhere in a string.
403+
<p/>
404+
This code was extracted from the <a href="http://www.inf.puc-rio.br/~roberto/lpeg.html#ex">LPeg home page</a>, in the examples section.
405+
<p/>
406+
<strong>Parameters:</strong><ul>
407+
<li><code>patt</code>: a LPeg pattern.</li></ul>
408+
<p/>
409+
<strong>Returns:</strong><ul>
410+
<li>a LPeg pattern which searches for <code>patt</code> anywhere in the string.</li></ul></ul>
411+
367412
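
The documentation above points to the examples section of the LPeg home page for `anywhere`. As a sketch of that technique (the actual body in `grammar.lua` is not shown in this diff, so treat this as an assumption): a one-rule grammar either matches `patt` at the current position or skips one character and tries again.

```lua
local lpeg = require 'lpeg'

-- One-rule grammar: match `patt` here, or consume one character and retry.
local function anywhere(patt)
  return lpeg.P{ lpeg.P(patt) + 1 * lpeg.V(1) }
end

-- Position captures around the pattern show where the match was found.
local I = lpeg.Cp()
print(anywhere(I * 'world' * I):match('hello world!'))  --> 7   12
```

Without captures the returned pattern yields only the position after the first occurrence, or `nil` when `patt` never matches.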
<p/><a name="function_apply"></a>
368413
<hr/><code><big>apply (grammar, rules, captures)</big></code>
369414
<ul><a href="#section_Completing">Completes</a> <code>rules</code> with <code>grammar</code> and then <a href="#Applying">applies</a> <code>captures</code>.
370415
<p/>
371416
<code>rules</code> can either be:<ul>
372417
<li>a single pattern, which is taken to be the new initial rule, </li>
373-
<li>a possibly incomplete LPeg grammar, as per <a href="#function_complete">complete</a>, or </li>
418+
<li>a possibly incomplete LPeg grammar table, as per <a href="#function_complete">complete</a>, or </li>
374419
<li><code>nil</code>, which means no new rules are added.</li></ul>
375420
<p/>
376421
<code>captures</code> can either be:<ul>
@@ -445,6 +490,19 @@ <h1><a name="functions"></a> Functions </h1>
445490
<strong>Returns:</strong><ul>
446491
<li>the following pattern: <pre class="example">patt * (sep * patt)^<font color="#B00000"><i>0</i></font></pre></li></ul></ul>
447492

493+
<p/><a name="function_oneOf"></a>
494+
<hr/><code><big>oneOf (list)</big></code>
495+
<ul>Returns a pattern which matches any of the patterns in <code>list</code>.
496+
<p/>
497+
Unlike <a href="#function_anyOf">anyOf</a>, this function guarantees that the patterns
498+
are tried sequentially, in list order.
499+
<p/>
500+
<strong>Parameters:</strong><ul>
501+
<li><code>list</code>: a list of LPeg patterns.</li></ul>
502+
<p/>
503+
<strong>Returns:</strong><ul>
504+
<li>a pattern which matches any of the patterns received.</li></ul></ul>
505+
448506
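
The difference between `anyOf` and `oneOf` described above comes down to `pairs` versus `ipairs`. The sketch below is an illustration under that assumption, not the code in `grammar.lua` (which this diff does not show): both fold the entries into an LPeg ordered choice, starting from `lpeg.P(false)`, a pattern that always fails.

```lua
local lpeg = require 'lpeg'

-- Ordered choice over the array part of `list`, in index order
-- (the behaviour documented for oneOf).
local function oneOf(list)
  local patt = lpeg.P(false)  -- always fails: neutral element for choice
  for _, p in ipairs(list) do
    patt = patt + lpeg.P(p)
  end
  return patt
end

-- Same fold over every value of `t`; `pairs` guarantees no order
-- (the behaviour documented for anyOf).
local function anyOf(t)
  local patt = lpeg.P(false)
  for _, p in pairs(t) do
    patt = patt + lpeg.P(p)
  end
  return patt
end

-- Order matters when one alternative is a prefix of another.
print(lpeg.match(lpeg.C(oneOf { '<=', '<' }), '<= 1'))  --> <=
print(lpeg.match(lpeg.C(oneOf { '<', '<=' }), '<= 1'))  --> <
```

Because LPeg's choice operator commits to the first alternative that succeeds, `anyOf` is only safe when no alternative can shadow another.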
<p/><a name="function_pipe"></a>
449507
<hr/><code><big>pipe (dest, orig)</big></code>
450508
<ul><a href="#section_Piping">Pipes</a> the captures in <code>orig</code> to the ones in <code>dest</code>.
@@ -458,6 +516,18 @@ <h1><a name="functions"></a> Functions </h1>
458516
<strong>Returns:</strong><ul>
459517
<li><code>dest</code>, suitably modified.</li></ul></ul>
460518

519+
<p/><a name="function_pmatch"></a>
520+
<hr/><code><big>pmatch (patt)</big></code>
521+
<ul>Returns a pattern which simply fails to match if an error is thrown during the matching.
522+
<p/>
523+
One usage example is <a href="parser.html#variable_NUMBER">parser.NUMBER</a>. Originally it threw an error when trying to match a malformed number (such as <code>1e23e4</code>), since such input is obviously invalid and the pattern was meant to be used as part of the Lua grammar. <a href="#function_pmatch">pmatch</a> is now used to catch that error and return <code>nil</code> (signalling a non-match) together with the error message.
524+
<p/>
525+
<strong>Parameters:</strong><ul>
526+
<li><code>patt</code>: a LPeg pattern.</li></ul>
527+
<p/>
528+
<strong>Returns:</strong><ul>
529+
<li>a pattern which catches any errors thrown during the matching and simply doesn't match instead of propagating the error.</li></ul></ul>
530+
461531
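
One way to get the behaviour described for `pmatch` is to run the wrapped pattern under `pcall` from inside a match-time capture. This is a sketch under stated assumptions, not Leg's implementation: it relies on `lpeg.Cmt` from current LPeg releases, assumes `patt` makes no captures, and discards the error message instead of returning it.

```lua
local lpeg = require 'lpeg'

-- Sketch only: wraps `patt` so that an error raised while matching
-- becomes an ordinary match failure. Assumes `patt` has no captures.
local function pmatch(patt)
  patt = lpeg.P(patt)
  return lpeg.Cmt(lpeg.P(true), function (subject, pos)
    local ok, result = pcall(lpeg.match, patt, subject, pos)
    if ok and type(result) == 'number' then
      return result  -- new position: the wrapped pattern matched
    end
    return false     -- no match, or an error was caught
  end)
end

-- A pattern that raises an error whenever it sees an 'x'.
local boom = lpeg.Cmt(lpeg.P'x', function () error('boom') end)

print(pcall(lpeg.match, boom, 'x'))              -- the error propagates
print(pmatch(boom):match('x'))                   --> nil
print((pmatch(boom) + lpeg.P'x'):match('x'))     --> 2
```

The last line shows the practical benefit: once the error is turned into a failure, later alternatives in an ordered choice still get a chance to match.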
<hr/>
462532

463533

doc/index.html

Lines changed: 2 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -19,8 +19,8 @@
1919
<div id="product_description"> LPeg-powered Lua 5.1 grammar </div>
2020
<br/>
2121

22-
<div id="product_description"> <small><em> <b>Version:</b> 0.1.2 </em></small></div>
23-
<div id="product_description"> <small><em><b>Generated:</b> November 26, 2007 </em></small> </div>
22+
<div id="product_description"> <small><em> <b>Version:</b> 0.2 </em></small></div>
23+
<div id="product_description"> <small><em><b>Generated:</b> December 07, 2007 </em></small> </div>
2424

2525
</div> <!-- id="product" -->
2626

@@ -31,7 +31,6 @@ <h1><a href="index.html">Modules</a></h1>
3131
<ul>
3232
<li><a href="grammar.html">grammar</a></li>
3333
<li><a href="parser.html">parser</a></li>
34-
<li><a href="scanner.html">scanner</a></li>
3534
</ul>
3635
<br/>
3736
<h1>Leg</h1>
