A PHP code generation plugin for Peggy.
PHPeggy is the successor of phpegjs, which had been abandoned by its maintainer.
Peggy version 1.x.x is compatible with the most recent phpegjs release.
There are a few API changes compared to the most recent phpegjs release.
- Options specific to PHPeggy have to be passed to phpeggy and not to phpegjs.
Follow these steps to upgrade:
- Follow the migration instructions from Peggy.
- Uninstall phpegjs.
- Replace all require("phpegjs") or import ... from "phpegjs" with require("phpeggy") or import ... from "phpeggy" as appropriate.
- PHPeggy-specific options are now passed to phpeggy:
     var parser = peggy.generate("start = ('a' / 'b')+", {
    -  plugins: [require("phpegjs")],
    +  plugins: [require("phpeggy")],
    -  phpegjs: { /* phpegjs-specific options */ }
    +  phpeggy: { /* phpeggy-specific options */ }
     });
- That's it!
- Peggy (known compatible with v3.0.0)
Install Peggy with the phpeggy plugin:
$ npm install peggy@3.0.0 phpeggy
In Node.js, require both the Peggy parser generator and the phpeggy plugin:
var peggy = require("peggy");
var phpeggy = require("phpeggy");
To generate a PHP parser, pass both the phpeggy plugin and your grammar to peggy.generate:
var parser = peggy.generate("start = ('a' / 'b')+", {
  plugins: [phpeggy]
});
The method returns the source code of the generated parser as a string. Unlike original Peggy, the generated PHP parser is a class, not a function.
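Because the result is a string of PHP source, it normally has to be written to a .php file before PHP can include it. Here is a minimal sketch, assuming Node's built-in fs and path modules. The helper name saveParser and the output path are hypothetical and not part of Peggy or PHPeggy:

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: writes generated parser source to outFile,
// creating the target directory first if it does not exist yet.
function saveParser(source, outFile) {
  fs.mkdirSync(path.dirname(outFile), { recursive: true });
  fs.writeFileSync(outFile, source, "utf8");
  return outFile;
}

// Intended use (requires peggy and phpeggy to be installed):
//   var source = peggy.generate(grammar, { plugins: [phpeggy] });
//   saveParser(source, "build/Parser.php");
```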
Supported options of peggy.generate:
- allowedStartRules: rules the parser will be allowed to start parsing from (default: the first rule in the grammar)
- cache: if true, makes the parser cache results, avoiding exponential parsing time in pathological cases but making the parser slower (default: false). In case of PHP, this is strongly recommended for big grammars (like javascript.pegjs or css.pegjs in example folder)
- grammarSource: this object will be passed to any location() objects as the source property (default: undefined). This object will be used even if options.grammarSource is redefined in the grammar. It is useful to attach the file information to the errors, for example
You can also pass options specific to the PHPeggy plugin as follows:
var parser = peggy.generate("start = ('a' / 'b')+", {
  plugins: [phpeggy],
  phpeggy: { /* phpeggy-specific options */ }
});
Here are the options available to pass this way:
- parserNamespace: namespace of generated parser (default: PHPeggy). If value is '' or null, no namespace will be used.
- parserClassName: name of generated class for parser (default: Parser).
- mbstringAllowed: whether to allow usage of PHP's mb_* functions which depend on the mbstring extension being installed (default: true). This can be disabled for compatibility with a wider range of PHP configurations, but this will also disable several features of Peggy (case-insensitive string matching, case-insensitive character classes, and empty character classes). Attempting to use these features with mbstringAllowed: false will cause passes.check to throw an error.
- header: you can provide a custom header that will be added at the top of the parser, e.g. /* My custom header */.
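The options above can be combined in one call. The following is a sketch, assuming peggy and phpeggy are installed. The namespace, class name, header text, and both helper names are made-up examples. The modules are loaded inside the function so that the options object can be inspected on its own:

```javascript
// Hypothetical values; adjust them to your project.
function buildOptions() {
  return {
    cache: true, // recommended above for big grammars
    phpeggy: {
      parserNamespace: "MyApp\\Grammar",
      parserClassName: "MediaParser",
      header: "/* Generated file, do not edit */"
    }
  };
}

// Generates PHP parser source for the given grammar text.
function generatePhpParser(grammar) {
  var peggy = require("peggy");     // assumed to be installed
  var phpeggy = require("phpeggy"); // assumed to be installed
  var options = buildOptions();
  options.plugins = [phpeggy];
  return peggy.generate(grammar, options);
}
```

With these values, the PHP side would be expected to instantiate MyApp\Grammar\MediaParser rather than PHPeggy\Parser.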
- Save parser generated by peggy.generate to a file
- In PHP code:
include "your.parser.file.php";
try {
    $parser = new PHPeggy\Parser;
    $result = $parser->parse($input);
} catch (PHPeggy\SyntaxError $ex) {
    // Handle parsing error
    // [...]
}
You can use the following snippet to format parsing errors:
catch (PHPeggy\SyntaxError $e) {
    $message = "Syntax error: " . $e->getMessage() . " at line " . $e->grammarLine . " column " . $e->grammarColumn . " offset " . $e->grammarOffset;
}
Or use SyntaxError->format():
catch (PHPeggy\SyntaxError $e) {
    $errorFormatted = $e->format(array(array("source" => "User input", "text" => $user_input)));
}
Which will look similar to:
SyntaxError: Expected "a" but "b" found.
 --> Input string:1:1
  |
1 | b
  | ^
Note that the generated PHP parser will call preg_match_all( '/./us', ... )
on the input string. This may be undesirable for projects that need to
maintain compatibility with PCRE versions that are missing Unicode support
(WordPress, for example). To avoid this call, split the input string into an
array (one array element per UTF-8 character) and pass this array into
$parser->parse() instead of the string input.
See the Peggy documentation, with the following differences:
- action and predicate blocks should be written in PHP.
- the per-parse initializer code block is used to provide additional methods, properties and constants to the Parser class. A special method function initialize() can be provided and resembles the Peggy per-parse initializer, i.e. this method is called before the generated parser starts parsing (see examples/fizzbuzz.pegjs). All methods have access to the input ($this->input) and the options ($this->options).
- the global initializer code block can be used to add use statements, classes, functions, constants, ...
Original Peggy rule:
media_list = head:medium tail:("," S* medium)* {
  var result = [head];
  for (var i = 0; i < tail.length; i++) {
    result.push(tail[i][2]);
  }
  return result;
}
PHPeggy rule:
media_list = head:medium tail:("," S* medium)* {
  $result = [$head];
  for ($i = 0; $i < \count($tail); $i++) {
    $result[] = $tail[$i][2];
  }
  return $result;
}
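Both actions build the same list. To make the tail[i][2] indexing concrete, here is the JavaScript action as a standalone function applied to a hand-written match result. The sample values are assumptions: each tail entry is taken to be the three-element sequence matched by "," S* medium, with medium yielding a plain string.

```javascript
// Standalone copy of the JavaScript action body.
function mediaList(head, tail) {
  var result = [head];
  for (var i = 0; i < tail.length; i++) {
    // tail[i] is [",", <S* matches>, <medium>]; index 2 is the medium.
    result.push(tail[i][2]);
  }
  return result;
}

// Hand-written stand-in for what the rule might capture on "screen, print,tv".
var sampleTail = [
  [",", [" "], "print"],
  [",", [], "tv"]
];
console.log(mediaList("screen", sampleTail));
```

The PHP action follows the same steps, with $result[] = ... taking the place of result.push(...).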
To target both JavaScript and PHP with a single grammar, you can mix the two languages using a special comment syntax:
media_list = head:medium tail:("," S* medium)* {
  /** <?php
  $result = [$head];
  for ($i = 0; $i < \count($tail); $i++) {
    $result[] = $tail[$i][2];
  }
  return $result;
  ?> **/
  var result = [head];
  for (var i = 0; i < tail.length; i++) {
    result.push(tail[i][2]);
  }
  return result;
}
You can also use the following utility functions in PHP action blocks:
- chr_unicode($code): return character by its UTF-8 code (analogue of JavaScript's String.fromCharCode function).
- ord_unicode($code): return the UTF-8 code for a character (analogue of JavaScript's String.prototype.charCodeAt(0) function).
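For reference, the JavaScript side of this analogy behaves as follows. This sketch only exercises the built-in String.fromCharCode and charCodeAt. It does not call the PHP helpers, and the roundTrip name is made up for illustration:

```javascript
// Convert a numeric code to a character and back again.
function roundTrip(code) {
  var ch = String.fromCharCode(code); // JavaScript counterpart of chr_unicode
  return ch.charCodeAt(0);            // JavaScript counterpart of ord_unicode
}

console.log(roundTrip(2323) === 2323);
```

String.fromCharCode works on UTF-16 code units, so this round trip only holds for codes up to 0xFFFF.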
JavaScript code | PHP analogue |
---|---|
some_var | $some_var |
{f1: "val1", f2: "val2"} | ["f1" => "val1", "f2" => "val2"] |
["val1", "val2"] | ["val1", "val2"] |
some_array.push("val") | $some_array[] = "val" |
some_array.length | count($some_array) |
some_array.join("") | implode("", $some_array) |
some_array1.concat(some_array2) | array_merge($some_array1, $some_array2) |
parseInt("23") | intval("23") |
parseFloat("23.1") | floatval("23.1") |
some_str.length | mb_strlen($some_str, "UTF-8") |
some_str.replace("b", "\b") | str_replace("b", "\b", $some_str) |
String.fromCharCode(2323) | chr_unicode(2323) |
input | $this->input |
options | $this->options |
error(message, where) | $this->error($message, $where) |
expected(message, where) | $this->expected($message, $where) |
location() | $this->location() |
range() | $this->range() |
offset() | $this->offset() |
text() | $this->text() |