parser.php is a script that analyzes IPPcode23.
It takes the code from input, breaks it into smaller pieces, validates its syntax, and outputs a valid XML representation of IPPcode23.
php8.1 parser.php <[source] [--options] [-flags] [--help]
The implementation is a pipeline of parser components, each feeding the next.
- InputReader is a class responsible for reading contents from input. Besides reading, it uses the trait Formatter, which provides 3 methods to format a string:
  - remove_endings - removes \n from the end of the line
  - remove_comments - removes comments inside of a line
  - remove_empty - removes empty lines

  Note: The format_line($line) method applies all three methods above in the listed order, so that only non-empty, comment-free lines remain.
- InputAnalyser takes the formatted lines after InputReader has finished reading the input and converts each into an Instruction object. It also validates the header of the program.
  - Instruction is a class that eases work with instructions. Its constructor takes a formatted line and validates it while creating the object, using the helper class InstructionRule, where the rules for each instruction are defined.
  - Operand is a class similar to Instruction; it is created together with an Instruction for the same reason.
- OutputGenerator is a component that produces XML from the array of instructions generated by InputAnalyzer.

  Note: OutputGenerator is only a child of XMLGenerator. The object does not generate the entire XML structure itself; it uses methods inherited from XMLGenerator.
- XMLGenerator is a configured class that generates the main XML structure.
To validate data throughout the process, some of the objects use the class collection Validators, where the entire validation logic is stored. If validation fails anywhere, ErrorHandler throws the error.
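The line formatting described above can be sketched as follows. The parser itself is written in PHP; this is only a Python illustration of the same three steps, and it assumes that comments start with `#`. The function `format_lines` is a hypothetical helper, not part of the original code.

```python
from typing import Iterable, List, Optional


def remove_endings(line: str) -> str:
    """Strip the trailing newline (and carriage return) from a line."""
    return line.rstrip("\r\n")


def remove_comments(line: str) -> str:
    """Drop everything from the first '#' onward (assumed comment marker)."""
    return line.split("#", 1)[0].rstrip()


def format_line(line: str) -> Optional[str]:
    """Apply the steps in order; return None for lines that end up empty."""
    line = remove_comments(remove_endings(line))
    return line if line.strip() else None


def format_lines(lines: Iterable[str]) -> List[str]:
    """Hypothetical helper: format every line and drop the empty ones."""
    formatted = (format_line(line) for line in lines)
    return [line for line in formatted if line is not None]
```

Removing comments before dropping empty lines matters: a line that contains only a comment becomes empty after the second step and is then filtered out by the third.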
This program is an interpreter for files that contain programs written in the IPPcode23 language in XML representation.
python3 interpret.py [[--source=[SOURCE_FILE]] [--input=[INPUT_FILE]]] [--help|-h]
--source specifies the path to the XML file that contains the program to be interpreted.
--input specifies the path to a file that contains the input data for the program.
If --source or --input is not provided, the interpreter reads the missing data from the standard input stream.
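A minimal sketch of how these options could be handled with `argparse`, falling back to standard input when an option is missing. The function names `parse_args` and `open_streams` are hypothetical and not taken from interpret.py.

```python
import argparse
import sys


def parse_args(argv):
    """Parse --source and --input; argparse adds --help/-h automatically."""
    parser = argparse.ArgumentParser(prog="interpret.py")
    parser.add_argument("--source", help="path to the XML program")
    parser.add_argument("--input", help="path to the program's input data")
    return parser.parse_args(argv)


def open_streams(args):
    """Return (source, input) streams, using stdin for any missing option."""
    source = open(args.source, encoding="utf-8") if args.source else sys.stdin
    data = open(args.input, encoding="utf-8") if args.input else sys.stdin
    return source, data
```

With this sketch, `parse_args(["--source=prog.xml"])` leaves `args.input` as `None`, so the input data would be read from standard input.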
The input XML file must conform to the following format:
<?xml version="1.0" encoding="UTF-8"?>
<program language="IPPcode23">
    <instruction order="[ORDER]" opcode="[OPCODE]">
        <arg1 type="[TYPE]">[VALUE]</arg1>
        <arg2 type="[TYPE]">[VALUE]</arg2>
        <arg3 type="[TYPE]">[VALUE]</arg3>
    </instruction>
    ...
</program>
The program element must have a language attribute with the value IPPcode23.
Each instruction element represents a single instruction in the program and must have the following attributes:
- order: an integer representing the order of the instruction in the program
- opcode: a string representing the opcode of the instruction
Each instruction element may have up to three arg elements (arg1, arg2, arg3), each with the following:
- type: an attribute holding the type of the argument ("int", "string", "bool", "nil", "var", "type" or "label")
- value: the value of the argument, given as the text content of the element
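The format above can be read with `xml.etree.ElementTree` roughly as follows. This is a sketch under assumptions: the function name `load_instructions`, the tuple layout of the result, and the use of `ValueError` are illustrative and do not reflect the interpreter's actual classes or error codes.

```python
import xml.etree.ElementTree as ET


def load_instructions(xml_text):
    """Return (order, opcode, [(type, value), ...]) tuples sorted by order."""
    root = ET.fromstring(xml_text)
    if root.tag != "program" or root.get("language") != "IPPcode23":
        raise ValueError('root must be <program language="IPPcode23">')

    instructions = []
    for node in root:
        order, opcode = node.get("order"), node.get("opcode")
        if node.tag != "instruction" or order is None or opcode is None:
            raise ValueError("instruction needs order and opcode attributes")
        # Sort arguments by tag name so arg1 < arg2 < arg3 regardless of
        # the order in which they appear in the file.
        args = [(arg.get("type"), arg.text or "")
                for arg in sorted(node, key=lambda a: a.tag)]
        instructions.append((int(order), opcode, args))

    # Instructions are executed by ascending order attribute.
    return sorted(instructions, key=lambda item: item[0])
```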
Click here in order to see the entire diagram.
Since the interpreter had to be implemented in Python, I decided to stick with an OOP approach as much as possible.
I divided the task into smaller subtasks:
- Handle options, input and basic output.
- Parse XML using ETree.
- Implement types using classes (Var is a subset of Symb).
- Implement the data stack, frame stack and call stack.
- Create an error handling mechanism that is native to Python.
- Implement static semantic analysis and variable checking. Find all the labels that are used in the code.
- Implement instructions and operations. Start testing.
For most subtasks, I created a dedicated class that implements all the features the subtask needs.
The next sections describe some of these classes and the approaches I followed to ease development.
The interpreter has its own error codes and messages that need to be shown to the user in error cases.
For that, I decided to customize Python's existing error handling by extending the Exception class with the attributes code and message.
When an error occurs, an exception can easily be raised with a custom message, and the program exits gracefully with a pre-defined error code.
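A minimal sketch of this approach. The class names `InterpretError` and `MissingValueError`, the helper `run`, and the numeric codes are placeholders chosen for illustration, not the names or values used in the interpreter.

```python
import sys


class InterpretError(Exception):
    """Base exception carrying an exit code and a message."""

    code = 99  # placeholder default exit code

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message
        if code is not None:
            self.code = code


class MissingValueError(InterpretError):
    """Example subclass with its own pre-defined code (placeholder value)."""

    code = 56


def run(main):
    """Call main(); report any InterpretError and return its exit code."""
    try:
        main()
    except InterpretError as err:
        print(err.message, file=sys.stderr)
        return err.code
    return 0
```

A script entry point could then finish with `sys.exit(run(main))`, so every raised `InterpretError` turns into the matching process exit code.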
For XML parsing, the interpreter uses not only ETree, mentioned earlier, but also the class Parser. It is made to analyze the correctness and validity of the language hidden behind the XML representation (IPPcode23).
While implementing this class, I noticed that most of its subclasses shared the same idea but differed in functionality, so I decided to create an abstract class _GenericParseType.
In the interpreter, a Var literal is a subset of Symb. To implement this behaviour, I used inheritance (see diagram).
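The relationship can be sketched like this. Only the class names `Var` and `Symb` come from the text; the constructors, attributes, and the helper `accepts_symb` are assumptions, including the `FRAME@name` form of a variable reference.

```python
class Symb:
    """Any symbol an instruction may read: a constant or a variable."""

    def __init__(self, type_, value):
        self.type = type_
        self.value = value


class Var(Symb):
    """A variable reference; usable wherever a Symb is expected."""

    def __init__(self, reference):
        # Assumed form: FRAME@name, for example GF@counter.
        frame, _, name = reference.partition("@")
        super().__init__("var", reference)
        self.frame = frame
        self.name = name


def accepts_symb(arg):
    """Hypothetical operand check: both constants and variables pass."""
    return isinstance(arg, Symb)
```

With inheritance, an operand check written for Symb accepts a Var automatically, while a check for Var still rejects plain constants.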