This is how to convert regex (REGular EXpression) to DFA by creating and using syntax tree in Java language.
This project is a part of a bigger project we'd done for Compiler Course to create a simple compiler.
🔹 Watch this video to comprehend the concepts: https://www.youtube.com/watch?v=uPnpkWwO9hE
Pay attention to some Rules:
NetBeans is the IDE I coded in. you can clone this project and import it in NetBeans easily.
The classes used, are as follows:
- RegexToDfa
- SyntaxTree
- BinaryTree
- Node
- LeafNode
- DfaTraversal
- State
here is an initialize method which is called in the main function:
public static void initialize() {
DStates = new HashSet<>();
input = new HashSet<String>();
String regex = getRegex();
getSymbols(regex);
SyntaxTree st = new SyntaxTree(regex);
root = st.getRoot();
followPos = st.getFollowPos();
State q0 = createDFA();
DfaTraversal dfat = new DfaTraversal(q0, input);
}
-
DStates is a Set of States which is used for creating the final dfa.
-
input is also a Set which holding the characters of the input regular expression taken from user.
pay attention to this issue that, some characters like '*' can be used as an operator (closure, union, concatination, ...)
so if you want to enter these characters just as a normal character, you could bring a backslash '\' following up the intended character
for example "\*" meaning a normal '*' character. and "*" meaning star opeartor (closure)
this is why we use a set of String for declaring input variable. -
String regex = getRegex();
is for getting the intended regular expression from stdin. -
getSymbols(regex);
this line of code sets the input Set. -
SyntaxTree st = new SyntaxTree(regex);
and this line creates the corresponding syntax tree of the inputted regex. -
root = st.getRoot();
gets the root of the syntax tree. -
followPos = st.getFollowPos();
make a new refference to the Set of followpos variable. -
State q0 = createDFA();
creates the DFA through using the syntax tree and assigns q0 the start state of resulted DFA. -
DfaTraversal dfat = new DfaTraversal(q0, input);
makes a new DFA Traversal object for traversing the resulted DFA and recognizing whether the DFA can accept a particular string or not.
/*
***
(a|b)*a => creating binary syntax tree:
.
/ \
* a
/
|
/ \
a b
***
*/
if you look at the SyntaxTree class, you will understand that a BinaryTree object is created.
so a syntax tree is a binary tree with some new special attributes (firstpos, lastpos, nullable, followpos).
in the BinaryTree class, the inputted regex is going to be converted to a tree which contains some nodes.
the last most bottom nodes are called, the leaf nodes.
So, now you know what the Node and LeafNode are used for.
click here to see the acceped result.
click here to see the rejected result.
- ALireza Kavian - BRILACASCK
This project is licensed under the MIT License - see the LICENSE file for details
- Fragmenting section of the regex, using the stack, and also doOperation is adopted by a fine code of @felipemoura in github.