Lexical-Analyser

Lexical Rules for the language specified below. Files - lex.cpp, lex.h

The language has identifiers, referred to by ident terminal, which are defined to be a letter followed by zero or more letters or digit. It is defined as: ident := Letter {(Letter|Digit|_)} Letter := [a-z A-Z] Digit := [0-9] The token for an identifier is IDENT.
Integer constants, referred to by iconst terminal, are defined as one or more digits. It is defined as: iconst := [0-9]+ The token for an integer constant is ICONST.
Real constants, referred to by rconst terminal, are defined as zero or more digits followed by a decimal point (dot) and one or more digits. It is defined as: Rconst := ([0-9]*).([0-9]+) The token for a real constant is RCONST. For example, real number constants such as 12.0 and .2 are accepted, but 2. is not.
String literals, referred to by sconst terminal, are defined as a sequence of characters delimited by double quotes, that should all appear on the same line. The assigned token for a string constant is SCONST. For example, “Hello to CS 280.” is a string literal.
The reserved words of the language are: program, end, begin, write, if, int, float, string, repeat. These reserved words have the following tokens, respectively: PROGRAM, END, BEGIN, WRITE, IF, INT, FLOAT, STRING and REPEAT.
The operators of the language are: +, -, *, /, %, =, (, ), ==, >. These operators are for plus, subtract, multiply, divide, remainder, assignment, left parenthesis, right parenthesis, equality, and greater than operations. They have the following tokens, respectively: PLUS, MINUS, MULT, DIV, REM, ASSOP, LPAREN, RPAREN, EQUAL, GTHAN.
The semicolon and comma characters are terminals with the following tokens: SEMICOL, COMMA.
A comment is defined by all the characters following the exclamation mark “#” to the end of the line. A comment does not overlap one line. A recognized comment is ignored and does not have a token.
White spaces are skipped. However, white spaces between tokens are used to improve readability and can be used as a one way to delimit tokens.
An error will be denoted by the ERR token.
End of file will be denoted by the DONE token.

Parser

Parser to the language is based on the following grammar rules specified in EBNF notations. Files - Parse.cpp, Parse.h

Prog ::= PROGRAM IDENT StmtList END PROGRAM

DeclStmt ::= (INT | FLOAT) IdentList

IdentList ::= IDENT, {, IDENT}

StmtList ::= Stmt; {Stmt;}

Stmt ::= DeclStmt | ControlStmt

ControlStmt ::= AssigStmt | IfStmt | WriteStmt

WriteStmt ::= WRITE ExprList

IfStmt ::= IF (LogicExpr) ControlStmt

AssignStmt ::= Var = Expr

ExprList ::= Expr {, Expr}

Expr ::= Term {(+|-) Term}

Term ::= SFactor {( *| / | % ) SFactor}

SFactor ::= (+ | -) Factor | Factor

LogicExpr ::= Expr (== | >) Expr

Var ::= IDENT

Factor = IDENT | ICONST | RCONST | SCONST | (Expr)

Interpretor

Implementation for the interpreter is build on top of the code for parser using the same 2 rules above. Files - ParseInt.cpp, ParseInt.h

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
README.md		README.md
lex.cpp		lex.cpp
lex.h		lex.h
main.cpp		main.cpp
parse.cpp		parse.cpp
parse.h		parse.h
parseInt.cpp		parseInt.cpp
parseInt.h		parseInt.h
val.cpp		val.cpp
val.h		val.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Lexical-Analyser

Parser

Interpretor

About

Uh oh!

Releases

Packages

Languages

aron-s/Custom-Compiler

Folders and files

Latest commit

History

Repository files navigation

Lexical-Analyser

Parser

Interpretor

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages