ozdogrumerve/lexical-analyzer


DFA-based token scanner — Formal Languages & Automata

JavaScript · HTML · CSS

Live Demo

What it does

Takes a snippet of source code, runs it through a hand-written DFA (Deterministic Finite Automaton), and breaks it down into tokens — keywords, identifiers, numbers, operators, and so on. Every transition the automaton makes is recorded, so you can replay the whole process character by character, forward and back.

Three tabs, three views of the same analysis:

  • Editor — write or load source code, run the analysis, watch tokens appear in real time with color-coded chips. A live state panel shows the current DFA state, the character being read, the lexeme accumulating, and the token just produced. A step log at the bottom records every automaton transition.
  • DFA Diagram — an SVG diagram of the automaton with the active state highlighted as the analysis plays. Includes a full transition table next to the diagram.
  • Statistics — token type distribution as a bar chart, summary metrics (total tokens, source lines, unique lexemes, error count), and a full token detail table with line/column info for every token.

Token types

Type              Examples
KEYWORD           if else while for return int float bool true false print
IDENTIFIER        x oran sonuc myVar
NUMBER            5 42 100
FLOAT             3.14 0.5
ASSIGN            =
OPERATOR          + - * / < > == != >=
LPAREN / RPAREN   ( )
LBRACE / RBRACE   { }
SEMICOLON         ;
COMMA             ,
UNKNOWN           anything the DFA can't recognize (e.g. @)
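Keywords and identifiers follow the same DFA path, so a lexeme that scans as an identifier is typically promoted to KEYWORD by a final lookup against the keyword list above. A hedged sketch of that check — the function name is an assumption, not the repo's actual API:

```javascript
// Keyword/identifier disambiguation after the DFA accepts a word.
// The keyword set matches the README's table; classifyWord is illustrative.
const KEYWORDS = new Set([
  "if", "else", "while", "for", "return",
  "int", "float", "bool", "true", "false", "print",
]);

function classifyWord(lexeme) {
  return KEYWORDS.has(lexeme) ? "KEYWORD" : "IDENTIFIER";
}

console.log(classifyWord("while")); // KEYWORD
console.log(classifyWord("oran"));  // IDENTIFIER
```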

DFA states

START ──letter──► IN_ID ────────────────────────► DONE
      ──digit───► IN_NUM ──dot──► IN_FLOAT ──────► DONE
      ──op──────► IN_OP ──────────────────────────► DONE

Whitespace is skipped before entering the DFA — no epsilon transitions, pure DFA behaviour. Single-character tokens ((, ), {, }, ;, ,) bypass the DFA entirely and produce a token directly.
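Reduced to the identifier and number paths, the scanning loop sketched above might look like this. State names match the diagram; the function signature and step format are illustrative assumptions, not the repo's actual lexer.js:

```javascript
// Simplified DFA scan of one token starting at `pos`, recording every
// transition so playback can replay it step by step (as the README describes).
// Only the IN_ID / IN_NUM / IN_FLOAT paths are sketched here.
function scanToken(src, pos) {
  const steps = []; // one entry per DFA transition
  let state = "START";
  let lexeme = "";
  let i = pos;

  while (i < src.length) {
    const ch = src[i];
    let next = null;
    if (state === "START" && /[A-Za-z_]/.test(ch)) next = "IN_ID";
    else if (state === "START" && /[0-9]/.test(ch)) next = "IN_NUM";
    else if (state === "IN_ID" && /[A-Za-z0-9_]/.test(ch)) next = "IN_ID";
    else if (state === "IN_NUM" && /[0-9]/.test(ch)) next = "IN_NUM";
    else if (state === "IN_NUM" && ch === ".") next = "IN_FLOAT";
    else if (state === "IN_FLOAT" && /[0-9]/.test(ch)) next = "IN_FLOAT";

    if (next === null) break; // no transition: the token is done
    steps.push({ from: state, to: next, char: ch });
    state = next;
    lexeme += ch;
    i++;
  }

  const type =
    state === "IN_ID" ? "IDENTIFIER"
    : state === "IN_FLOAT" ? "FLOAT"
    : state === "IN_NUM" ? "NUMBER"
    : "UNKNOWN";
  return { type, lexeme, end: i, steps };
}

console.log(scanToken("3.14", 0).type);   // FLOAT
console.log(scanToken("oran =", 0).lexeme); // oran
```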

Getting started

Clone the repo and open index.html. That's it — no install, no build, no node_modules folder haunting your drive.

git clone https://github.com/ozdogrumerve/lexical-analyzer.git
cd lexical-analyzer

Then just open index.html in your browser. If you want a proper local server instead of a file:// URL:

# Python (usually already on your machine)
python3 -m http.server 5500

# or Node
npx serve .

Navigate to http://localhost:5500 and you're good to go. A default source snippet is pre-loaded so you can hit Analyze right away.

Project structure

├── index.html
├── css/
│   └── style.css
└── js/
    ├── lexer.js        DFA engine — tokenizer, transition table, step recorder
    ├── dfa-diagram.js  SVG diagram renderer and animation controller
    ├── stats.js        statistics computation and chart/table rendering
    └── ui.js           ties everything together, handles playback controls

lexer.js has zero DOM dependencies — it just takes a string and returns { tokens, steps }. Everything visual lives in the other three files.
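Based on that description, the return value presumably has a shape like the one below — the field names inside each token and step are assumptions inferred from the README, not the actual lexer.js output:

```javascript
// Hypothetical shape of the lexer's { tokens, steps } result for `x = 5`.
// Field names (type, lexeme, line, col, state, char, next) are illustrative.
const result = {
  tokens: [
    { type: "IDENTIFIER", lexeme: "x", line: 1, col: 1 },
    { type: "ASSIGN",     lexeme: "=", line: 1, col: 3 },
    { type: "NUMBER",     lexeme: "5", line: 1, col: 5 },
  ],
  steps: [
    { state: "START", char: "x", next: "IN_ID" },
    // ...one entry per DFA transition, replayable forward and back
  ],
};

// Because lexer.js has no DOM dependencies, the same structure can be
// consumed in Node as easily as in the browser:
const errorCount = result.tokens.filter((t) => t.type === "UNKNOWN").length;
console.log(errorCount); // 0
```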

Controls

Control                        Action
Analiz Et (Analyze)            Start / pause animated playback
İleri / Geri (Forward / Back)  Step forward or backward one DFA transition at a time
Sıfırla (Reset)                Reset everything to the initial state
Speed slider                   Control playback speed (100 ms – 1500 ms per step)
Dosya Aç (Open File)           Load a .txt or .json file as source input
JSON Kaydet (Save JSON)        Export the full token list as JSON

Both the Editor and DFA Diagram tabs share the same playback state — switching tabs mid-animation keeps everything in sync.


⭐ Star this repo if you find it helpful!

Made with ❤️ by Merve Özdoğru

Turing would not be impressed, but it works

About

This project implements a DFA (Deterministic Finite Automaton)-based lexical analyzer. It analyzes source code and breaks it down into tokens while visualizing DFA state transitions. The tool also provides statistical insights, including token counts, unique lexemes, and error rates.
