TransPYler is a compiler-like project designed to translate a simplified subset of Python, called Fangless Python, into a target programming language (still to be defined).
The project is divided into multiple stages of compilation:
- Lexical Analysis (Lexer) — ✅ implemented
- Syntactic Analysis (Parser) — ✅ implemented
- Semantic Analysis — 🚧 future work
- Transpilation/Code Generation — 🚧 future work
At this stage, the project includes the Lexer and Parser, which together can scan Fangless Python source code, produce a stream of tokens, and construct an Abstract Syntax Tree (AST) representing the program structure.
- Implemented:
  - Lexer for Fangless Python using PLY (Python Lex-Yacc)
  - Parser that constructs an AST from tokenized input
  - AST visualization tools (Rich, ASCII diagrams, Mermaid)
- Pending: Semantic checks and transpilation to the target language
This README serves as a reference for the full transcompiler. Usage examples and tests cover both the Lexer and Parser implementations.
The lexer provides the following features:

- Recognizes keywords (`if`, `else`, `elif`, `while`, `for`, `def`, `return`, `class`, `True`, `False`, `None`, `and`, `or`, `not`, `in`, `break`, `continue`, `pass`, ...)
- Identifies identifiers, numeric and string literals, and operators (`+`, `-`, `*`, `/`, `//`, `%`, `**`, `==`, `!=`, `<`, `>`, `<=`, `>=`, `=`, `+=`, `-=`, `*=`, `/=`, `//=`, `%=`, `**=`, ...)
- Supports delimiters: `(` `)` `[` `]` `{` `}` `:` `,` `.`
- Handles comments starting with `#`
- Detects indentation levels, generating special `INDENT` and `DEDENT` tokens
- Reports lexical errors (unknown characters, invalid escapes, indentation mistakes)
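For readers unfamiliar with PLY, token rules are declared as regular-expression strings and functions that PLY assembles into a scanner. The sketch below is a minimal, self-contained illustration of that pattern, not the project's actual rules (those live in `src/lexer/`, and the token names here are assumptions):

```python
# Minimal PLY illustration; the real Fangless Python rules live in src/lexer/.
import ply.lex as lex

tokens = ("ID", "NUMBER", "PLUS", "ASSIGN")

t_PLUS = r"\+"    # simple tokens: one regex string per token
t_ASSIGN = r"="
t_ignore = " "    # skip bare spaces (the real lexer tracks indentation)

def t_NUMBER(t):  # function rules can transform the matched lexeme
    r"\d+"
    t.value = int(t.value)
    return t

def t_ID(t):
    r"[A-Za-z_][A-Za-z0-9_]*"
    return t

def t_error(t):   # PLY calls this on any unmatched character
    print(f"Illegal character {t.value[0]!r}")
    t.lexer.skip(1)

lexer = lex.lex()
lexer.input("x = 1 + 2")
for tok in lexer:
    print(tok.type, tok.value)  # ID x, ASSIGN =, NUMBER 1, PLUS +, NUMBER 2
```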
The parser provides the following features:

- Constructs an Abstract Syntax Tree (AST) from token streams
- Supports expressions:
  - Literals (numbers, strings, booleans, None)
  - Binary operators (arithmetic, logical, comparison)
  - Unary operators (negation, logical NOT)
  - Data structures (tuples, lists, dictionaries, sets)
  - Function calls, attribute access, subscripting
  - Slicing notation (`[start:stop:step]`)
- Supports statements:
  - Assignments (simple and augmented: `=`, `+=`, `-=`, etc.)
  - Control flow (`if`/`elif`/`else`, `while`, `for`)
  - Function and class definitions
  - `return`, `break`, `continue`, `pass`
- Implements operator precedence following Python rules (see the example after this list)
- Reports syntax errors with contextual error messages
- Provides AST visualization in multiple formats
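For example, with Python's precedence rules `2 + 3 * 4` parses so that `*` binds tighter than `+`; using the node names from the AST section below, the tree nests as:

```
        BinaryExpr (+)
        /            \
LiteralExpr: 2    BinaryExpr (*)
                  /            \
          LiteralExpr: 3   LiteralExpr: 4
```

The `--view diagram` example later in this README renders the same expression as an ASCII tree.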
- Python 3.x
- Git + GitHub
- PLY (Python Lex-Yacc)
- Rich (optional, for enhanced AST visualization)
```bash
# Clone the repository
git clone https://github.com/Andres-QG/TransPYler.git
cd TransPYler

# (Optional) virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt
```

To run the manual lexer tester:

```bash
python -m src.testers.manual_tester <test> <expect>
```

- `test`: Path to a file containing Fangless Python (`.flpy`) code for testing
- `expect`: Path to a file containing the expected sequence of tokens
Test (`strings_and_indent.flpy`):

```python
# Function
def f():
    s1 = "Quote\"mark"
    s2 = 'Back\\slash'
    s3 = ''
    return s1
```

Expected Tokens (`strings_and_indent.expect`):

```
DEF "def"
ID "f"
LPAREN "("
RPAREN ")"
COLON ":"
INDENT
ID "s1"
ASSIGN "="
STRING "Quote"mark"
ID "s2"
ASSIGN "="
STRING "Back\slash"
ID "s3"
ASSIGN "="
STRING ""
RETURN "return"
ID "s1"
DEDENT
```

Command:

```bash
python -m src.testers.manual_tester strings_and_indent.flpy strings_and_indent.expect
```

Output:

```
✅ Test passed: All tokens match expected output
```
The parser can generate and visualize ASTs from Fangless Python source code.
```bash
python -m src.tools.ast_cli [--expr EXPRESSION | --file PATH] [--out JSON_PATH] [--view {expr,generic,diagram,mermaid}] [--unwrap-expr]
```

- `--expr EXPRESSION`: Parse an inline expression
- `--file PATH`: Parse a source file (`.py`/`.flpy`)
- `--out JSON_PATH`: Output path for the AST JSON (default: `ast.json` in the repo root)
- `--view {expr,generic,diagram,mermaid}`: Visualization format (default: `expr`)
  - `expr`: Expression-focused tree view (requires Rich)
    - Note: This view is optimized for pure expressions (e.g., `2 + 3`, `foo(bar)`). When visualizing statements (Module, FunctionDef, Assign, etc.), it falls back to the generic view, so both views will appear identical for full programs.
  - `generic`: Generic AST tree view (requires Rich)
  - `diagram`: ASCII art tree diagram
  - `mermaid`: Mermaid diagram syntax (saved to a `.mmd` file)
- `--unwrap-expr`: Return the bare expression when the input is a single expression
  - Only unwraps when the AST is `Module` → `ExprStmt` → expression. Has no effect on statements like function definitions.
Parse an inline expression:

```bash
python -m src.tools.ast_cli --expr "2 + 3 * 4" --view diagram
```

Parse a file and view as a Rich tree:

```bash
python -m src.tools.ast_cli --file tests/parser/test_parser_ast.flpy --view expr
```

Generate a Mermaid diagram:

```bash
python -m src.tools.ast_cli --file tests/parser/test_parser_ast.flpy --view mermaid
```

Parse and save to a specific location:

```bash
python -m src.tools.ast_cli --expr "x = [1, 2, 3]" --out output/my_ast.json
```
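The CLI can also be driven from Python; the sketch below shells out to it and loads the resulting JSON. It assumes you run from the repo root and that `ast.json` is the default output path (per the options above); the JSON's internal schema is not documented here, so the final print is deliberately schema-agnostic:

```python
# Sketch: invoke the AST CLI and read back its JSON output.
# Assumes the repository root as the working directory; the JSON layout
# is whatever src.tools.ast_cli writes and may differ between versions.
import json
import subprocess
import sys

subprocess.run(
    [sys.executable, "-m", "src.tools.ast_cli", "--expr", "2 + 3 * 4"],
    check=True,  # raise if the CLI exits non-zero
)

with open("ast.json", encoding="utf-8") as f:
    ast = json.load(f)

# Print the top-level shape without assuming any particular schema.
print(type(ast).__name__, list(ast) if isinstance(ast, dict) else ast)
```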
Input code:

```python
# comment
def fun(a,b):
    """
    Docstring
    """
    if a < b:
        print("Hello World! \n")
```

Rich AST:
Mermaid AST:

```mermaid
graph TD
N0["Module"]
N0 --> N1
N1["FunctionDef: fun"]
N1 --> N2
N2["Identifier: a"]
N1 --> N3
N3["Identifier: b"]
N1 --> N4
N4["Block"]
N4 --> N5
N5["ExprStmt: LiteralExpr(line=4, col=29,..."]
N5 --> N6
N6["LiteralExpr: '\n Docstring\n '"]
N4 --> N7
N7["If"]
N7 --> N8
N8["ComparisonExpr (<)"]
N8 --> N9
N9["Identifier: a"]
N8 --> N10
N10["Identifier: b"]
N7 --> N11
N11["Block"]
N11 --> N12
N12["ExprStmt: CallExpr(line=6, col=82, ca..."]
N12 --> N13
N13["CallExpr"]
N13 --> N14
N14["Identifier: print"]
N13 --> N15
N15["LiteralExpr: 'Hello World! \n'"]
```
TransPYler/
├── src/
│ ├── core/
│ │ ├── __init__.py
│ │ ├── ast/
│ │ │ ├── __init__.py
│ │ │ ├── ast_base.py
│ │ │ ├── ast_definitions.py
│ │ │ ├── ast_expressions.py
│ │ │ └── ast_statements.py
│ │ ├── symbol_table.py
│ │ └── utils.py
│ │
│ ├── lexer/
│ │ ├── __init__.py
│ │ ├── indentation.py
│ │ ├── lexer.py
│ │ └── tokens.py
│ │
│ ├── parser/
│ │ ├── __init__.py
│ │ ├── parser.py
│ │ ├── parser_blocks.py
│ │ ├── parser_conditionals.py
│ │ ├── parser_definitions.py
│ │ ├── parser_expressions.py
│ │ ├── parser_loops.py
│ │ ├── parser_statements.py
│ │ └── parser_utils.py
│ │
│ ├── testers/
│ │ ├── __init__.py
│ │ ├── lexer/
│ │ └── parser/
│ │
│ ├── tools/
│ │ ├── __init__.py
│ │ ├── ast_cli.py
│ │ └── ast_viewer.py
│ │
│ └── __init__.py
│
├── tests/
│ ├── lexer/
│ └── parser/
│
├── doc/
│ ├── lexer_design.md
│ └── parser_design.md
│
├── .gitignore
├── pytest.ini
├── README.md
└── requirements.txt
```
Read about TransPYler's lexer design [here](doc/lexer_design.md).

Read about TransPYler's parser design [here](doc/parser_design.md).
The parser generates an AST that represents the hierarchical structure of Fangless Python programs. The AST consists of various node types:
Expression nodes:

- `LiteralExpr`: Numeric, string, boolean, and None literals
- `Identifier`: Variable and function names
- `UnaryExpr`: Unary operations (`-x`, `not y`)
- `BinaryExpr`: Binary operations (`x + y`, `a and b`)
- `ComparisonExpr`: Comparison operations (`x < y`, `a == b`)
- `CallExpr`: Function calls (`func(args)`)
- `TupleExpr`, `ListExpr`, `SetExpr`, `DictExpr`: Collection literals
- `Attribute`: Attribute access (`obj.attr`)
- `Subscript`: Subscripting and slicing (`list[0]`, `list[1:5:2]`)

Statement nodes:

- `Assign`: Assignment statements (including augmented assignments)
- `ExprStmt`: Expression statements
- `Return`: Return statements
- `Break`, `Continue`, `Pass`: Control flow statements
- `If`: Conditional statements with elif and else
- `While`: While loops
- `For`: For loops
- `FunctionDef`: Function definitions
- `ClassDef`: Class definitions
- `Block`: Statement blocks

Top level:

- `Module`: Top-level container representing a complete source file
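As an illustration of consuming such a tree, the sketch below defines two stand-in nodes as dataclasses and walks them generically. These stand-ins are assumptions for the example only; the real classes live in `src/core/ast/` and their fields may differ:

```python
# Stand-in nodes for illustration; the real definitions are in src/core/ast/.
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class LiteralExpr:
    value: object

@dataclass
class BinaryExpr:
    op: str
    left: object
    right: object

def walk(node, depth=0):
    """Print each node's class name, indenting children under parents."""
    print("  " * depth + type(node).__name__)
    for f in fields(node):
        child = getattr(node, f.name)
        if is_dataclass(child):  # recurse only into AST nodes
            walk(child, depth + 1)

walk(BinaryExpr("+", LiteralExpr(2),
                BinaryExpr("*", LiteralExpr(3), LiteralExpr(4))))
# BinaryExpr
#   LiteralExpr
#   BinaryExpr
#     LiteralExpr
#     LiteralExpr
```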
- Code and documentation are written in English.
- Git workflow:
  - Branch naming: `TASK_<#>_<BriefDescription>`
  - Contributions via Pull Requests only.
- Code must be clean, modular, and documented.
The test suite covers:

- Unit tests for token recognition
- Integration tests with Fangless Python snippets
- Error cases: invalid characters, indentation, escape sequences, syntax errors
- Parser tests for AST generation and correctness
This project uses pytest for testing.
- **Install dependencies**

  Make sure you have installed all requirements first:

  ```bash
  pip install -r requirements.txt
  ```

- **Run the full test suite**

  From the project root, run:

  ```bash
  pytest
  ```

  By default, pytest will automatically discover all tests with `test_` in their name.

- **Run tests with more detailed output**

  ```bash
  pytest -v
  ```

  The `-v` (verbose) flag shows each test name and its result.

- **Run a specific test file**

  ```bash
  pytest src/testers/test_lexer.py
  ```

- **Run parser tests specifically**

  ```bash
  pytest src/testers/parser/
  ```

- **Stop at the first failure**

  ```bash
  pytest -x
  ```
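For reference, a new test is just a `test_`-prefixed function that pytest discovers automatically. The sketch below builds a throwaway PLY lexer inline so it is self-contained; it is a hypothetical example, and the project's real tests exercise the actual modules under `src/lexer/` and `src/parser/` instead:

```python
# Hypothetical self-contained test; real tests target src/lexer/ directly.
import ply.lex as lex

tokens = ("ID", "ASSIGN", "NUMBER")
t_ID = r"[A-Za-z_][A-Za-z0-9_]*"
t_ASSIGN = r"="
t_NUMBER = r"\d+"
t_ignore = " "

def t_error(t):
    t.lexer.skip(1)

def test_simple_assignment():
    lexer = lex.lex()
    lexer.input("x = 1")
    got = [(tok.type, tok.value) for tok in lexer]
    assert got == [("ID", "x"), ("ASSIGN", "="), ("NUMBER", "1")]
```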
- Phase 1 — Lexer: ✅ Completed
- Phase 2 — Parser: ✅ Completed
  - AST construction from the token stream
  - Support for expressions, statements, and control flow
  - Operator precedence and associativity
  - Error reporting with context
  - AST visualization tools
- Phase 3 — Semantic Analysis: 🚧 Planned
  - Type checking
  - Symbol table management
  - Scope analysis
  - Semantic error detection
- Phase 4 — Code Generation: 🚧 Planned
  - Translate the Fangless Python AST into the target language
  - Optimization passes
  - Runtime library integration
| Name | Email | Role/Contribution |
|---|---|---|
| Andrés Quesada-González | andresquesadagon4@gmail.com | Lexer: operator and literal token definitions, documentation, project structure, test scripts, test cases, Python modules.<br>Parser: function and class definitions, syntax error handling, Mermaid AST viewer, Python modules, documentation |
| David Obando-Cortés | david.obandocortes@ucr.ac.cr | Lexer: indentation handling, keyword definitions.<br>Parser: |
| Randy Agüero-Bermúdez | randy.aguero@ucr.ac.cr | Lexer: testing, comment handling, identifier token definition and recognition.<br>Parser: |
