
Assembler #6

Merged
LunaStev merged 6 commits into wavefnd:master from LunaStev:assembler
Feb 2, 2026

LunaStev commented Feb 2, 2026

No description provided.

Add support for data directives (db, dw, dd, dq) and implement a
complete CLI interface for the assembler toolchain.

Changes:
- Refactor AST to rename LabelRef to Label and simplify MemoryOperand
  structure (scale and disp are now non-optional)
- Implement directive encoding for db/dw/dd/dq with proper byte layout
- Add comprehensive operand parsing including memory operands with
  base, index, scale, and displacement support
- Implement directive parsing with support for numbers, strings, and
  identifiers
- Add tokenization for multiply operator (*) needed for memory scaling
- Create CLI module with command routing and argument parsing
- Implement asm command with architecture selection (--amd64/--aarch64)
- Add debug mode (--debug-whale) with developer options:
  - Token, AST, and byte array inspection
  - Hex and binary dump output
  - JSON export
  - Statistics and execution tracing
- Add comprehensive CLI documentation with usage examples
- Include test assembly file demonstrating directive usage
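
As a rough illustration of the db/dw/dd/dq byte layout: x86-64 is little-endian, so each value is emitted least-significant byte first. This is a sketch, not the code from this PR. `DataWidth` and `encode_data` are hypothetical names, and only numeric values are handled (strings and identifiers are left out).

```rust
/// Width of a data directive (hypothetical type, not taken from the PR).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DataWidth {
    Db, // 1 byte
    Dw, // 2 bytes
    Dd, // 4 bytes
    Dq, // 8 bytes
}

impl DataWidth {
    /// Map a directive mnemonic to its width.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "db" => Some(Self::Db),
            "dw" => Some(Self::Dw),
            "dd" => Some(Self::Dd),
            "dq" => Some(Self::Dq),
            _ => None,
        }
    }

    /// Number of bytes emitted per value.
    pub fn size(self) -> usize {
        match self {
            Self::Db => 1,
            Self::Dw => 2,
            Self::Dd => 4,
            Self::Dq => 8,
        }
    }
}

/// Emit each value as `width.size()` little-endian bytes.
/// Values that do not fit are silently truncated in this sketch;
/// a real assembler would more likely report an error.
pub fn encode_data(width: DataWidth, values: &[u64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * width.size());
    for &v in values {
        // to_le_bytes() yields 8 bytes, low byte first; keep the low `size` bytes.
        out.extend_from_slice(&v.to_le_bytes()[..width.size()]);
    }
    out
}
```

With these definitions, `encode_data(DataWidth::Dw, &[0x1234])` yields the bytes `34 12`, and `dd 0xDEADBEEF` would be laid out as `EF BE AD DE`.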

The assembler now functions as a complete command-line tool with
both a user-friendly quiet mode and a developer-focused debug mode.

Extend register handling to support 32-bit, 16-bit, and 8-bit registers
in addition to existing 64-bit registers.

Changes:
- Add REGISTERS_32 lookup table with eax, ebx, ecx, edx, esp, ebp, esi, edi
- Refactor lookup_reg() to accept bit-width parameter (32 or 64)
- Extend is_register() validator to recognize:
  - 32-bit registers (eax, ebx, ecx, edx, esi, edi, ebp, esp)
  - 16-bit registers (ax, bx, cx, dx, si, di, bp, sp)
  - 8-bit registers (al, bl, cl, dl, ah, bh, ch, dh)
- Clean up unused import (ParserError)
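
To make the width-aware lookup concrete, here is a minimal sketch that maps a register name to its bit width and 3-bit hardware number. `reg_info` is a hypothetical helper; the actual `lookup_reg()` and `is_register()` in this PR may be shaped differently. The register numbers themselves follow the standard x86 encoding order.

```rust
/// Return (width in bits, 3-bit hardware register number) for a legacy
/// register name, or None if the name is not recognized.
/// Hypothetical helper; not the PR's lookup_reg().
pub fn reg_info(name: &str) -> Option<(u8, u8)> {
    // Each table is indexed by the register's hardware number (0-7).
    const R64: [&str; 8] = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"];
    const R32: [&str; 8] = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"];
    const R16: [&str; 8] = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"];
    // Without a REX prefix, numbers 4-7 select ah, ch, dh, bh.
    const R8: [&str; 8] = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"];

    let tables: [(u8, &[&str; 8]); 4] = [(64, &R64), (32, &R32), (16, &R16), (8, &R8)];
    for (bits, table) in tables {
        if let Some(code) = table.iter().position(|r| *r == name) {
            return Some((bits, code as u8));
        }
    }
    None
}
```

Note that the hardware order (a, c, d, b) differs from the alphabetical order used in the lists above, so `ebx` maps to 3 rather than 1.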

This lays the groundwork for supporting multi-width operands in
instruction encoding.

Add low-level encoding structures for ModR/M, SIB, REX prefix, and
memory addressing in the AMD64 assembler.

Changes:
- Add ModRM struct with mod, reg, and rm fields for operand encoding
- Add SIB (Scale-Index-Base) struct for complex memory addressing
- Add REX prefix struct for 64-bit operand size and register extensions
- Implement address encoding logic supporting:
  - Register-indirect addressing [reg]
  - Displacement modes (disp8, disp32)
  - Special handling for the rbp base register, which has no mod=00
    form and needs an explicit displacement
- Add encoding module with public exports for all primitives
- Implement encode() methods returning proper byte representation

These primitives form the foundation for complete x86-64 instruction
encoding with proper operand and addressing mode support.

Add support for register-to-register mov instruction encoding with
proper REX prefix and ModR/M byte handling.

Changes:
- Implement mov r64, r64 encoding in encoder.rs:
  - Generate REX.W prefix for 64-bit operand size
  - Set REX.R for source registers r8-r15
  - Set REX.B for destination registers r8-r15
  - Emit opcode 0x89 (MOV r/m64, r64)
  - Construct ModR/M byte with mod=11b for register mode
- Refactor encode_address() to return structured EncodedAddress:
  - Replace ModRM/SIB structs with raw bit fields
  - Add DispKind enum for displacement size (Disp8/Disp32)
  - Include REX extension flags (rex_b, rex_x) in result
  - Add comprehensive documentation for addressing modes
  - Explicitly mark unimplemented features (index, scale, SIB)
- Add test case with mov rax, rbx and mov r8, r9
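
For readers unfamiliar with the addressing rules, the following sketch shows one way a base-plus-displacement address could be turned into raw fields. `EncodedAddress` and `DispKind` are named in the commit above, but every field and the `encode_base_disp` function are assumptions made for illustration. As in the commit, index, scale, and SIB are left unimplemented, so bases that require a SIB byte are rejected.

```rust
/// Displacement that follows the ModR/M byte (payload layout is assumed).
#[derive(Debug, PartialEq)]
pub enum DispKind {
    Disp8(i8),
    Disp32(i32),
}

/// Raw fields for a memory operand (field names are assumptions).
#[derive(Debug, PartialEq)]
pub struct EncodedAddress {
    pub mode: u8,    // ModR/M mod field
    pub rm: u8,      // ModR/M rm field (low 3 bits of the base register)
    pub rex_b: bool, // set when the base is r8-r15
    pub rex_x: bool, // set when the index is r8-r15 (always false here)
    pub disp: Option<DispKind>,
}

/// Encode [base + disp], where `base` is a register number 0-15.
pub fn encode_base_disp(base: u8, disp: i32) -> Result<EncodedAddress, String> {
    let low = base & 0b111;
    if low == 0b100 {
        // rsp/r12: rm=100 means "SIB byte follows", which this sketch omits.
        return Err("base rsp/r12 needs a SIB byte (not implemented)".to_string());
    }
    let (mode, disp) = if disp == 0 && low != 0b101 {
        // Plain [reg]. rbp/r13 are excluded because mod=00 with rm=101
        // means disp32-only (RIP-relative in 64-bit mode).
        (0b00, None)
    } else if (-128..=127).contains(&disp) {
        // Fits in a signed byte; also covers [rbp] as [rbp + 0].
        (0b01, Some(DispKind::Disp8(disp as i8)))
    } else {
        (0b10, Some(DispKind::Disp32(disp)))
    };
    Ok(EncodedAddress { mode, rm: low, rex_b: base >= 8, rex_x: false, disp })
}
```

Under these assumptions, `[rax]` produces mod=00 with no displacement, while `[rbp]` produces mod=01 with a zero disp8.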

The encoder now properly handles extended registers (r8-r15) and
generates correct machine code for basic register moves.
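
The register-move encoding described above can be condensed into a short sketch. `encode_mov_r64_r64` is a hypothetical function, not the one in encoder.rs, and it takes register numbers (rax=0, rcx=1, rdx=2, rbx=3, ..., r8=8, ..., r15=15) rather than parsed operands.

```rust
/// Encode `mov dst, src` for two 64-bit registers using opcode 0x89
/// (MOV r/m64, r64): the source goes in ModRM.reg, the destination in ModRM.rm.
/// Hypothetical helper; register arguments are hardware numbers 0-15.
pub fn encode_mov_r64_r64(dst: u8, src: u8) -> [u8; 3] {
    // REX = 0100 W R X B. W=1 for 64-bit operands, R extends the source
    // (ModRM.reg), B extends the destination (ModRM.rm).
    let rex = 0x48u8 | (((src >> 3) & 1) << 2) | ((dst >> 3) & 1);
    // mod=11b selects register-direct mode.
    let modrm = (0b11u8 << 6) | ((src & 0b111) << 3) | (dst & 0b111);
    [rex, 0x89, modrm]
}
```

For the two instructions in the test case, this gives `48 89 D8` for `mov rax, rbx` (dst=0, src=3) and `4D 89 C8` for `mov r8, r9` (dst=8, src=9).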
LunaStev self-assigned this Feb 2, 2026
LunaStev merged commit d356cf1 into wavefnd:master Feb 2, 2026
1 check passed