A lightweight 6502 assembler in Rust designed for programmatic use, with optional human-readable listings.

Assembler6502

A minimal 6502 assembler written in Rust for inline assembly and lightweight compilation. It generates raw machine code from 6502 assembly source and optionally prints a human-readable assembly listing when the listing feature is enabled.

Features

  • Multiple number formats:
    • Hexadecimal: $FF, 0xFF, 0xFFh
    • Binary: %11111111, 0b11111111
    • Decimal: 255
    • Extended range: Values up to 4,294,967,295 (u32) for calculations
  • Expression arithmetic:
    • Addition: $10+5, LABEL+1
    • Subtraction: $FF-10, *-2
    • Multiplication: 10*2
    • Division: 100/5
    • Mixed formats: $10+10+%00000101
    • Operator precedence: *,/ before +,-
    • Parentheses: ($10000-$100)
  • Low/High byte extraction:
    • <expr - Extract low byte (bits 0-7)
    • >expr - Extract high byte (bits 8-15)
    • Works with any expression: #<($10000-RAM_SIZE)
  • Constants: Define reusable values with LABEL = value syntax
    • Simple: SCREEN = $0400
    • Expressions: OFFSET = BASE+$10
    • Current address: HERE = *, NEXT = *+1
    • Memory calculations: TOP_MEM = $10000-$100
  • Modern directives:
    • .byte - Comma-separated bytes
    • .word - 16-bit words (little-endian)
    • .string - ASCII text strings
    • .incbin - Include binary files
  • Label arithmetic: Use labels in expressions (LDA buffer+1, JMP start+3)
  • Current address symbol: * represents the current program counter
  • Addressing mode control:
    • Auto-detection of Zero Page vs Absolute
    • Explicit override with operand prefixes:
      • <$80 → force Zero Page addressing
      • >$80 → force Absolute addressing
  • Adaptive long-branch expansion: Out-of-range branches automatically become BRANCH skip + JMP target
  • Whitespace-friendly: Spaces allowed in operands: LDA #<($10000 - $100)
  • Optional listing output: Print to stdout and/or save to file (feature-gated)
  • Symbol table & address mapping helpers
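
As a rough illustration of the Zero Page vs Absolute auto-detection listed above, the sketch below picks an encoding for LDA from the size of the operand. The names `Force` and `encode_lda` are hypothetical and not part of the asm6502 API, and rejecting a forced Zero Page operand that does not fit in one byte is an assumption. The opcodes $A5 (LDA zero page) and $AD (LDA absolute) are standard 6502 values.

```rust
/// Hypothetical override, mirroring the `<` / `>` operand prefixes.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Force {
    Auto,
    ZeroPage,
    Absolute,
}

/// Encode `LDA addr` as zero page ($A5) or absolute ($AD).
/// Returns None when zero page is forced but the address needs two bytes
/// (an assumption; the real assembler may report this differently).
pub fn encode_lda(addr: u16, force: Force) -> Option<Vec<u8>> {
    let fits_zp = addr <= 0x00FF;
    let use_zp = match force {
        Force::Auto => fits_zp,
        Force::ZeroPage => {
            if !fits_zp {
                return None;
            }
            true
        }
        Force::Absolute => false,
    };
    if use_zp {
        Some(vec![0xA5, addr as u8])
    } else {
        // Absolute operands are stored low byte first.
        Some(vec![0xAD, (addr & 0xFF) as u8, (addr >> 8) as u8])
    }
}
```

With these definitions, `encode_lda(0x80, Force::Auto)` yields the two-byte form, while `Force::Absolute` yields the three-byte form for the same address.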

Syntax Guide

Number Formats

LDA #$FF            ; Hexadecimal
LDA #255            ; Decimal
LDA #%11111111      ; Binary
LDA #0xFF           ; Alternative hex format
LDA #0b11111111     ; Alternative binary format
LDA #$10000         ; Extended range for calculations (65536)
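
For readers who want to see how such literals map to values, here is a minimal standalone sketch of a literal parser. It is not the crate's actual parser; `parse_number` and `digits` are hypothetical names, and only the `$`, `0x`, `%`, `0b`, and decimal forms are handled.

```rust
/// Parse a run of digits in the given radix; None if empty, invalid, or too large for u32.
fn digits(s: &str, radix: u32) -> Option<u32> {
    if s.is_empty() || !s.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    u32::from_str_radix(s, radix).ok()
}

/// Parse one numeric literal into a u32 (hypothetical helper, not the asm6502 API).
pub fn parse_number(text: &str) -> Option<u32> {
    let t = text.trim();
    if let Some(hex) = t.strip_prefix('$') {
        digits(hex, 16)
    } else if let Some(hex) = t.strip_prefix("0x") {
        digits(hex, 16)
    } else if let Some(bin) = t.strip_prefix('%') {
        digits(bin, 2)
    } else if let Some(bin) = t.strip_prefix("0b") {
        digits(bin, 2)
    } else {
        digits(t, 10)
    }
}
```

Returning a `u32` rather than a `u16` is what lets a literal such as `$10000` survive until a later subtraction brings it back into the 16-bit range.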

Low/High Byte Operators

; Extract low byte (<)
    LDA #<$1234         ; Low byte = $34
    LDA #<SCREEN        ; Low byte of address
    
; Extract high byte (>)
    LDA #>$1234         ; High byte = $12
    LDA #>SCREEN        ; High byte of address
    
; Common pattern: Load 16-bit address
BUFFER = $2000
    LDA #<BUFFER        ; Low byte
    STA $FC
    LDA #>BUFFER        ; High byte
    STA $FD
    
; With expressions (spaces inside the parentheses are also accepted)
    LDA #<($10000-$100) ; Low byte of $FF00 = $00
    LDX #>($10000-$100) ; High byte of $FF00 = $FF
    
; Real-world memory calculations
RAM_SIZE = $0200
TOP_ADDR = $10000-RAM_SIZE  ; = $FE00
    LDA #<TOP_ADDR      ; $00
    LDX #>TOP_ADDR      ; $FE
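
The byte extraction performed by `<` and `>` can be sketched in a few lines of Rust. The function names `lo` and `hi` are illustrative only; the bit ranges follow the description above (bits 0-7 and bits 8-15).

```rust
/// Low byte (bits 0-7) of a value, matching the described `<` operator.
pub fn lo(value: u32) -> u8 {
    (value & 0xFF) as u8
}

/// High byte (bits 8-15) of a value, matching the described `>` operator.
/// Bits 16 and above are discarded.
pub fn hi(value: u32) -> u8 {
    ((value >> 8) & 0xFF) as u8
}
```

For example, `lo(0x10000 - 0x0200)` is `0x00` and `hi(0x10000 - 0x0200)` is `0xFE`, the same values as the TOP_ADDR lines above.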

Expressions

; Simple arithmetic
LDA #$02+1          ; = $03
LDA #10*2           ; = 20
LDA #100/5          ; = 20
LDA #$FF-10         ; = $F5

; Complex expressions (spaces allowed in operands)
LDA #10*2+5         ; = 25 (precedence: * before +)
LDA #($FF - $10)    ; = $EF (parentheses supported)
LDA #$10+10+%00000101  ; Mixed formats = $1F

; Extended range calculations
LDA #<($10000 - RAM_SIZE)  ; Top-of-memory calculations
LDX #>($10000 - $500 + $100) ; Complex address math

; With labels
LDA buffer+1        ; Address of buffer + 1
JMP start+3         ; Jump to start + 3 bytes
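
The precedence rules above can be illustrated with a small recursive-descent evaluator. This is a sketch under stated assumptions, not the crate's evaluator: `Parser` and `eval` are hypothetical names, only `$hex`, `%binary`, and decimal literals are accepted, and labels, `*` as the current address, and the `<`/`>` operators are left out. Arithmetic wraps at 32 bits and division by zero is reported as `None`.

```rust
/// Minimal recursive-descent evaluator over the bytes of an expression.
struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    /// Skip spaces and return the next byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while self.pos < self.src.len() && self.src[self.pos] == b' ' {
            self.pos += 1;
        }
        self.src.get(self.pos).copied()
    }

    // expr := term (('+' | '-') term)*
    fn expr(&mut self) -> Option<u32> {
        let mut acc = self.term()?;
        while let Some(op) = self.peek() {
            if op != b'+' && op != b'-' {
                break;
            }
            self.pos += 1;
            let rhs = self.term()?;
            acc = if op == b'+' { acc.wrapping_add(rhs) } else { acc.wrapping_sub(rhs) };
        }
        Some(acc)
    }

    // term := factor (('*' | '/') factor)*   (binds tighter than + and -)
    fn term(&mut self) -> Option<u32> {
        let mut acc = self.factor()?;
        while let Some(op) = self.peek() {
            if op != b'*' && op != b'/' {
                break;
            }
            self.pos += 1;
            let rhs = self.factor()?;
            acc = if op == b'*' { acc.wrapping_mul(rhs) } else { acc.checked_div(rhs)? };
        }
        Some(acc)
    }

    // factor := '(' expr ')' | '$' hex | '%' binary | decimal
    fn factor(&mut self) -> Option<u32> {
        let radix = match self.peek()? {
            b'(' => {
                self.pos += 1;
                let v = self.expr()?;
                if self.peek()? != b')' {
                    return None;
                }
                self.pos += 1;
                return Some(v);
            }
            b'$' => { self.pos += 1; 16 }
            b'%' => { self.pos += 1; 2 }
            _ => 10,
        };
        let start = self.pos;
        while self.pos < self.src.len() && (self.src[self.pos] as char).is_digit(radix) {
            self.pos += 1;
        }
        let digits = std::str::from_utf8(&self.src[start..self.pos]).ok()?;
        u32::from_str_radix(digits, radix).ok()
    }
}

/// Evaluate an expression; None on a syntax error or division by zero.
pub fn eval(text: &str) -> Option<u32> {
    let mut p = Parser { src: text.as_bytes(), pos: 0 };
    let v = p.expr()?;
    if p.peek().is_some() {
        return None; // unconsumed trailing input
    }
    Some(v)
}
```

Splitting the grammar into `expr`, `term`, and `factor` is what gives `*` and `/` priority over `+` and `-` while keeping evaluation left to right within each level.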

Constants

; Simple constants
SCREEN = $0400
SPRITE_X = 100
MAX_LIVES = 3

; Expression constants
BASE = $1000
OFFSET = BASE+$10   ; = $1010
DOUBLE = SPRITE_X*2 ; = 200

; Low/High byte constants
SCREEN_LO = <SCREEN ; = $00
SCREEN_HI = >SCREEN ; = $04

; Memory calculations
RAM_SIZE = $0200
TOP_MEM = $10000-RAM_SIZE ; = $FE00 (top of memory minus size)
RAM_LO = <TOP_MEM   ; = $00
RAM_HI = >TOP_MEM   ; = $FE

; Current address (*) constants
HERE = *            ; Current program counter
NEXT = *+1          ; Current PC + 1
SKIP = *+3          ; Skip next instruction

; Usage examples
    LDA #SPRITE_X
    STA SCREEN
    JMP HERE        ; Infinite loop

; Practical use of *+offset
skip_target:
RETURN_ADDR = *+3   ; Address of the instruction after the JSR
    JSR subroutine  ; JSR is 3 bytes
    NOP             ; This is at RETURN_ADDR
    
; Conditional skip pattern
    LDA flag
    BEQ skip_load   ; If zero, skip the load
    LDA #$42        ; This gets skipped if flag=0
skip_load:
    STA result
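
The memory calculations above rely on intermediate values that do not fit in 16 bits. The following sketch shows why a wider type is needed and how the result is narrowed afterwards; `fits_in_u16` and `top_of_memory` are hypothetical helpers, not functions from the asm6502 crate.

```rust
/// True when a value can be stored in a 16-bit address without truncation.
pub fn fits_in_u16(value: u32) -> bool {
    value <= 0xFFFF
}

/// Compute `$10000 - ram_size` in 32 bits, then keep the low 16 bits as the address.
pub fn top_of_memory(ram_size: u32) -> u16 {
    let full: u32 = 0x1_0000u32.wrapping_sub(ram_size);
    (full & 0xFFFF) as u16
}
```

Here `$10000` itself does not fit in a `u16`, but `top_of_memory(0x0200)` gives `0xFE00`, matching the TOP_MEM constant in the listing above.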

Directives

Origin and Data

*=$0800             ; Set origin (ORG)
DCB $01 $02 $03     ; Define bytes (space-separated, legacy)

Modern Data Directives

; .byte - Comma-separated bytes
.byte $FF,$FE,$FD
.byte $01,$02,$03,$04

; .word - 16-bit words (little-endian)
.word $1234         ; Assembles to: $34 $12
.word $1234,$5678   ; Assembles to: $34 $12 $78 $56

; .string - ASCII text
.string "HELLO"     ; Assembles to: $48 $45 $4C $4C $4F
.string "6502 ASM"

; .incbin - Include binary file
.incbin "data.bin"  ; Includes entire file as bytes
.incbin "sprite.dat"
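
To make the byte layouts above concrete, here is a small sketch of how `.word` and `.string` data could be encoded. `encode_words` and `encode_string` are hypothetical names; the assumption that `.string` emits no terminator byte is taken from the "HELLO" example above, which lists exactly five bytes.

```rust
/// Encode 16-bit words low byte first, as `.word` is described to do.
pub fn encode_words(words: &[u16]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

/// Encode an ASCII string as raw bytes with no terminator.
/// Returns None if any character is outside ASCII.
pub fn encode_string(text: &str) -> Option<Vec<u8>> {
    if text.is_ascii() {
        Some(text.as_bytes().to_vec())
    } else {
        None
    }
}
```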

Data Directive Comparison

; Old style (still supported)
DCB $01 $02 $03     ; Space-separated

; New style (recommended)
.byte $01,$02,$03   ; Comma-separated

Labels

start:              ; Define label
    LDA #$42
    JMP start       ; Forward/backward references work

buffer:
    DCB $00 $00
    LDA buffer+1    ; Label arithmetic

Complete Example with Memory Calculations

*=$0801

; BASIC stub: SYS 2061
.byte $0B,$08,$0A,$00,$9E,$32,$30,$36,$31,$00,$00,$00

; Zero page pointers
LZSA_SRC = $FC      ; Source pointer (2 bytes)
LZSA_DST = $FE      ; Destination pointer (2 bytes)

; Calculate sizes
RAM_DATA_SIZE = $0150
RELOCATED_SIZE = $00BD

; Top-of-memory calculations
TOP_MEM = $10000
RAM_BLOCK = TOP_MEM - RAM_DATA_SIZE ; = $FEB0
RELOC_ADDR = RAM_BLOCK + RELOCATED_SIZE ; = $FF6D

start:
    ; Load source address (little-endian)
    LDA #<data_block
    STA LZSA_SRC
    LDA #>data_block
    STA LZSA_SRC+1
    
    ; Load destination at top of memory
    LDA #<RAM_BLOCK
    STA LZSA_DST
    LDA #>RAM_BLOCK
    STA LZSA_DST+1
    
    JSR decompress
    RTS

data_block:
    .incbin "compressed.lzsa"
    
decompress:
    ; Decompression routine here...
    RTS
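
The BASIC stub in this example only works if the SYS address matches the first byte after the stub. The sketch below checks that relationship, assuming a Commodore-style BASIC line layout (2-byte link, 2-byte line number, the $9E SYS token, ASCII digits, $00 terminator, $00 $00 end marker). `STUB`, `code_start`, and `sys_target` are hypothetical names used only for illustration.

```rust
/// Bytes of the BASIC stub from the example above.
pub const STUB: [u8; 12] = [
    0x0B, 0x08, 0x0A, 0x00, 0x9E, 0x32, 0x30, 0x36, 0x31, 0x00, 0x00, 0x00,
];

/// Address of the first byte after a stub loaded at `load_addr`.
pub fn code_start(load_addr: u16, stub: &[u8]) -> u16 {
    load_addr.wrapping_add(stub.len() as u16)
}

/// Decimal address written after the SYS token ($9E), up to the $00 terminator.
pub fn sys_target(stub: &[u8]) -> Option<u16> {
    let sys = stub.iter().position(|&b| b == 0x9E)?;
    let digits: String = stub[sys + 1..]
        .iter()
        .take_while(|&&b| b != 0)
        .map(|&b| b as char)
        .collect();
    digits.parse().ok()
}
```

With the stub loaded at $0801, `code_start` returns $080D, which is 2061 in decimal and equals the value `sys_target` reads from the stub, so `start:` is where SYS lands.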

Workspace Layout

This repository is a Cargo workspace with a library crate and a runnable example crate:

asm6502/                         # workspace root
├─ Cargo.toml                    # [workspace] only
├─ asm6502/                      # library crate (the assembler)
│  ├─ Cargo.toml
│  └─ src/
│     ├─ lib.rs
│     ├─ assembler.rs
│     ├─ opcodes.rs
│     ├─ symbol.rs
│     ├─ error.rs
│     ├─ addressing.rs
│     ├─ parser/
│     │  ├─ mod.rs
│     │  ├─ lexer.rs
│     │  ├─ expression.rs
│     │  └─ number.rs
│     └─ eval/
│        ├─ mod.rs
│        └─ expression.rs
└─ example/                      # example binary crate (tests/demo)
   ├─ Cargo.toml
   └─ src/
      └─ main.rs

Library crate asm6502

  • Exposes the public API (Assembler6502, AsmError, etc.)
  • Listing helpers are gated behind the Cargo feature listing

Using the library from another project

Add a Git dependency in your project's Cargo.toml:

[dependencies]
asm6502 = { git = "https://github.com/tommyo123/asm6502" }

Optionally enable the listing feature:

asm6502 = { git = "https://github.com/tommyo123/asm6502", features = ["listing"] }

Basic Example

use asm6502::Assembler6502;

fn main() -> Result<(), asm6502::AsmError> {
    let mut asm = Assembler6502::new();
    
    let code = r#"
        *=$0800
        SCREEN = $0400
        
        start:
            LDA #$42
            STA SCREEN
            JMP start
    "#;

    let bytes = asm.assemble_bytes(code)?;
    println!("Assembled {} bytes", bytes.len());
    
    Ok(())
}

Using Low/High Byte Operators

use asm6502::Assembler6502;

fn main() -> Result<(), asm6502::AsmError> {
    let mut asm = Assembler6502::new();
    
    let code = r#"
        *=$0800
        
        ; Define 16-bit address
        SCREEN = $D800
        
        ; Load address using low/high byte extraction
        load_address:
            LDA #<SCREEN    ; Low byte = $00
            STA $FB
            LDA #>SCREEN    ; High byte = $D8
            STA $FC
            
        ; Top-of-memory calculations
        RAM_SIZE = $0200
        TOP_ADDR = $10000-RAM_SIZE  ; = $FE00
        
            LDA #<TOP_ADDR  ; = $00
            LDX #>TOP_ADDR  ; = $FE
            RTS
    "#;

    let bytes = asm.assemble_bytes(code)?;
    println!("Assembled {} bytes", bytes.len());
    
    Ok(())
}

Using New Directives

use asm6502::Assembler6502;

fn main() -> Result<(), asm6502::AsmError> {
    let mut asm = Assembler6502::new();
    
    let code = r#"
        *=$0800
        
        ; Define data using modern directives
        message:
            .string "HELLO"
            
        values:
            .byte $FF,$FE,$FD,$FC
            
        addresses:
            .word $1000,$2000,$3000
            
        ; Use the data
        start:
            LDX #0
        loop:
            LDA message,X
            STA $0400,X
            INX
            CPX #5
            BNE loop
            RTS
    "#;

    let bytes = asm.assemble_bytes(code)?;
    println!("Assembled {} bytes", bytes.len());
    
    Ok(())
}

Advanced Example with Expressions

use asm6502::Assembler6502;

fn main() -> Result<(), asm6502::AsmError> {
    let mut asm = Assembler6502::new();
    asm.set_origin(0x1000);
    
    let code = r#"
        ; Constants with u32 calculations
        BASE = $2000
        OFFSET = BASE+$100
        COUNT = 10*2
        
        ; Top-of-memory calculations
        RAM_SIZE = $0500
        TOP_MEM = $10000-RAM_SIZE  ; = $FB00
        
        ; Code with expressions
        start:
            LDA #COUNT
            STA BASE
            LDA #<TOP_MEM
            STA $FC
            LDA #>TOP_MEM
            STA $FD
            JMP start
    "#;

    let (bytes, symbols) = asm.assemble_with_symbols(code)?;
    
    println!("Assembled {} bytes", bytes.len());
    println!("\nSymbols:");
    for (name, addr) in symbols.iter() {
        println!("  {} = ${:04X}", name, addr);
    }
    
    Ok(())
}

With Listing (feature-gated)

use asm6502::Assembler6502;

fn main() -> Result<(), asm6502::AsmError> {
    let mut asm = Assembler6502::new();
    
    let code = r#"
        *=$0800
        
        data:
            .byte $01,$02,$03
            .word $1234
            .string "HI"
            
        start:
            LDA #$42
            STA $0200
            RTS
    "#;

    #[cfg(feature = "listing")]
    {
        let (bytes, items) = asm.assemble_full(code)?;
        asm.print_assembly_listing(&items);
        asm.save_listing(&items, "output.lst")?;
    }
    
    #[cfg(not(feature = "listing"))]
    {
        let bytes = asm.assemble_bytes(code)?;
    }
    
    Ok(())
}

Example crate (this repository)

The workspace includes an example crate whose test suite demonstrates all features.

Run the example

From the workspace root:

# Run test suite
cargo run -p asm6502-example

# Run with listing enabled
cargo run -p asm6502-example --features listing

The test suite covers:

  1. Number formats (hex, decimal, binary)
  2. Expression arithmetic
  3. Label arithmetic
  4. Mixed number formats
  5. Constants (LABEL = value)
  6. Current address usage (*)
  7. New directives (.byte, .word, .string)
  8. U32 support and memory calculations
  9. Low/high byte operators
  10. Complete 6502 program (all addressing modes)

API Overview

Core Methods

// Simple assembly
fn assemble_bytes(&mut self, src: &str) -> Result<Vec<u8>, AsmError>

// Assembly with symbol table
fn assemble_with_symbols(&mut self, src: &str) 
    -> Result<(Vec<u8>, HashMap<String, u16>), AsmError>

// Full assembly with items (for listing)
fn assemble_full(&mut self, src: &str) 
    -> Result<(Vec<u8>, Vec<Item>), AsmError>

// Assembly with address mapping
fn assemble_with_addr_map(&mut self, src: &str)
    -> Result<(Vec<u8>, Vec<(usize, u16)>), AsmError>

// Configuration
fn set_origin(&mut self, addr: u16)
fn origin(&self) -> u16
fn reset(&mut self)

// Symbol inspection
fn symbols(&self) -> &HashMap<String, u16>
fn lookup(&self, name: &str) -> Option<u16>

// Binary output
fn write_bin<W: Write>(bytes: &[u8], w: W) -> io::Result<()>

Listing Methods (feature-gated)

#[cfg(feature = "listing")]
fn print_assembly_listing(&self, items: &[Item])

#[cfg(feature = "listing")]
fn save_listing(&self, items: &[Item], filename: &str) -> io::Result<()>

Directive Reference

Directive | Syntax         | Description                         | Example
*=        | *=$0800        | Set origin address                  | *=$C000
DCB       | DCB $01 $02    | Define bytes (space-separated)      | DCB $FF $00
.byte     | .byte $01,$02  | Define bytes (comma-separated)      | .byte $01,$02,$03
.word     | .word $1234    | Define 16-bit words (little-endian) | .word $1000,$2000
.string   | .string "text" | Define ASCII string                 | .string "HELLO"
.incbin   | .incbin "file" | Include binary file                 | .incbin "data.bin"
LABEL =   | CONST = $42    | Define constant                     | SCREEN = $0400

Operator Reference

Operator | Description           | Example     | Result
+        | Addition              | $10+5       | $15
-        | Subtraction           | $FF-10      | $F5
*        | Multiplication        | 10*2        | 20
/        | Division              | 100/5       | 20
<        | Low byte (bits 0-7)   | <$1234      | $34
>        | High byte (bits 8-15) | >$1234      | $12
*        | Current address       | LABEL=*     | Current PC
()       | Grouping              | ($10+$20)*2 | $60

Building & Docs

Build the workspace:

cargo build

Build only the library:

cargo build -p asm6502

Run tests:

cargo test

Generate local API docs:

cargo doc --open

Design Philosophy

This assembler is designed for:

  • Inline assembly: Quick compilation of small code snippets
  • JIT compilation: Runtime generation of 6502 code
  • Emulator testing: Dynamic test case generation
  • Educational tools: Interactive 6502 learning
  • Real-world 6502 projects: With extended u32 support for memory calculations
  • Simplicity: Minimal dependencies, clear code structure

It intentionally does not include:

  • Multi-file projects or linking
  • Macro systems
  • Complex multi-pass constant resolution
  • Object file formats

Constants and expressions are evaluated in source order, so a constant must be defined before it is used. Labels are the exception: they support forward references.
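
The define-before-use rule for constants can be pictured with a tiny symbol table that resolves each definition immediately. This is only a sketch: the `Constants` type, its methods, and the error text are hypothetical and do not reflect the crate's internals or its `AsmError` type.

```rust
use std::collections::HashMap;

/// Hypothetical symbol table that resolves constants strictly in source order.
#[derive(Default)]
pub struct Constants {
    values: HashMap<String, u32>,
}

impl Constants {
    /// Define `name = value` from a literal.
    pub fn define(&mut self, name: &str, value: u32) {
        self.values.insert(name.to_string(), value);
    }

    /// Define `name = base + offset`. `base` must already be defined,
    /// otherwise an error naming the missing symbol is returned.
    pub fn define_relative(&mut self, name: &str, base: &str, offset: u32) -> Result<u32, String> {
        let base_value = *self
            .values
            .get(base)
            .ok_or_else(|| format!("undefined constant: {}", base))?;
        let value = base_value.wrapping_add(offset);
        self.values.insert(name.to_string(), value);
        Ok(value)
    }
}
```

Defining `OFFSET = BASE+$10` before `BASE` fails in this model, while the same definition succeeds once `BASE = $1000` has been recorded. Supporting forward references for labels needs an extra pass because label addresses are only known after instruction sizes are settled.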

Technical Notes

  • Forward references: Labels support forward references, constants do not
  • Best practice: Define constants at the top of your source
  • Expression evaluation: Left-to-right with standard precedence (*, / before +, -)
  • Branch range: Automatic long-branch expansion for out-of-range branches
  • Word endianness: .word directive outputs little-endian (6502 native format)
  • String encoding: .string uses standard ASCII encoding
  • Binary inclusion: .incbin reads files relative to working directory
  • U32 support: Internal calculations use 32-bit unsigned integers, automatically wrapping to 16-bit for addresses
  • Memory calculations: Supports expressions like $10000-RAM_SIZE for top-of-memory calculations
  • Whitespace in operands: Spaces are allowed in operands (each line is split into mnemonic and operand at the first whitespace only)
  • Current address arithmetic:
    • LABEL = * captures the current program counter
    • LABEL = *+n is useful for calculating addresses of upcoming instructions
    • Example: RETURN_ADDR = *+3 before a JSR captures the address where execution resumes after the 3-byte JSR returns
    • Can be used for self-modifying code or address table generation
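
The long-branch expansion mentioned in these notes is commonly implemented as an inverted branch that hops over a JMP. The sketch below shows that standard technique; the README does not confirm the exact bytes the crate emits, and `encode_branch` is a hypothetical name. The opcode facts used (conditional branch pairs differ only in bit 5, JMP absolute is $4C) are standard 6502 encoding.

```rust
/// Encode a conditional branch located at `pc` that should reach `target`.
/// In range: 2 bytes (opcode, signed offset).
/// Out of range: 5 bytes (inverted branch over a JMP, then JMP target).
pub fn encode_branch(opcode: u8, pc: u16, target: u16) -> Vec<u8> {
    // Branch offsets are relative to the address after the 2-byte branch.
    let delta = target as i32 - (pc as i32 + 2);
    if (-128..=127).contains(&delta) {
        vec![opcode, delta as i8 as u8]
    } else {
        // BEQ $F0 / BNE $D0, BCS $B0 / BCC $90, BVS $70 / BVC $50, BMI $30 / BPL $10:
        // XOR with $20 flips the condition being tested.
        let inverted = opcode ^ 0x20;
        // Offset 3 skips the 3-byte JMP when the original condition is false.
        vec![inverted, 0x03, 0x4C, (target & 0xFF) as u8, (target >> 8) as u8]
    }
}
```

Because the expanded form is three bytes longer, expanding one branch can push other branches out of range, which is presumably why the expansion is described as adaptive.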

Version History

v2.1 (Current)

  • Added u32 expression support for values > 65535 (e.g., $10000)
  • Added low/high byte operators (< and >)
  • Improved lexer to allow whitespace in operands
  • Enhanced long branch expansion algorithm
  • Better support for real-world memory calculations

v2.0

  • Added .byte, .word, .string, and .incbin directives
  • Improved listing output for data directives
  • Enhanced address mapping support

v1.0

  • Initial release with core 6502 assembly support
  • Expression evaluation and constants
  • Adaptive long-branch expansion
  • Optional listing feature

License

This project is released under The Unlicense. It is free and unencumbered software released into the public domain.
