[CRITICAL] Transpiler generates invalid Rust code - must use RefCell/Mutex not unsafe #132

@noahgift

[BUG] Transpiler generates invalid Rust code for global mutable state

Summary

The Ruchy transpiler generates invalid Rust code when functions access top-level let mut variables. The generated code:

  1. Places let mut declarations inside main() instead of module level
  2. References these variables from separate functions where they're out of scope
  3. Results in rustc compilation errors: cannot find value in this scope

Critical: This bug causes the transpiler to generate Rust code that will NEVER compile, violating the core principle that Ruchy should only emit valid, safe Rust code.

Environment

  • Ruchy Version: v3.193.0 (commit 7534695, trunk)
  • Platform: Linux 6.8.0-85-generic x86_64
  • Debugging Tools: ruchydbg v1.26.0
  • rustc: 1.83.0-nightly

Reproduction Steps

Simple Test Case

File: simple_global.ruchy

let mut counter = 0

fun increment() {
    counter = counter + 1
}

fun main() {
    increment()
    println(counter)
}

Steps to reproduce:

# 1. Create test file
cat > simple_global.ruchy << 'RUCHY'
let mut counter = 0

fun increment() {
    counter = counter + 1
}

fun main() {
    increment()
    println(counter)
}
RUCHY

# 2. Verify interpreter works (expected: outputs "1")
ruchy run simple_global.ruchy

# 3. Transpile to Rust
ruchy transpile simple_global.ruchy -o output.rs

# 4. Attempt to compile (FAILS)
rustc output.rs

Expected output from Step 2 (interpreter): 1 ✅ WORKS

Actual transpiled code (Step 3):

fn increment() {
    unsafe { counter = counter + 1 }  // ❌ ERROR: counter not in scope
}
fn main() {
    unsafe {
        increment();
        println!("{:?}", counter);  // ❌ ERROR: counter not in scope
    }
}

rustc errors (Step 4):

error[E0425]: cannot find value `counter` in this scope
 --> output.rs:2:14
  |
2 |     unsafe { counter = counter + 1 }
  |              ^^^^^^^ not found in this scope

error[E0425]: cannot find value `counter` in this scope
 --> output.rs:2:24
  |
2 |     unsafe { counter = counter + 1 }
  |                        ^^^^^^^ not found in this scope

error[E0425]: cannot find value `counter` in this scope
 --> output.rs:7:26
  |
7 |         println!("{:?}", counter);
  |                          ^^^^^^^ not found in this scope

error: aborting due to 3 previous errors

Complex Test Case (Multiple Functions)

File: complex_global.ruchy

let mut global_state = 0

fun modify_global(value) {
    global_state = value
}

fun get_global() {
    global_state
}

fun main() {
    modify_global(42)
    let result = get_global()
    println("Result:", result)
}

Following the same steps as the simple case produces similar compilation failures: none of the functions can access global_state.
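For comparison, here is a hand-written sketch of what a safe Rust equivalent of complex_global.ruchy could look like, using a thread-local RefCell (one of the safe options proposed later in this issue). It is not transpiler output. The `i32` type, the upper-case static name, and the `"Result: {}"` formatting are assumptions made for illustration.

```rust
use std::cell::RefCell;

thread_local! {
    // Assumed representation of the Ruchy global `global_state`.
    static GLOBAL_STATE: RefCell<i32> = RefCell::new(0);
}

// Overwrites the global with `value`.
fn modify_global(value: i32) {
    GLOBAL_STATE.with(|g| *g.borrow_mut() = value);
}

// Returns a copy of the current global value.
fn get_global() -> i32 {
    GLOBAL_STATE.with(|g| *g.borrow())
}

fn main() {
    modify_global(42);
    let result = get_global();
    // The exact output format of Ruchy's println is assumed here.
    println!("Result: {}", result);
}
```

Every function reaches the value through `GLOBAL_STATE.with`, so no declaration has to be in scope inside main() and no unsafe block is required.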

Debugging Evidence (ruchydbg v1.26.0)

Step 1: Tokenization ✅

$ ruchydbg tokenize simple_global.ruchy
Token Stream:
=============
Token #1: Let
Token #2: Mut
Token #3: Identifier("counter")
Token #4: Equal
Token #5: Integer(0)
...
Total tokens: 30

Result: ✅ All tokens correctly identified

Step 2: Parse Trace ✅

$ ruchydbg trace simple_global.ruchy --analyze
Parser Trace (with root cause analysis):
========================================
✅ Parse successful - no errors detected

Result: ✅ AST built correctly

Step 3: Interpreter Execution ✅

$ ruchydbg run simple_global.ruchy --timeout 5000 --trace
🔍 Running: simple_global.ruchy
⏱️  Timeout: 5000ms
🔍 Type-aware tracing: enabled

TRACE: → main()
TRACE: → increment()
TRACE: ← increment = 1: integer
1
TRACE: ← main = nil: nil

⏱️  Execution time: 2ms
✅ SUCCESS

Result: ✅ Execution successful, output correct, 2ms performance

Step 4: Transpiler ❌

$ ruchy transpile simple_global.ruchy -o output.rs
$ cat output.rs
fn increment() {
    unsafe { counter = counter + 1 }
}
fn main() {
    unsafe {
        increment();
        println!("{:?}", counter);
    }
}

$ rustc output.rs
error[E0425]: cannot find value `counter` in this scope (3 occurrences)

Result: ❌ Generated code does NOT compile

Root Cause Analysis

Problem: The transpiler is NOT generating the required module-level static mut declaration.

What SHOULD be generated:

static mut counter: i32 = 0;  // ❌ MISSING from transpiled output!

fn increment() {
    unsafe { counter = counter + 1 }
}
fn main() {
    unsafe {
        increment();
        println!("{:?}", counter);
    }
}

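This code would compile and run correctly, although rustc warns about the lower-case static name and, on recent toolchains, about the reference to a static mut that println! creates (that reference is rejected by default in the 2024 edition).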

What IS being generated: Code with unsafe blocks but NO variable declarations, causing all references to fail.

Why This Violates Core Principles

1. Ruchy Must NEVER Emit Invalid Code

The Ruchy project's core value proposition is "Python syntax with Rust performance." This means:

  • Interpreter mode: Must execute correctly (WORKS)
  • Bytecode mode: Must execute correctly (WORKS)
  • Transpile mode: Must generate VALID, COMPILABLE Rust code (BROKEN)
  • Compile mode: Must produce working native binaries (BROKEN - uses transpiler)

Current state violates this principle: The transpiler generates Rust code that rustc rejects.

2. Unsafe Code is Unacceptable

Even more critically, THE TRANSPILER SHOULD NEVER GENERATE unsafe CODE.

Rust's unsafe keyword indicates code that:

  • Bypasses Rust's safety guarantees
  • Can cause undefined behavior if misused
  • Requires expert review and validation
  • Is inappropriate for generated code from a high-level language

Why unsafe is wrong here:

  • Safety: Ruchy is a high-level language. Users expect memory safety without unsafe.
  • Audit trail: Every unsafe block in Rust requires careful human review. Generated code should NEVER require this.
  • Rust best practices: Modern Rust avoids unsafe whenever possible. The standard library wraps its internal unsafe code in safe abstractions.
  • User expectations: Python programmers coming to Ruchy expect safety, not manual memory management.

3. Correct Solution: Safe Rust Abstractions

Instead of static mut + unsafe, the transpiler should generate SAFE Rust code using standard abstractions:

Option A: std::cell::RefCell (thread-local):

use std::cell::RefCell;

thread_local! {
    static COUNTER: RefCell<i32> = RefCell::new(0);
}

fn increment() {
    COUNTER.with(|c| *c.borrow_mut() += 1);
}

fn main() {
    increment();
    COUNTER.with(|c| println!("{}", *c.borrow()));
}

Option B: std::sync::Mutex (thread-safe):

use std::sync::Mutex;

// Mutex::new is a const fn since Rust 1.63, so the external lazy_static crate is not needed.
static COUNTER: Mutex<i32> = Mutex::new(0);

fn increment() {
    *COUNTER.lock().unwrap() += 1;
}

fn main() {
    increment();
    println!("{}", *COUNTER.lock().unwrap());
}

Option C: std::sync::RwLock (optimized for reads):

use std::sync::RwLock;

// RwLock::new is a const fn since Rust 1.63, so the external lazy_static crate is not needed.
static COUNTER: RwLock<i32> = RwLock::new(0);

fn increment() {
    *COUNTER.write().unwrap() += 1;
}

fn main() {
    increment();
    println!("{}", *COUNTER.read().unwrap());
}

All three options:

  • ✅ Compile without errors
  • ✅ Use ZERO unsafe code
  • ✅ Follow Rust best practices
  • ✅ Provide memory safety guarantees
  • ✅ Work correctly in single-threaded programs; Options B and C also share one value across threads, whereas Option A gives each thread its own copy
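The difference between the thread-local option and the lock-based options only shows up once a second thread is involved. The sketch below illustrates it. The names `LOCAL_COUNTER`, `SHARED_COUNTER`, `increment_both`, and `run_demo` are made up for this example and are not part of Ruchy.

```rust
use std::cell::RefCell;
use std::sync::Mutex;
use std::thread;

thread_local! {
    // Each thread gets its own independent copy of this value.
    static LOCAL_COUNTER: RefCell<i32> = RefCell::new(0);
}

// A single value shared by every thread.
static SHARED_COUNTER: Mutex<i32> = Mutex::new(0);

fn increment_both() {
    LOCAL_COUNTER.with(|c| *c.borrow_mut() += 1);
    *SHARED_COUNTER.lock().unwrap() += 1;
}

// Increments both counters once on a spawned thread and returns how much
// each counter changed as seen from the calling thread: (local, shared).
fn run_demo() -> (i32, i32) {
    let local_before = LOCAL_COUNTER.with(|c| *c.borrow());
    let shared_before = *SHARED_COUNTER.lock().unwrap();

    thread::spawn(increment_both).join().unwrap();

    let local_after = LOCAL_COUNTER.with(|c| *c.borrow());
    let shared_after = *SHARED_COUNTER.lock().unwrap();
    (local_after - local_before, shared_after - shared_before)
}

fn main() {
    let (local_delta, shared_delta) = run_demo();
    println!("thread-local change: {local_delta}, shared change: {shared_delta}");
}
```

The calling thread does not see the increment made to the spawned thread's copy of the thread-local counter, while it does see the increment to the Mutex-protected one. Which behavior is wanted depends on how Ruchy globals are meant to behave across threads, which this issue does not specify.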

Impact Assessment

Severity: CRITICAL - Transpiler fundamentally broken

Affected Functionality:

  • ruchy transpile - Generates invalid code (blocks transpilation of any program with global mutable state)
  • ruchy compile - Cannot work (uses broken transpiler internally)
  • ❌ Benchmarking: 2/10 execution modes unavailable (transpile, compile)
  • ❌ Any user code with global mutable state CANNOT transpile

Working Functionality:

  • ruchy run (interpreter) - Works perfectly
  • ruchy --vm-mode bytecode run - Works perfectly
  • ✅ All other execution modes unaffected

User Impact:

  • Users cannot transpile ANY code with global mutable state
  • ruchy compile mode is completely unusable for this pattern
  • Benchmarking coverage reduced from 10/10 to 8/10 modes (20% loss)
  • Documentation cannot recommend transpile/compile modes

Proposed Fix

  1. Immediate (Minimal):

    • Generate static mut declarations at module level
    • Continue using unsafe blocks (NOT IDEAL, but unblocks users)
  2. Correct (Recommended):

    • Use RefCell for single-threaded globals
    • Use Mutex/RwLock for thread-safe globals
    • Emit ZERO unsafe code
    • Follow Rust best practices
  3. Long-term:

    • Add compile flag: --thread-safe to choose RefCell vs Mutex
    • Default to RefCell (faster, matches Python semantics)
    • Provide RwLock for read-heavy workloads
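A rough sketch of how the choice between these strategies could be expressed in a code generator follows. Everything in it is hypothetical: `GlobalStorage`, `storage_for`, and `emit_global_decl` are illustrative names, not the Ruchy transpiler's actual API, and a real implementation would build on the transpiler's own AST and type information instead of strings.

```rust
/// Hypothetical storage strategies for a top-level `let mut`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum GlobalStorage {
    ThreadLocal, // thread_local! + RefCell
    Mutex,       // static + std::sync::Mutex
    RwLock,      // static + std::sync::RwLock
}

/// Maps an assumed `--thread-safe` flag onto a storage strategy.
fn storage_for(thread_safe: bool) -> GlobalStorage {
    if thread_safe {
        GlobalStorage::Mutex
    } else {
        GlobalStorage::ThreadLocal
    }
}

/// Emits a module-level declaration for one global.
/// `init` must be a constant expression for the Mutex and RwLock variants.
fn emit_global_decl(name: &str, ty: &str, init: &str, storage: GlobalStorage) -> String {
    // Rust convention: statics are written in upper case.
    let upper = name.to_uppercase();
    match storage {
        GlobalStorage::ThreadLocal => format!(
            "thread_local! {{\n    static {upper}: std::cell::RefCell<{ty}> = std::cell::RefCell::new({init});\n}}\n"
        ),
        GlobalStorage::Mutex => format!(
            "static {upper}: std::sync::Mutex<{ty}> = std::sync::Mutex::new({init});\n"
        ),
        GlobalStorage::RwLock => format!(
            "static {upper}: std::sync::RwLock<{ty}> = std::sync::RwLock::new({init});\n"
        ),
    }
}

fn main() {
    for thread_safe in [false, true] {
        let decl = emit_global_decl("counter", "i32", "0", storage_for(thread_safe));
        println!("// thread_safe = {thread_safe}\n{decl}");
    }
}
```

Reads and writes of the global would need matching rewrites (for example `COUNTER.with(|c| ...)` versus `COUNTER.lock().unwrap()`), which this sketch leaves out.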

Testing Recommendations

Add transpiler tests:

#[test]
fn test_global_mutable_state_single_function() {
    let source = r#"
        let mut counter = 0
        fun main() {
            counter = counter + 1
            println(counter)
        }
    "#;
    
    let transpiled = transpile(source).unwrap();
    assert!(rustc_compile(&transpiled).is_ok(), "Transpiled code must compile");
    
    let output = run_transpiled(&transpiled).unwrap();
    assert_eq!(output.trim(), "1");
}

#[test]
fn test_global_mutable_state_multiple_functions() {
    let source = r#"
        let mut counter = 0
        fun increment() { counter = counter + 1 }
        fun main() {
            increment()
            println(counter)
        }
    "#;
    
    let transpiled = transpile(source).unwrap();
    assert!(rustc_compile(&transpiled).is_ok(), "Transpiled code must compile");
    
    let output = run_transpiled(&transpiled).unwrap();
    assert_eq!(output.trim(), "1");
}

#[test]
fn test_transpiled_code_contains_no_unsafe() {
    let source = r#"
        let mut counter = 0
        fun increment() { counter = counter + 1 }
        fun main() { increment(); println(counter) }
    "#;
    
    let transpiled = transpile(source).unwrap();
    assert!(!transpiled.contains("unsafe"), "Transpiled code must NOT contain unsafe blocks");
}

References

  • Commit with attempted fix: 7534695 [TRANSPILER-SCOPE] Fix top-level let mut variables with functions
  • Verification: Used ruchydbg v1.26.0 for comprehensive analysis
  • Related: PARSER-079 (labeled loops) - FIXED in v3.193.0 ✅

Bottom Line

The transpiler generates INVALID RUST CODE that will never compile. This violates Ruchy's core mission and blocks critical functionality.

The fix must:

  1. Generate code that compiles without errors
  2. Use ZERO unsafe code (use RefCell/Mutex instead)
  3. Follow Rust best practices and safety guarantees
  4. Pass the test suite above

Priority: CRITICAL - This blocks transpile/compile modes for any program that uses global mutable state.
