Skip to content

Codegen fixes#1

Merged
jgarzik merged 4 commits intomasterfrom
updates
Jan 14, 2026
Merged

Codegen fixes#1
jgarzik merged 4 commits intomasterfrom
updates

Conversation

@jgarzik
Copy link
Owner

@jgarzik jgarzik commented Jan 14, 2026

No description provided.

jgarzik and others added 2 commits January 13, 2026 20:54
- Add CLAUDE.md with build commands and architecture overview
- Rename basic_compiler_design_spec.md to design.md
- Update README.md link to design.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix stack misalignment in _rt_instr: add sub/add rsp,8 around
  memcmp calls (6 pushes left rsp%16=8, now properly aligned)

- Fix stack misalignment in _rt_strcat: replace push/pop rax with
  16-byte aligned temp storage for saving malloc result across
  memcpy calls

- Fix function return type handling: use type-aware load instead
  of always using movsd xmm0 (now correctly handles Integer, Long,
  Single, Double, and String return types)

- Fix procedure stack frame sizing: replace hardcoded 64-byte
  allocation with dynamic sizing using STACK_RESERVE_PROC placeholder

- Remove macOS Homebrew job from CI workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jgarzik jgarzik requested a review from Copilot January 14, 2026 02:21
@jgarzik jgarzik self-assigned this Jan 14, 2026
@jgarzik jgarzik added the bug Something isn't working label Jan 14, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements stack alignment fixes for the runtime library and adds dynamic stack size calculation with type-aware function return value handling in the code generator.

Changes:

  • Fixed stack alignment in runtime assembly functions (_rt_instr and _rt_strcat) to comply with System V AMD64 ABI 16-byte alignment requirements
  • Replaced fixed 64-byte stack allocation with dynamic sizing that's calculated after processing function body and patched into the prologue
  • Added type-specific return value loading for INTEGER, LONG, SINGLE, DOUBLE, and STRING function return types

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/runtime/string.s Added stack alignment adjustments in _rt_instr and _rt_strcat; replaced push/pop with aligned memory operations
src/codegen.rs Implemented dynamic stack size calculation with placeholder patching; added type-aware function return value handling
design.md Added comprehensive design specification document (new file)
README.md Updated design specification link to point to design.md
CLAUDE.md Added Claude Code guidance document (new file)
.github/workflows/TestingCI.yml Removed macOS homebrew CI job
Comments suppressed due to low confidence (1)

src/runtime/string.s:257

  • Out-of-bounds read in _rt_instr due to unchecked start position. The _rt_instr runtime function takes a 1-based start index in r8 (moved to rbx), decrements it, and then adjusts the haystack pointer and remaining length with dec rbx; add r12, rbx; sub r13, rbx without validating that start is within [1, haystack_len]. The code generator for INSTR (src/codegen.rs) passes the user-supplied start argument directly (converted from integer/float) into r8 without clamping, so a BASIC program can supply start <= 0 or start > LEN(haystack), causing r12 to point before or beyond the actual string buffer while r13 underflows to a very large value. The subsequent loop uses memcmp(r12, needle, needle_len) and advances r12 one byte at a time while r13 remains large (cmp r13, r15; jb .Linstr_not_found), so memcmp will read outside the haystack buffer into adjacent memory, invoking undefined behavior, with realistic consequences including reading other in-process data (information disclosure) or segfaulting. This is especially dangerous if the runtime is ever used to execute untrusted BASIC code in a long-lived process (e.g., as an embedded scripting engine), since a malicious script can deliberately choose out-of-range start values to scan memory beyond its logical string bounds via repeated equality tests, potentially exposing sensitive data or destabilizing the process.

Severity: HIGH. Confidence: 9

    add r12, rbx            # advance haystack ptr

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

jgarzik and others added 2 commits January 13, 2026 21:36
Single-pass solution that tracks DIM declarations and emits
Expr::ArrayAccess for known arrays vs Expr::FnCall for functions.

Removes dead_code annotation from ArrayAccess variant as it's now
properly used. Arrays are distinguished from functions at parse time
by checking if the identifier was previously declared with DIM.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The documentation said [rbp - offset] but the code consistently uses
[rbp + offset] where offset is negative (e.g., -8, -16). Fixed the
comment to match the actual implementation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jgarzik jgarzik merged commit 90cde05 into master Jan 14, 2026
2 checks passed
@jgarzik jgarzik deleted the updates branch January 14, 2026 02:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant