Skip to content

JIT Compiler #34

@pjcavalcanti

Description

@pjcavalcanti

Add a compiled mode that generates C code for defs, builds a single shared
library, loads it, and uses compiled defs when reducing REF in WNF.
Use the -c flag to indicate compiled mode. As in `hvm4 main.hvm4 -c".

The compiled C code should perform the WNF of applying the def to the arguments,
so its use in hvm4's WNF would be consistent with its lazy evaluation.

1) High-level Flow

  1. Parse HVM4 file and build BOOK as usual.
  2. Generate one C file containing:
    • an ABI header
    • one C function per def
    • a registration function
  3. Compile the C file to a .so.
  4. dlopen the .so and register compiled defs into a function table.
  5. In WNF REF, if compiled def exists and its args are saturated (all present), call it;
    otherwise fall back to the normal ALO path.

2) Runtime Integration Points (Where It Plugs In)

  • REF dispatch: clang/wnf/_.c case REF: is the entry point.
  • Book defs: still stored in BOOK for fallback ALO.
  • Runtime ABI: expose the full surface needed by codegen (see section 6).

3) Compiled Def Interface (Draft)

typedef Term (*DefFun)(Term *args, u32 argc);
static DefFun DEF_FUN[BOOK_CAP];
static u32    DEF_ARI[BOOK_CAP]; // number of leading LAMs

where:

  • argc is the number of arguments in args.
  • HVM4 does not encode arity in REF, this should be counted from the leading Lams in the definition body; arguments live in APP frames.
  • DEF_ARI[id] = number of leading LAMs in the static book term for def id.

4) WNF REF Branch (Draft)

Pseudocode:

case REF:
  fid = term_ext(next)
  if (DEF_FUN[fid] exists) and (stack has >= DEF_ARI[fid] APP frames):
    read args from APP frames (right-to-left)
    pop frames
    next = DEF_FUN[fid](args, argc)
    goto enter
  else:
    current ALO fallback

This is a saturation check, not a “compiled vs interpreted” branch.
So, the fallback case lets the program continue when the function is
only partially applied (i.e. not all of its args are present).
Fallback: if not saturated or no compiled def, use the existing ALO path
unchanged (partial applications behave as today).

5) Generated C File Contents

Single translation unit containing:

  • typedef for the ABI (see below)
  • all compiled def functions def_<id>(...)
  • void hvm4_register(const Hvm4Api *api, DefFun *out, u32 *ari)
    • stores API pointer
    • installs function pointers and arities
    • the compiled C functions call the hvm4 functions as API->heap_alloc etc

6) Runtime ABI (Full; for Compiled Reductions)

The ABI must expose everything codegen may call when it performs real
reductions. To avoid omissions, define the ABI as “all functions in these
runtime modules” (file path = function name):

  • Heap + substitution: all functions in clang/heap/*.c
    • includes heap_alloc/read/set/take, heap_subst_var, heap_subst_cop
  • Term core + constructors: all functions in clang/term/*.c and
    clang/term/new/*.c
    • includes term_new, term_tag/ext/val/arity, term_sub_get/set,
      and all term_new_* constructors
  • WNF entrypoints: wnf, wnf_pri, wnf_uns
  • Interaction helpers: all wnf_* helpers in clang/wnf/*.c
    • by naming rule, each clang/wnf/foo_bar.c exports wnf_foo_bar
  • Counters/labels: add exported helpers for interaction counting and
    fresh dup labels (e.g., itrs_add(n), fresh())
  • Constants header: tags/op codes/bit masks shared between runtime and
    generated C

Implementation note: because compiled code is in a .so, it must operate on
the main runtime’s globals. That is why we export the runtime surface instead
of embedding hvm4.c in the .so.

Small ABI sample (bootstrap only; not enough for compiled reductions):

typedef struct {
  u64  (*heap_alloc)(u64 size);
  Term (*heap_read)(u32 loc);
  void (*heap_set)(u32 loc, Term t);
  Term (*term_new)(u8 tag, u32 ext, u32 val);
  Term (*term_new_num)(u32 n);
  Term (*term_new_op2)(u32 opr, Term x, Term y);
  u8   (*term_tag)(Term t);
  u32  (*term_val)(Term t);
  u32  (*term_ext)(Term t);
  Term (*wnf)(Term t);
  Term (*wnf_pri)(Term t);
  Term (*wnf_uns)(Term t);
  Term (*wnf_app_lam)(Term lam, Term arg);
} Hvm4Api;

7) Codegen Strategy (Initial)

Start small and correct:

  • Emit C that constructs the def body with args substituted.
  • Only a few core nodes at first (NUM, APP, LAM, OP2, REF, DUP, SUP).
  • No “fast path” optimizations yet; ensure semantic parity.
  • Compiled defs return WNF, not full normalization.

Example (compiled @add = λa. λb. (a + b)):

Term add_f(Term *args, u32 argc) {
  Term a = args[0];
  Term b = args[1];
  if (term_tag(a) == NUM && term_tag(b) == NUM) {
    return term_new_num(term_val(a) + term_val(b));
  }
  return term_new_op2(OP_ADD, a, b);
}

Later:

  • Inline numeric matches and OP2.
  • Tail-call loops for self-recursive defs.
  • Skip node construction when possible.

8) Build/Load Pipeline (Draft)

  • Generate .build/<file>.c.
  • Compile: cc -O2 -fPIC -shared <file>.c -o <file>.so
  • dlopen + dlsym("hvm4_register") + call it.
  • Store DEF_FUN/DEF_ARI in runtime globals.

9) Higher-Order Functions (Current Approach)

Arity will be derived from leading LAMs. That means compiled defs only fire
when saturated. For higher-order results, we can manually structure code to
have the intended arity by using a let to return a lambda. Example:

@f = λa. !v = λb. c; v

This makes @f have arity 1 (one leading LAM), and the compiled def can return
the lambda as a normal term. Later, a small non-looping (non derefing) normalization before
codegen could eliminate the extra app/let overhead if we decide to go that far.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions