LLMCC: LLM-Native Compiler Toolchain

A concrete implementation of the "LLM ≈ probabilistic compiler" concept, featuring spec-first artifacts, constrained decoding, and deterministic verification.

Design Philosophy

LLMCC treats LLMs as probabilistic compilers by providing:

Spec-first artifacts - JSON/YAML + EBNF/JSON-Schema specifications
Tight output contracts - Grammars and types for constrained decoding
Dense examples - Positive/negative pairs near API surface
Stable handles - Names, layouts, and error codes that never drift
Deterministic gates - Property tests + oracles for output validation

Quick Start

# Install dependencies
npm install

# Copy environment template and configure your API key
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

# Compile with original implementation
OPENAI_API_KEY=sk-your-key npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1

# Compile with ax framework (recommended)
OPENAI_API_KEY=sk-your-key npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1 --use-ax

# Use different models with ax (supports OpenAI, Anthropic, Google)
OPENAI_API_KEY=sk-your-key npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1 --use-ax --model gpt-4o

# Test against examples
npx ts-node tools/llmcc.ts test src/example.ts

# Generate specification hash for versioning
npx ts-node tools/llmcc.ts spec-hash src/example.ts --fn slugifyTitle

# Run property tests
npm test

# Run full demo
make demo

Ax Framework Integration ⚡

LLMCC now supports the ax framework for enhanced LLM operations:

Benefits

Multi-provider support: Easy switching between OpenAI, Anthropic, Google
Declarative programming: Simplified LLM interaction patterns
Enhanced reliability: Better error handling and fallbacks
Future-ready: Built-in support for streaming, optimization, agents

Usage

# Use ax framework (add --use-ax flag)
npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1 --use-ax

# Works with all existing options
npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1 --use-ax --model gpt-4o --temperature 0.1

Both implementations are meant to produce equivalent results, but ax provides a more robust and extensible foundation.

📚 See docs/ax-integration.md for detailed comparison and examples

Contract Annotations

Functions are annotated with structured contract blocks that LLMs can parse:

/**
 * @llm.contract v1
 * name: slugifyTitle
 * intent: Convert an arbitrary title into a web-safe slug.
 * input_schema: ref://contracts/slugify.input.schema.json
 * output_schema: ref://contracts/slugify.output.schema.json
 * invariants:
 *  - output.length <= 80
 *  - /^[a-z0-9-]+$/.test(output)
 * error_codes:
 *  - SLUG_TOO_LONG
 *  - SLUG_INVALID_CHAR
 */
export function slugifyTitle(input: string): string {
  // Implementation with proper error handling
}
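
The annotation format above is regular enough to extract with a small parser. The following is a minimal sketch, assuming a regex-based approach (Future Work mentions replacing regex with AST parsing); `LlmContract` and `parseContract` are hypothetical names, not the repository's API.

```typescript
// Hypothetical shape for a parsed contract; fields mirror the annotation keys.
interface LlmContract {
  version: string;
  name: string;
  intent: string;
  inputSchema: string;
  outputSchema: string;
  invariants: string[];
  errorCodes: string[];
}

// Find the first JSDoc comment containing @llm.contract and parse its keys.
function parseContract(source: string): LlmContract | null {
  const comments = source.match(/\/\*\*[\s\S]*?\*\//g) ?? [];
  const comment = comments.find((c) => c.includes("@llm.contract"));
  if (!comment) return null;

  // Strip the comment decoration ("/**", " * ", " */") from every line.
  const lines = comment
    .split("\n")
    .map((l) => l.replace(/^\s*\/?\*+\/?/, "").trim())
    .filter((l) => l.length > 0);

  const contract: LlmContract = {
    version: "", name: "", intent: "",
    inputSchema: "", outputSchema: "",
    invariants: [], errorCodes: [],
  };
  let currentList: string[] | null = null;

  for (const line of lines) {
    const header = line.match(/^@llm\.contract\s+(\S+)/);
    if (header) { contract.version = header[1]; continue; }
    // "- item" lines belong to the most recent list key.
    if (line.startsWith("- ")) { currentList?.push(line.slice(2).trim()); continue; }
    const kv = line.match(/^([a-z_]+):\s*(.*)$/);
    if (!kv) continue;
    const [, key, value] = kv;
    currentList = null;
    switch (key) {
      case "name": contract.name = value; break;
      case "intent": contract.intent = value; break;
      case "input_schema": contract.inputSchema = value; break;
      case "output_schema": contract.outputSchema = value; break;
      case "invariants": currentList = contract.invariants; break;
      case "error_codes": currentList = contract.errorCodes; break;
    }
  }
  return contract;
}
```

Scanning whole comments first, rather than matching from the first `/**` to the marker, avoids accidentally spanning an unrelated JSDoc block that precedes the contract.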

Architecture

Pipeline

spec-collect → Gather contract blocks + schemas + examples
plan → Model proposes structured JSON plan
synthesize → Constrained decoding to types/grammar
verify → Type-check + JSON-Schema + property tests
repair → Limited repair attempts using repair playbook
admit → Commit code with metadata logging
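
The stages above can be sketched as a small driver with a bounded verify/repair loop. This is an illustration only: `PipelineStages`, `runPipeline`, and the stage signatures are assumptions made for the example, and the real driver in tools/llmcc.ts may be organized differently.

```typescript
// Hypothetical stage signatures; names are illustrative only.
interface VerifyResult {
  ok: boolean;
  errors: string[];
}

interface PipelineStages<Spec, Plan, Out> {
  collect: () => Promise<Spec>;                        // spec-collect
  plan: (spec: Spec) => Promise<Plan>;                 // plan
  synthesize: (spec: Spec, plan: Plan) => Promise<Out>; // synthesize
  verify: (spec: Spec, out: Out) => VerifyResult;      // verify
  repair: (out: Out, errors: string[]) => Out;         // repair
}

interface Admitted<Out> {
  output: Out;
  repairs: number;
}

async function runPipeline<Spec, Plan, Out>(
  stages: PipelineStages<Spec, Plan, Out>,
  maxRepairs = 3,
): Promise<Admitted<Out>> {
  const spec = await stages.collect();
  const plan = await stages.plan(spec);
  let output = await stages.synthesize(spec, plan);

  // Verify, then repair at most maxRepairs times before giving up.
  let repairs = 0;
  let result = stages.verify(spec, output);
  while (!result.ok && repairs < maxRepairs) {
    output = stages.repair(output, result.errors);
    repairs += 1;
    result = stages.verify(spec, output);
  }
  if (!result.ok) {
    throw new Error(`verification failed after ${repairs} repairs: ${result.errors.join(", ")}`);
  }
  // admit: only verified output is returned, along with the repair count.
  return { output, repairs };
}
```

Keeping `verify` and `repair` synchronous and deterministic means the only nondeterministic steps are the model calls in `plan` and `synthesize`.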

Directory Structure

llmcc/
├── contracts/           # JSON schemas and EBNF grammars
│   ├── slugify.input.schema.json
│   ├── slugify.output.schema.json
│   ├── slugify.repairs.yaml
│   └── query.ebnf
├── examples/           # Positive/negative test cases
│   └── slugify.examples.jsonl
├── src/               # Implementation with contract annotations
│   └── example.ts
├── tools/             # Compiler toolchain
│   ├── llmcc.ts      # CLI driver
│   ├── validators.ts  # Schema validation
│   └── decode.ts     # Constrained decoding
├── tests/             # Property tests
│   └── slugify.property.test.ts
└── Makefile           # Build targets

Key Features

Constrained Decoding

const result = await constrainedDecode(prompt, {
  outputSchemaPath: "contracts/slugify.output.schema.json",
  maxRepairs: 3,
  temperature: 0.2,
  contractMetadata: contract
});

Automatic Repair

The system attempts bounded repairs when validation fails:

Schema violations → Pattern-based fixes
Invariant violations → Property-preserving transformations
Length violations → Smart truncation
Character violations → Safe substitution
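
As a rough illustration of the last two repair kinds, the sketch below applies safe substitution and word-boundary truncation to a slug candidate. The function names are hypothetical, and the actual rules live in contracts/slugify.repairs.yaml, whose contents are not shown here.

```typescript
const MAX_SLUG_LENGTH = 80; // from the contract invariant output.length <= 80

// Character violations: replace anything outside [a-z0-9-] with a hyphen,
// collapse hyphen runs, and trim hyphens at both ends.
function repairCharacters(candidate: string): string {
  return candidate
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "");
}

// Length violations: truncate at the last hyphen that fits, so a word is
// never cut in half unless the slug has no hyphen at all.
function repairLength(candidate: string, max: number = MAX_SLUG_LENGTH): string {
  if (candidate.length <= max) return candidate;
  const cut = candidate.slice(0, max);
  const atBoundary = candidate.charAt(max) === "-";
  const lastHyphen = cut.lastIndexOf("-");
  const trimmed = atBoundary || lastHyphen <= 0 ? cut : cut.slice(0, lastHyphen);
  return trimmed.replace(/-+$/, "");
}
```

Both repairs are pure functions, so applying them a second time to their own output changes nothing, which keeps the repair loop predictable.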

Metadata Tracking

Every compilation produces traceable metadata:

{
  "artifact": "slugifyTitle",
  "spec": "v1", 
  "spec_hash": "c66a00e5",
  "model": "gpt-4",
  "decode": {"mode": "json+grammar", "temperature": 0.2},
  "verify": {"schema_pass": true, "tests_pass": true},
  "repairs": 0,
  "latency_ms": 12
}
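
How spec_hash is computed is not described here. One plausible approach, shown purely as an assumption, is to hash a canonical serialization of the contract and keep a short prefix (8 hex characters, the same length as the value above). `canonicalize` and `specHash` are hypothetical helpers.

```typescript
import { createHash } from "node:crypto";

// Serialize JSON-compatible data with sorted object keys so that key order
// does not affect the hash.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Assumption: SHA-256 over the canonical form, truncated to `length` hex chars.
function specHash(contract: unknown, length = 8): string {
  return createHash("sha256").update(canonicalize(contract)).digest("hex").slice(0, length);
}
```

With a scheme like this, any change to an invariant, schema reference, or error code yields a different hash, while reordering keys does not.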

Examples

Compilation

$ OPENAI_API_KEY=sk-... npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1
🔧 Compiling slugifyTitle from src/example.ts...
🤖 Using OpenAI API with model: gpt-4o-mini
🔍 Using output schema: contracts/slugify.output.schema.json
✅ Compilation completed in 847ms
🔧 Model: gpt-4o-mini (temp: 0.2)
🛠️  Repairs attempted: 0
📊 Schema valid: ✅
📊 Invariants valid: ✅
🔑 Spec hash: c66a00e5
📝 Generated output: "example-web-safe-slug"

Property Testing

$ npm test
✓ tests/slugify.property.test.ts  (13 tests) 8ms
  ✓ length and charset invariant
  ✓ idempotency
  ✓ empty and whitespace handling
  ✓ deterministic output
  ✓ unicode normalization
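
To make the idea behind these properties concrete, here is a dependency-free sketch of two of them (the length/charset invariant and idempotency). It does not reproduce tests/slugify.property.test.ts; `toSlug` is a stand-in for the function under test, and `lcg`, `forAll`, and `randomTitle` are hypothetical helpers.

```typescript
// Small deterministic PRNG (linear congruential) so failures are reproducible.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 0x100000000;
  };
}

// Run `prop` on generated values; return the first counterexample, or null.
function forAll<T>(
  gen: (rand: () => number) => T,
  prop: (value: T) => boolean,
  runs = 200,
  seed = 42,
): T | null {
  const rand = lcg(seed);
  for (let i = 0; i < runs; i++) {
    const value = gen(rand);
    if (!prop(value)) return value;
  }
  return null;
}

// Generator: titles mixing letters, digits, punctuation, and non-ASCII text.
const ALPHABET = "abcXYZ019 -_&!é日本";
function randomTitle(rand: () => number): string {
  const length = Math.floor(rand() * 120);
  let out = "";
  for (let i = 0; i < length; i++) {
    out += ALPHABET.charAt(Math.floor(rand() * ALPHABET.length));
  }
  return out;
}

// Stand-in for the function under test (not the repository's slugifyTitle).
function toSlug(input: string): string {
  const slug = input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, 80)
    .replace(/-+$/, "");
  if (slug.length === 0) throw new Error("empty slug");
  return slug;
}

const slugOrNull = (s: string): string | null => {
  try { return toSlug(s); } catch { return null; }
};

// Property 1: any produced slug satisfies the contract invariants.
const invariantHolds = (s: string): boolean => {
  const out = slugOrNull(s);
  return out === null || (out.length <= 80 && /^[a-z0-9-]+$/.test(out));
};

// Property 2: slugifying a slug leaves it unchanged.
const idempotent = (s: string): boolean => {
  const out = slugOrNull(s);
  return out === null || toSlug(out) === out;
};
```

A call such as `forAll(randomTitle, invariantHolds)` returning `null` means no counterexample was found in the sampled inputs, which is evidence rather than proof.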

Example Validation

$ npx ts-node tools/llmcc.ts test src/example.ts
🧪 Testing examples for src/example.ts...
✅ "Hello World" → "hello-world"
✅ "JavaScript & TypeScript" → "javascript-typescript" 
✅ "" correctly failed: empty input
📊 Test Results: 12/14 passed
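
A runner for positive/negative example files might look like the sketch below. The JSONL field names (`input`, `expected`, `expectError`) are assumptions made for illustration; the actual layout of examples/slugify.examples.jsonl is not shown here.

```typescript
// Hypothetical example record: either an expected output or an expected failure.
interface ExampleCase {
  input: string;
  expected?: string;
  expectError?: boolean;
}

interface ExampleReport {
  passed: number;
  total: number;
  failures: string[];
}

// Parse one JSON object per non-empty line and check each case against `fn`.
function runExamples(jsonl: string, fn: (input: string) => string): ExampleReport {
  const cases = jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ExampleCase);

  const failures: string[] = [];
  for (const c of cases) {
    let actual: string | null = null;
    let threw = false;
    try {
      actual = fn(c.input);
    } catch {
      threw = true;
    }
    // Negative examples pass only if fn throws; positive ones need an exact match.
    const ok = c.expectError ? threw : !threw && actual === c.expected;
    if (!ok) {
      failures.push(`${JSON.stringify(c.input)} -> ${threw ? "error" : JSON.stringify(actual)}`);
    }
  }
  return { passed: cases.length - failures.length, total: cases.length, failures };
}
```

Treating "throws" as the success condition for negative examples is what lets a line like the empty-input case above count as a pass.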

OpenAI Integration

LLMCC now includes full OpenAI API integration with structured outputs and JSON schema validation.

Setup

Get OpenAI API Key: Sign up at https://platform.openai.com/ and create an API key

Configure Environment:

cp .env.example .env
# Edit .env and set OPENAI_API_KEY=sk-your-key-here

Run with Real Models:

npx ts-node tools/llmcc.ts compile src/example.ts --fn slugifyTitle --spec v1

Structured Outputs

The system uses OpenAI's Structured Outputs feature with response_format: json_schema and strict: true:

const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'Generate output that adheres to the JSON schema...' },
    { role: 'user', content: prompt }
  ],
  temperature: 0.2,
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'synthesized_output',
      strict: true,
      schema: {
        type: 'object',
        properties: { result: outputSchema },
        required: ['result'],
        additionalProperties: false
      }
    }
  }
});
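
Because the schema wraps the payload in a `result` property, the response has to be unwrapped before validation. The sketch below shows one way to do that; `unwrapResult` and `ChatCompletionLike` are hypothetical names, and the structural type covers only the fields read here.

```typescript
// Minimal structural type for the part of a chat completion that is read.
interface ChatCompletionLike {
  choices: Array<{
    message: { content: string | null; refusal?: string | null };
  }>;
}

// Pull the `result` value out of a structured-output response, failing loudly
// on refusals, empty content, or a payload without the wrapper property.
function unwrapResult(completion: ChatCompletionLike): unknown {
  const message = completion.choices[0]?.message;
  if (!message) throw new Error("completion contained no choices");
  if (message.refusal) throw new Error(`model refused: ${message.refusal}`);
  if (message.content === null) throw new Error("completion had no content");

  const parsed: unknown = JSON.parse(message.content);
  if (typeof parsed !== "object" || parsed === null || !("result" in parsed)) {
    throw new Error("response is missing the result property");
  }
  return (parsed as { result: unknown }).result;
}
```

The unwrapped value would still go through the local schema and invariant checks, since strict mode constrains the shape of the JSON but not every contract invariant.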

Model Support

gpt-4o-mini (default) - Fast, cost-effective, supports structured outputs
gpt-4o - More powerful, higher cost
gpt-4-turbo - Legacy model with good performance (predates Structured Outputs; check json_schema support)
gpt-3.5-turbo - Budget option (predates Structured Outputs; check json_schema support)

Cost Optimization

Uses gpt-4o-mini by default (20x cheaper than GPT-4)
Low temperature (0.2) for more consistent outputs (sampling is still not fully deterministic)
Efficient prompting with structured JSON schema
Bounded repair attempts (max 3 by default, adjustable with --max-repairs) to limit API calls

Add Custom Contracts

Create function with @llm.contract annotation
Add corresponding JSON schemas in contracts/
Create examples in examples/
Add property tests in tests/

Command Line Options

# Basic usage
llmcc compile <file> --fn <function-name> --spec <version>

# With API key
llmcc --api-key sk-your-key compile src/example.ts --fn slugifyTitle --spec v1

# Different models
llmcc compile src/example.ts --fn slugifyTitle --spec v1 --model gpt-4o
llmcc compile src/example.ts --fn slugifyTitle --spec v1 --model gpt-4o-mini

# Adjust temperature and repairs
llmcc compile src/example.ts --fn slugifyTitle --spec v1 --temperature 0.1 --max-repairs 5

# Use ax framework
llmcc compile src/example.ts --fn slugifyTitle --spec v1 --use-ax

# Verbose output
llmcc compile src/example.ts --fn slugifyTitle --spec v1 --verbose

# Test examples
llmcc test src/example.ts
llmcc test src/example.ts --examples custom/examples.jsonl

# Generate spec hash
llmcc spec-hash src/example.ts --fn slugifyTitle

CI Integration

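- name: Setup OpenAI API Key
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  # A step needs a run or uses key; writing to GITHUB_ENV exposes the key to later steps
  run: echo "OPENAI_API_KEY=$OPENAI_API_KEY" >> "$GITHUB_ENV"

- name: Validate Contracts
  run: make validate

- name: Test Examples
  run: make test-examples

- name: Check Spec Drift
  run: make spec-diff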

Benefits

Reproducible - Spec hashes track contract changes
Testable - Property tests + example validation
Repairable - Bounded repair attempts for common errors
Traceable - Full metadata logging for debugging
Scalable - Contract-first development for teams

Documentation

📚 Demo Guide - Complete walkthrough with examples
⚡ Ax Integration (docs/ax-integration.md) - Enhanced LLM operations guide

Future Work

AST-based contract parsing (replace regex)
Grammar-constrained generation (EBNF → regex)
Multi-turn repair conversations
Formal verification integration
Cross-language contract support
Enhanced ax features (streaming, agents, RAG)

License

MIT License - See LICENSE file for details.
