Curlee is an experimental verification-first programming language and C++23 compiler/runtime.
Curlee is a safety harness for AI-generated (and human-written) code: it refuses to run a program unless it can prove your declared contracts within a small, decidable verification scope.
Modern LLMs can generate a lot of code quickly - but a common failure mode is "almost correct" logic that compiles, runs, and silently does the wrong thing.
Curlee's goal is to be a safety harness:
- You write intent as machine-checkable contracts (
requires/ensures) and refinements (where). - The compiler uses an SMT solver (Z3) to prove obligations.
- If an obligation can't be proven (or the contract is outside the supported logic), Curlee fails the build.
This shifts trust from "I hope the generated code is safe" to "I have a proof (or the program doesn't run)".
AI-generated code is often:
- syntactically valid,
- type-correct,
- but logically wrong in edge cases.
Curlee introduces a new default:
No proof, no run.
Curlee aims to support a world where agents exchange tasks safely.
- An agent can send another agent a bundle (bytecode + metadata + declared capabilities).
- The receiver re-verifies the bundle deterministically before executing.
- Execution is capability-scoped (no ambient authority) and resource-bounded (fuel/gas).
| Theme | Python/JS baseline | Curlee target |
|---|---|---|
| Correctness | Tests + review + runtime errors | Compile-time contract proofs |
| Security | Ambient authority + sandboxing | Capabilities + proofs + fuel |
| AI-generated code | "Probably ok" | "Prove it or reject it" |
| Interop | Big ecosystems | "Shield" legacy ecosystems via explicit unsafe boundaries |
Curlee is structured as a compiler toolchain.
flowchart LR
S[SourceFile] --> L[Lexer]
L --> P[Parser]
P --> R[Resolver]
R --> T[Type Checker]
T --> V[Verifier Z3]
V -->|only after verification succeeds| C[Bytecode Compiler]
C --> M[Deterministic VM fuel bounded]
Example (intended syntax):
fn divide(numerator: Int, denominator: Int) -> Int
[ requires denominator != 0;
ensures result * denominator == numerator; ]
{
return numerator / denominator;
}
The compiler checks obligations like:
- At call sites: prove the callee's
requiresfrom the caller's facts. - At returns: prove the function's
ensures.
The MVP logic fragment is intentionally small and decidable.
This repository is early-stage (alpha research prototype).
Expectations:
- The language and bytecode are not stable yet.
- Diagnostics, CLI output, and tests are expected to evolve.
- Verification is intentionally limited to a small fragment; out-of-scope contracts are rejected.
- If Curlee cannot prove a contract, it will not run the program.
Curlee currently supports two useful workflows:
- MVP-check:
curlee check <file.curlee>runs lex -> parse -> resolve -> type-check -> verify (Z3). If a proof obligation can't be discharged (or is out of scope), Curlee fails with a diagnostic. - MVP-run:
curlee run <file.curlee>(orcurlee <file.curlee>) runscheckfirst, then executes a small verified subset on the deterministic VM (fuel-bounded).
The runnable subset is intentionally small:
- Expressions:
Int/Boolliterals, names,+, grouping. - Statements:
let,return,if/else,while. - Calls: simple no-arg calls to a named function.
Out of scope (for now): strings, general unary/binary ops, function parameters, modules/import execution.
User-facing documentation lives in the GitHub wiki:
- https://github.com/w4ffl35/curlee/wiki
- Supported fragment + stability: https://github.com/w4ffl35/curlee/wiki/Stability-and-Supported-Fragment
tests/correct_samples/is the small, deterministic corpus of verified samples used by tests.training_data.txtis a generated export for downstream RAG/training workflows and is intentionally gitignored.- Regenerate via
python3 scripts/generate_correct_samples.py(writes bothtests/correct_samples/andtraining_data.txt).
- Regenerate via
sudo apt-get update
sudo apt-get install -y cmake ninja-build g++ libz3-dev pkg-configBy default, Curlee uses the system Z3 if available. To force the vendored build:
cmake --preset linux-debug -DCURLEE_USE_SYSTEM_Z3=OFFcmake --preset linux-debugcmake --build --preset linux-debug./build/linux-debug/curlee --help
./build/linux-debug/curlee check examples/mvp_run_int.curlee
./build/linux-debug/curlee run examples/mvp_run_control_flow.curleeFor a quick end-to-end confidence loop (build + basic CLI + proof fixtures + a small targeted test run):
bash scripts/smoke.shYou can also run both debug + release presets:
bash scripts/smoke.sh --bothTo generate a coverage report from unit tests, Curlee provides a coverage preset + helper script.
Dependencies (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install -y gcovrRun:
bash scripts/coverage.shThis will:
- Configure/build/test with the
linux-debug-coveragepreset. - Generate an HTML report at
build/coverage/coverage.html. - Fail the run if line coverage is below the threshold (default: 100%).
Note: the gcovr report excludes throw and unreachable branches by default (so branch coverage isn't dominated by exception edges). You can opt back in with:
bash scripts/coverage.sh --include-throw-branches
bash scripts/coverage.sh --include-unreachable-branchesAdjust threshold or disable failing:
bash scripts/coverage.sh --fail-under 95
bash scripts/coverage.sh --no-failCreate hello.curlee:
fn main() -> Int {
return 1 + 2;
}
Then:
./build/linux-debug/curlee check hello.curlee
./build/linux-debug/curlee hello.curleeThese fixtures are in the repo and should produce a diagnostic:
./build/linux-debug/curlee check tests/fixtures/check_requires_divide.curlee
./build/linux-debug/curlee check tests/fixtures/check_ensures_fail.curleecurlee run is verification-gated, so this should fail with the same diagnostic:
./build/linux-debug/curlee run tests/fixtures/check_ensures_fail.curleeMIT. See LICENSE.
- Curlee is verification-first: unsupported constructs must produce clear errors (no guessing).
- Keep changes small and test-driven.
- Prefer golden tests for diagnostics and verification failures.
In this repo, gh pr edit may fail due to a GraphQL error involving deprecated classic project cards.
Workaround: patch the PR body via the REST API using:
scripts/gh_pr_patch_body.sh <pr-number> <body-file>For agent guidance, see .github/copilot-instructions.md.