[codex] unify cache capacity config by easel · Pull Request #381 · Luce-Org/lucebox-hub

easel · 2026-06-14T14:03:24Z

Summary

This is the org-visible stacked PR for the unified cache config model. It depends on #380.

Because the #380 head branch currently lives in the easel fork and this account cannot push branches to Luce-Org/lucebox-hub, this PR is temporarily opened against main rather than against the #380 branch. That makes it visible to the team, but the GitHub compare shows the #380 changes plus the unified-cache commit. Once #380 lands, this PR should be rebased onto (or retargeted at) the updated main so the visible diff collapses to the unified-cache work only.

What changed

Replaces cache slot sizing with byte-sized RAM/disk budget flags for prefix and prefill caches.
Keeps legacy slot flags as compatibility aliases.
Adds disk-backed exact prefill cache support alongside existing prefix disk cache support.
Exposes unified cache budget/usage telemetry in /props.
Updates docs, OpenAPI props, entrypoint env vars, scripts, and cache proof tests.
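
To illustrate the move from slot counts to byte-sized budgets, here is a minimal sketch of turning values such as 1GiB or 256MiB (the form used for the defaults later in this thread) into byte counts. The helper name parse_size and the set of accepted suffixes are assumptions for illustration only; the server's actual parser is not shown in this PR description and may accept other forms.

```python
import re

# Hypothetical helper, not the server's parser: the accepted suffixes are assumed.
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string such as '256MiB' into a byte count.

    A bare integer is treated as a number of bytes.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(B|KiB|MiB|GiB|TiB)?\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit or "B"]


if __name__ == "__main__":
    for value in ("1GiB", "256MiB", "768MiB"):
        print(value, "=", parse_size(value), "bytes")
```

Binary suffixes keep the arithmetic exact, so 256MiB plus 768MiB adds up to precisely 1GiB.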

Validation

cmake --build server/build --target test_server_unit dflash_server -j$(nproc)
server/build/test_server_unit (1959 assertions, 0 failures)
python3 -m py_compile for changed Python scripts
bash -n server/scripts/entrypoint.sh
git diff --check
DFLASH_SERVER_BIN=server/build/dflash_server python3 server/scripts/test_prefill_cache.py
- RAM prefill cache active with 2 hits and ~10427x lower-bound warm speedup
DFLASH_SERVER_BIN=server/build/dflash_server python3 server/scripts/test_prefill_disk_cache.py
- disk prefill cache active with 2 hits and ~10623x lower-bound warm speedup
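
The thread does not show how the proof scripts compute the reported "lower-bound" speedup. One common way such a bound arises is when the warm time falls below the timer's resolution, so the ratio is taken against a floor instead of against zero. The sketch below assumes that approach; the function name, the 1 ms floor, and the sample timings are all hypothetical and are not measurements from this PR.

```python
def lower_bound_speedup(cold_s: float, warm_s: float, resolution_s: float = 0.001) -> float:
    """Return cold/warm, with the warm time floored at the timer resolution.

    If warm_s is smaller than resolution_s, the true speedup is at least the
    returned value, which is why the result is only a lower bound.
    """
    if cold_s <= 0 or resolution_s <= 0:
        raise ValueError("cold_s and resolution_s must be positive")
    return cold_s / max(warm_s, resolution_s)


if __name__ == "__main__":
    # Hypothetical timings, not taken from the PR's test runs.
    print(lower_bound_speedup(cold_s=2.0, warm_s=0.0))
    print(lower_bound_speedup(cold_s=2.0, warm_s=0.5))
```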

Notes

The cleaner branch layout would be base Luce-Org:codex/prefill-cache-wiring and head Luce-Org:codex/unified-cache-config, but pushing those branches requires org write permission.

easel · 2026-06-14T14:13:03Z

Claude Code reviewed the stacked diff from #380 head to this branch. It found stale /props contract docs and an OpenAPI prefix_cache example mismatch, plus a note that legacy full_cache.enabled is RAM-only in disk-only prefill mode. Addressed those in 5a321eb (docs: align props cache contract). No runtime correctness findings were reported in that review.

easel · 2026-06-14T14:57:30Z

Follow-up pushed in 7568ed2 to make the unified cache model the primary user surface. Defaults now use --cache-ram 1GiB split as 256MiB prefix + 768MiB exact prefill, and --cache-disk 16GiB split as 4GiB prefix + 12GiB exact prefill when a cache dir is configured. The server also alternates cold-miss snapshot targets when both RAM pools are viable so exact repeated prompts and multi-turn prefix reuse can both populate without user tuning.

Validation after this follow-up:
- cmake build: test_server_unit + dflash_server
- server/build/test_server_unit: 1978 assertions, 0 failures
- py_compile for touched Python scripts
- bash -n server/scripts/entrypoint.sh
- OpenAPI YAML parse/cache example assertion
- git diff --check
- RAM exact-prefill proof: 1 commit, 2 hits, warm prefill rounded to 0.000s
- Disk exact-prefill proof: RAM off, 1 disk save, 2 disk hits

Claude Code reviewed this follow-up diff and reported no actionable correctness bugs. It flagged only a cosmetic /props example indentation issue, fixed before this commit.
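
The two behaviours described in this follow-up, the default budget split and the alternating cold-miss snapshot target, can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the 1:3 ratio is inferred from the stated defaults (256MiB of 1GiB, 4GiB of 16GiB), and the names split_budget and SnapshotTargetPicker, the choice of which pool goes first, and the boolean notion of "viable" are all hypothetical.

```python
from typing import Optional, Tuple

MiB = 1024**2
GiB = 1024**3


def split_budget(total_bytes: int) -> Tuple[int, int]:
    """Split a total budget into (prefix, exact_prefill) byte budgets.

    Assumed ratio: one quarter to the prefix cache, the rest to exact prefill.
    """
    prefix = total_bytes // 4
    return prefix, total_bytes - prefix


class SnapshotTargetPicker:
    """Alternate between the two RAM pools on successive cold misses."""

    def __init__(self) -> None:
        # Assumption: the first cold miss goes to the prefix pool.
        self._last = "prefill"

    def pick(self, prefix_viable: bool, prefill_viable: bool) -> Optional[str]:
        if prefix_viable and prefill_viable:
            # Both pools can take a snapshot, so take turns.
            self._last = "prefill" if self._last == "prefix" else "prefix"
            return self._last
        if prefix_viable:
            return "prefix"
        if prefill_viable:
            return "prefill"
        return None  # neither pool can store a snapshot


if __name__ == "__main__":
    print(split_budget(1 * GiB))
    picker = SnapshotTargetPicker()
    print([picker.pick(True, True) for _ in range(4)])
```

Taking turns means neither pool starves the other: an exactly repeated prompt and a multi-turn conversation sharing a prefix each get a snapshot within two cold misses.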

easel mentioned this pull request Jun 14, 2026

[codex] Unify cache capacity config easel/lucebox-hub#4

Closed

easel added 7 commits June 14, 2026 22:59

wire prefill cache slots

62ea1f0

test(server): validate prefill proof binary

2ac725b

ci: serialize self-hosted GPU jobs

1f0ca42

feat(server): unify cache capacity config

a1b7a78

docs: align props cache contract

ea1f300

feat(server): default unified cache budgets

6ab8b98

feat(server): surface unified cache status

93b1522

easel force-pushed the codex/unified-cache-config branch from 4b30337 to 93b1522 June 15, 2026 03:03

easel added 3 commits June 14, 2026 23:14

docs(server): fix prefill skip park cache flag

6cfdd47

fix(server): tighten unified cache config compatibility

813937d

test(server): allow kvflash smoke without indexer drafter

26d8f99

easel wants to merge 10 commits into Luce-Org:main from easel:codex/unified-cache-config
