ModelCache v0.1 — PVC backend, multi-node

**Design lives in [`design/modelcache/`](https://github.com/modelplaneai/modelplane/blob/dennis/modelcache-design/design/modelcache/design.md) (branch `dennis/modelcache-design`).** This issue tracks v0.1 implementation.

## v0.1 scope

PVC backend, multi-node ready, no dedup. From the [design doc](https://github.com/modelplaneai/modelplane/blob/dennis/modelcache-design/design/modelcache/design.md):

- `ModelCache` CRD with artifact discriminator (`Weights`, `Tokenizer`, `Bytes`)
- Sources: `huggingFace`, `s3`, `http`, `inline`, `configMap`
- `PVC` (RWX) storage backend with one-shot prefetch Job (absorbs [#61](https://github.com/modelplaneai/modelplane/issues/61))
- `replication: AllMatchingClusters` — one RWX PVC per matching cluster, shared across all LWS gang pods
- `clusterSelector.matchLabels` for cluster filtering
- Mount path intrinsic to the cache; deployments reference by name via `caches: [{ name }]`
- Scheduling gated on per-cluster cache `Ready` condition
- Fail-fast when a target cluster has no RWX storage class on `InferenceCluster.spec.storage.storageClassName`
- Pluggable storage backend pattern shared with [#72 KVOffloadTier](https://github.com/modelplaneai/modelplane/issues/72)
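
As a rough illustration of how three of the bullets above could interact (`clusterSelector.matchLabels` filtering, the RWX fail-fast check, and scheduling gated on `Ready`), here is a minimal Python sketch. It is not the controller implementation. The `Cluster` class, its field names, the function names, and the `rwx-nfs` storage class are all hypothetical, and the sketch assumes the RWX capability of a storage class is already known.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical stand-in for an InferenceCluster (not the real API type)."""
    name: str
    labels: dict[str, str]
    # Assumption: populated only when spec.storage.storageClassName supports RWX.
    rwx_storage_class: str | None = None
    # Assumption: mirrors the per-cluster cache Ready condition.
    cache_ready: bool = False


def matches(match_labels: dict[str, str], labels: dict[str, str]) -> bool:
    # matchLabels semantics: every selector key/value pair must be present.
    return all(labels.get(k) == v for k, v in match_labels.items())


def plan_targets(match_labels: dict[str, str], clusters: list[Cluster]) -> list[str]:
    """AllMatchingClusters: one PVC per matching cluster; fail fast without RWX."""
    targets = [c for c in clusters if matches(match_labels, c.labels)]
    missing = [c.name for c in targets if not c.rwx_storage_class]
    if missing:
        raise ValueError(f"no RWX storage class on: {', '.join(missing)}")
    return [c.name for c in targets]


def schedulable(match_labels: dict[str, str], clusters: list[Cluster]) -> list[str]:
    """Scheduling gate: only matching clusters whose cache reports Ready."""
    ready = {c.name for c in clusters if c.cache_ready}
    return [name for name in plan_targets(match_labels, clusters) if name in ready]


clusters = [
    Cluster("us-east", {"tier": "gpu"}, "rwx-nfs", cache_ready=True),
    Cluster("us-west", {"tier": "gpu"}, "rwx-nfs", cache_ready=False),
    Cluster("eu-cpu", {"tier": "cpu"}),  # filtered out by the selector
]
print(plan_targets({"tier": "gpu"}, clusters))  # ['us-east', 'us-west']
print(schedulable({"tier": "gpu"}, clusters))   # ['us-east']
```

In this sketch a matching cluster without an RWX storage class rejects the whole plan up front rather than being skipped silently, which is one reading of "fail-fast"; the design doc remains the authority on the actual behavior.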

## Out of scope (tracked separately)

- `LoraAdapter` / `Engine` artifact kinds → v0.2
- `ContentAddressed` backend (Modal-style tiered cache + lazy loading) → v0.2
- Cross-deployment / cross-tenant dedup → v0.2
- `gcs` / `azure` / `oci` / `pvc-clone` sources → v0.2
- `AllMatchingNodes` replication mode → v0.2
- Substrate unification [#72](https://github.com/modelplaneai/modelplane/issues/72) → v0.3

Roadmap detail in the design doc [§ v0.2](https://github.com/modelplaneai/modelplane/blob/dennis/modelcache-design/design/modelcache/design.md#v02--content-addressed-backend-lazy-loading-full-artifact-taxonomy) and [§ v0.3](https://github.com/modelplaneai/modelplane/blob/dennis/modelcache-design/design/modelcache/design.md#v03--substrate-unification-architectural-option).

## Examples

Nine paired ModelCache + ModelDeployment examples in [`design/modelcache/examples/`](https://github.com/modelplaneai/modelplane/tree/dennis/modelcache-design/design/modelcache/examples): six covering v0.1 (single-cluster basic, multi-node TensorPipeline gang, multi-cluster replication, separate tokenizer, private S3, opaque `Bytes` kind) and three v0.2 previews.

## References

- [Design doc](https://github.com/modelplaneai/modelplane/blob/dennis/modelcache-design/design/modelcache/design.md)
- [Examples](https://github.com/modelplaneai/modelplane/tree/dennis/modelcache-design/design/modelcache/examples)
- [#61](https://github.com/modelplaneai/modelplane/issues/61) (closed) — RWX PVC mechanism
- [#72](https://github.com/modelplaneai/modelplane/issues/72) (KVOffloadTier; shares the pluggable storage backend pattern)
- [PR #75](https://github.com/modelplaneai/modelplane/pull/75) — `engine.env` + `imagePullSecrets`; ModelCache rides on those for credential-bearing sources

