ModelOps-Bundle

OCI artifact-based packaging for reproducible simulation model distribution and provenance tracking.

What is ModelOps-Bundle?

ModelOps-Bundle is the reproducibility layer of the ModelOps/Calabaria platform for simulation-based disease modeling. It provides a git-like workflow for packaging, versioning, and distributing simulation models with their code and data dependencies. It is intentionally decoupled from Git itself, because the platform needs finer-grained tracking of model provenance and dependency invalidation than Git provides.

Key Features:

Scientific Reproducibility: Every simulation run is fully traceable to exact code and data versions
Dependency Tracking: Automatically detects when model dependencies change and invalidates cached results
OCI-Native: Uses industry-standard container registries for distribution (no custom infrastructure)
Model Registry: Tracks registered models and their metadata, traveling WITH the bundle for versioning
Cloud-Agnostic Storage: Supports Azure Blob, AWS S3, GCS, or pure OCI for large data files
Git-Like Workflow: Familiar commands (init, add, push, pull) for scientists already using version control

Why ModelOps-Bundle?

Traditional approaches to distributing simulation models face challenges:

Version Mismatches: Code updates break previously calibrated models
Lost Dependencies: Data files get moved or modified without tracking
Reproducibility Crisis: Can't recreate exact conditions from past experiments
Manual Tracking: Scientists manually track which code version goes with which data

ModelOps-Bundle solves these by:

Content-Addressed Storage: Same content always gets same SHA256 digest
Atomic Bundles: Code + data + metadata travel together as a unit
Automatic Invalidation: When dependencies change, cached results are marked stale
Provenance Chain: Every simulation result links back to exact bundle digest
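
As a rough illustration of content addressing and staleness detection, the sketch below hashes a file with Python's standard hashlib module and compares the result with a previously recorded digest. The helper names sha256_of_file and is_stale, and the "sha256:" prefix format, are assumptions made for this example and are not ModelOps-Bundle's actual API.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return 'sha256:<hex>' for the file's bytes (hypothetical helper)."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return f"sha256:{hasher.hexdigest()}"


def is_stale(path: Path, recorded_digest: str) -> bool:
    """Treat a cached result as stale when the dependency's digest has changed."""
    return sha256_of_file(path) != recorded_digest
```

Two files with identical bytes produce the same digest regardless of name or location, which is what makes the digest usable as a cache key.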

Installation

ModelOps-Bundle is typically installed as part of the full ModelOps suite; see the installation directions at ModelOps.

Quick Start

1. Initialize a Project

# Create a new project
mkdir seir-model
cd seir-model
mops bundle init .

# Or initialize existing project
cd my-existing-model
mops bundle init .

This creates:

.modelops-bundle/ - Bundle metadata directory
pyproject.toml - Python project configuration
.modelopsignore - Files to exclude from bundle

2. Register Your Simulation Model

# Register a Starsim model with its data dependencies
mops bundle register-model models/seir.py --class StarsimSIR \
  --data data/demographics.csv \
  --data data/contact_patterns.csv

# Auto-discovers BaseModel subclasses if class not specified
mops bundle register-model models/network_model.py --data data/

Why Register Models?

Enables automatic discovery by execution workers
Tracks which data files each model depends on
Computes content digests for cache invalidation
Your model code stays clean - no decorators or imports needed!
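
To give a feel for how auto-discovery of model classes might work, here is a minimal sketch using Python's inspect module. The BaseModel class below is a stand-in defined only for this example, and discover_models is a hypothetical helper; the real discovery logic in ModelOps-Bundle may differ.

```python
from __future__ import annotations

import inspect
import types


class BaseModel:
    """Stand-in for the platform's base class (hypothetical, for illustration)."""


def discover_models(module: types.ModuleType, base: type = BaseModel) -> list[str]:
    """Return names of classes defined in `module` that subclass `base`."""
    found = []
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Skip the base itself and classes that were only imported into the module.
        if obj is base or obj.__module__ != module.__name__:
            continue
        if issubclass(obj, base):
            found.append(name)
    return found
```

Filtering on the class's defining module keeps imported helpers and the base class itself out of the result, so only models written in the registered file are reported.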

3. Check Bundle Status

mops bundle status

Bundle: modelopsdevacrvsp.azurecr.io/seir-model:latest
Local changes: 3 files modified

Registered Models (2)
─────────────────────────────────────────────────
Model              Status      Dependencies    Cloud
StarsimSIR         ✓ Ready     4 files        Not pushed
NetworkSEIR        ⚠ Stale     2 modified     Not pushed

Run 'mops bundle push' to sync with cloud

4. Push to Registry

# Push to configured registry
mops bundle push

# Or push with specific tag
mops bundle push --tag v1.0.0

Core Concepts

Bundle Structure

A bundle is an OCI artifact containing:

my-model-bundle/
├── .modelops-bundle/
│   ├── registry.yaml      # Model registry (travels with bundle)
│   ├── config.yaml        # Bundle configuration
│   └── manifest.yaml      # Content manifest with digests
├── models/
│   ├── seir.py           # Simulation model code
│   └── network.py        # Alternative model implementation
├── data/
│   ├── demographics.csv  # Input data
│   └── contacts.csv      # Contact matrices
├── targets/              # Calibration targets (optional)
│   └── incidence.py      # Target definitions
└── pyproject.toml        # Python dependencies

Model Registry

The registry (.modelops-bundle/registry.yaml) tracks:

Model entry points (module:Class)
Data dependencies with SHA256 digests
Code dependencies
Model parameters and outputs
Scenarios for each model

This registry is versioned WITH the bundle, ensuring metadata stays synchronized with code.
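
The module:Class entry-point form can be resolved to a class with the standard importlib module. The sketch below is only an illustration of that convention; load_entry_point is a hypothetical name, not a function exported by ModelOps-Bundle.

```python
import importlib


def load_entry_point(entry_point: str) -> type:
    """Resolve a 'module:Class' string to the class object it names."""
    module_name, sep, class_name = entry_point.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected 'module:Class', got {entry_point!r}")
    module = importlib.import_module(module_name)
    # getattr raises AttributeError if the module does not define the class.
    return getattr(module, class_name)
```

For example, load_entry_point("collections:OrderedDict") returns the OrderedDict class from the standard library; a worker could resolve a registered model in the same way once the bundle is on its import path.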

Content Addressing

Every file gets a SHA256 digest. When ModelOps executes a simulation:

Worker pulls bundle by digest (immutable)
Verifies all file digests match registry
Runs simulation with exact code/data
Results tagged with bundle digest for provenance
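
The verification step above can be sketched as follows. The mapping from relative path to "sha256:<hex>" digest is an assumed layout chosen for the example and is not the actual registry.yaml schema; verify_bundle is likewise a hypothetical helper.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def verify_bundle(root: Path, expected: dict[str, str]) -> list[str]:
    """Return relative paths whose contents do not match the expected digest.

    `expected` maps relative path -> 'sha256:<hex>' (assumed layout).
    """
    mismatched = []
    for rel_path, digest in sorted(expected.items()):
        file_path = root / rel_path
        if not file_path.is_file():
            mismatched.append(rel_path)  # a missing file also fails verification
            continue
        actual = "sha256:" + hashlib.sha256(file_path.read_bytes()).hexdigest()
        if actual != digest:
            mismatched.append(rel_path)
    return mismatched
```

An empty return value means every listed file matches; a worker following this pattern would refuse to run the simulation if the list is non-empty.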

Advanced Usage

Working with Large Data Files

For bundles with large data files (>50MB), ModelOps-Bundle automatically uses blob storage:

# .modelops-bundle/config.yaml
storage:
  mode: auto              # auto, blob, or oci
  threshold_bytes: 52428800  # 50 MiB (50 * 1024 * 1024)
  provider: azure         # or s3, gcs
  container: modelops-blobs
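
One plausible reading of the mode and threshold_bytes settings is sketched below. The function choose_backend and its exact boundary behavior (files strictly larger than the threshold go to blob storage) are assumptions for illustration; consult the bundle documentation for the real rules.

```python
def choose_backend(size_bytes: int, mode: str = "auto",
                   threshold_bytes: int = 52_428_800) -> str:
    """Return 'blob' or 'oci' for a file of the given size (hypothetical helper)."""
    if mode in ("blob", "oci"):
        # An explicit mode overrides the size-based decision.
        return mode
    if mode != "auto":
        raise ValueError(f"unknown storage mode: {mode!r}")
    # In auto mode, only files above the threshold are routed to blob storage.
    return "blob" if size_bytes > threshold_bytes else "oci"
```

Under these assumptions, a 10 MB CSV stays in the OCI artifact while a 2 GB dataset is routed to the configured blob container.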

Calibration Target Registration

Register calibration targets that define how models compare to observed data:

# Register target functions
mops bundle register-target targets/incidence.py

# Targets use Calabaria decorators but are tracked by bundle
# for reproducibility

# Group targets into reusable sets for calibration jobs
mops bundle target-set set incidence \
  --target incidence_per_replicate_target \
  --target incidence_replicate_mean_target

# Inspect available sets
mops bundle target-set list

Bundle Comparison

Compare local changes with registry:

# Show what would be pushed
mops bundle diff

# Compare with specific version
mops bundle diff --ref v1.0.0

# Show file-level changes
mops bundle status --files

Pulling Remote Bundles

# Pull latest (won't overwrite local changes)
mops bundle pull

# Force overwrite local changes
mops bundle pull --overwrite

# Pull specific version
mops bundle pull --ref sha256:abc123...

Integration with ModelOps Workflow

ModelOps-Bundle fits into the full platform workflow as follows:

Development: Scientists develop models locally
Registration: Models registered with mops bundle register-model
Pushing: Bundle pushed to registry with mops bundle push
Study Design: Calabaria creates parameter sweeps referencing bundle models
Execution: Workers pull bundle and discover models via registry
Provenance: Results tagged with bundle digest for reproducibility

Monitoring Job Execution

After submitting jobs with mops jobs submit, you can monitor the Dask cluster executing your bundled models:

# Port-forward to access Dask dashboard (run in separate terminals or use &)
kubectl port-forward -n modelops-dask-dev svc/dask-scheduler 8787:8787 &
kubectl port-forward -n modelops-dask-dev svc/dask-scheduler 8786:8786 &

# Access Dask dashboard at http://localhost:8787
# The scheduler accepts client and worker connections on port 8786

This lets you monitor task progress and worker utilization, and debug any issues with your bundle execution in real time.
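
If the dashboard does not load, a quick way to check whether the port-forwards are up is to try a TCP connection to each local port. The sketch below uses only the standard socket module; port_is_open is a hypothetical helper, and the port numbers are the ones used in the commands above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in (("dashboard", 8787), ("scheduler", 8786)):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```

This only confirms that something is listening on the forwarded port; it does not verify that the Dask scheduler behind it is healthy.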

Running Integration Tests Locally

Some of the pytest suites (those carrying the integration marker, selected with -m integration) drive the CLI end-to-end against a live registry. The dev stack must be running first:

cd modelops-bundle
make start   # starts registry + Azurite and seeds .modelops-bundle/envs/local.yaml
uv run python -m pytest -m integration tests/test_e2e.py::test_full_workflow_with_cli_commands

make start is also safe to run on CI agents: the target is a no-op if the stack is already up. When you are done debugging, make stop tears everything down.

Lower-Level Bundle Operations

For fine-grained control over bundle contents:

Add Files

# Add specific files
mops bundle add src/utils.py config/settings.yaml

# Add directories recursively
mops bundle add src/ data/

# Add everything (respects .modelopsignore)
mops bundle add .

Remove Files

# Stop tracking files (keeps on disk)
mops bundle remove src/old_model.py

# Untrack AND delete files
mops bundle remove --rm tmp/

File Status

# Show all tracked files
mops bundle status --files

# Show only untracked files
mops bundle status --untracked-only

Development

For development and testing:

# Clone the repository
git clone https://github.com/institutefordiseasemodeling/modelops-bundle.git
cd modelops-bundle

# Install in development mode
uv pip install -e .

# Start local registry for testing
cd dev
docker compose up -d

# Run tests
uv run pytest

# Run with local registry
export REGISTRY_URL=localhost:5555
export MODELOPS_BUNDLE_INSECURE=true
mops bundle push

Environment Configuration

ModelOps-Bundle uses environment configurations from ~/.modelops/bundle-env/, which are created automatically when you provision ModelOps infrastructure with mops infra up.

These YAML files contain your registry and storage settings:

# ~/.modelops/bundle-env/dev.yaml
environment: dev
registry:
  provider: docker
  login_server: modelopsdevacr.azurecr.io
storage:
  provider: azure
  container: bundle-blobs
  connection_string: "DefaultEndpointsProtocol=..."

When you run bundle commands, the appropriate environment is loaded automatically:

Projects are initialized with an environment (e.g., mops bundle init --env dev)
The environment is pinned in .modelops-bundle/env
Credentials are loaded from the environment file when needed
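
The lookup described above can be pictured as a small path resolution: read the pinned environment name from the project, then map it to a YAML file under the user's home directory. The sketch assumes the pin file contains just the environment name as plain text, which the source does not state; resolve_env_file is a hypothetical helper.

```python
from __future__ import annotations

from pathlib import Path


def resolve_env_file(project_dir: Path, home: Path | None = None) -> Path:
    """Map the env name pinned in the project to its YAML file under the home dir."""
    pin_file = project_dir / ".modelops-bundle" / "env"
    # Assumption: the pin file holds only the environment name, e.g. "dev".
    env_name = pin_file.read_text(encoding="utf-8").strip()
    if not env_name:
        raise ValueError(f"{pin_file} is empty")
    base = (home or Path.home()) / ".modelops" / "bundle-env"
    return base / f"{env_name}.yaml"
```

With a project pinned to dev, this returns the path ~/.modelops/bundle-env/dev.yaml, matching the example file shown above.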

For local development/testing, you can create a local.yaml environment that uses a local Docker registry (see Development section).

Related Projects

modelops - Infrastructure orchestration
modelops-contracts - API contracts
modelops-calabaria - Science framework

License

MIT

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Documentation

All bundle-specific design notes now live under docs/. Start with docs/index.md for the curated list of active guides (cache locking, provenance, ORAS) and the archived implementation plans. Auto code-discovery and bundle config options are documented in docs/auto_code.md.
