andringodson/RaphaVision

RaphaVision

Epidemic intelligence platform for infectious-disease modelling, forecasting, and live surveillance.

Compartmental models (SIR · SEIR · SEIRD · SEIRDV) with real-world upgrades —
time-varying transmission, spatial metapopulation coupling, age stratification,
live data ingestion, and forecasts validated with the same scoring the CDC Forecast Hub uses.

Live demo →


Overview

RaphaVision turns classical epidemic theory into an interactive, quantitative tool. Where most SIR demos stop at a textbook curve, RaphaVision layers on the mechanics that matter in practice: transmission that changes over time, geography that couples outbreaks between regions, contact structure that varies by age, and — critically — honest uncertainty. Every forecast is scored against held-out history, and every number in the interface is labelled by how much you should trust it.

It runs as a single-page dashboard backed by a Python API, and ships as one serverless function on Vercel.

⚕️ Disclaimer — RaphaVision is an educational and research tool. Its outputs must not be used to inform real public-health decisions without review by a qualified epidemiologist.


Capabilities

Modelling

  • Compartmental solvers — SIR, SEIR, SEIRD, SEIRDV — via RK4 with an Euler reference, plus closed-form peak analytics.
  • Time-varying transmission: rolling R(t) → β(t) estimation feeding a Monte-Carlo fan chart.
  • Spatial metapopulation: coupled multi-patch SIR with gravity-model contact coupling (Viboud 2006).
  • Age stratification: POLYMOD contact matrices (Mossong 2008) with age-specific IFR / hospitalisation (Verity 2020).
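
As a rough illustration of the first bullet, the sketch below integrates a basic SIR model with fixed-step RK4 and compares the simulated peak with the standard closed-form peak prevalence. It is not the code in sir_solver.py: the function names, the use of population fractions, and the default step size are all assumptions made for this example.

```python
import numpy as np

def sir_rhs(y, beta, gamma):
    """Right-hand side of the SIR equations in population fractions."""
    s, i, _ = y
    new_inf = beta * s * i
    return np.array([-new_inf, new_inf - gamma * i, gamma * i])

def rk4_sir(beta, gamma, i0=1e-3, days=200.0, dt=0.1):
    """Integrate SIR with classical fixed-step RK4; returns (t, trajectory)."""
    n = int(round(days / dt))
    t = np.linspace(0.0, n * dt, n + 1)
    y = np.empty((n + 1, 3))
    y[0] = (1.0 - i0, i0, 0.0)
    for k in range(n):
        k1 = sir_rhs(y[k], beta, gamma)
        k2 = sir_rhs(y[k] + 0.5 * dt * k1, beta, gamma)
        k3 = sir_rhs(y[k] + 0.5 * dt * k2, beta, gamma)
        k4 = sir_rhs(y[k] + dt * k3, beta, gamma)
        y[k + 1] = y[k] + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return t, y

def closed_form_peak(beta, gamma, i0=1e-3):
    """Analytic peak prevalence, valid when R0 * S0 > 1:
    I_max = I0 + S0 - (1 + ln(R0 * S0)) / R0."""
    r0 = beta / gamma
    s0 = 1.0 - i0
    return i0 + s0 - (1.0 + np.log(r0 * s0)) / r0

t, y = rk4_sir(beta=0.3, gamma=0.1)
print("simulated peak:", y[:, 1].max())
print("closed-form peak:", closed_form_peak(0.3, 0.1))
```

The closed form follows from the SIR invariant I + S - ln(S)/R₀, evaluated at the peak where S = 1/R₀, so the two numbers should agree up to integration and sampling error.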

Forecasting & validation

  • Monte-Carlo probabilistic forecasts with 50% and 90% prediction intervals.
  • Rolling backtests scored by the Weighted Interval Score (Bracher 2021) — the metric used by the CDC Forecast Hub.
  • ML peak-predictor surrogate with a dual-band confidence interval (parametric + held-out model error).
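
To make the scoring concrete, here is a minimal sketch of the Weighted Interval Score for a single forecast target, following the definition in Bracher et al. (2021) with the 50% and 90% intervals listed above. It is not taken from backtester.py; the function names and the choice to derive the intervals from Monte-Carlo samples via empirical quantiles are assumptions for illustration.

```python
import numpy as np

def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval."""
    width = upper - lower
    under = (2.0 / alpha) * max(lower - y, 0.0)   # penalty if y falls below
    over = (2.0 / alpha) * max(y - upper, 0.0)    # penalty if y falls above
    return width + under + over

def weighted_interval_score(samples, y, alphas=(0.5, 0.1)):
    """WIS of forecast samples against the observation y.

    alphas=(0.5, 0.1) corresponds to the 50% and 90% intervals.
    """
    samples = np.asarray(samples, dtype=float)
    median = np.quantile(samples, 0.5)
    total = 0.5 * abs(y - median)                 # weight 1/2 on the median
    for a in alphas:
        lo, hi = np.quantile(samples, [a / 2.0, 1.0 - a / 2.0])
        total += (a / 2.0) * interval_score(lo, hi, y, a)
    return total / (len(alphas) + 0.5)

forecast = np.linspace(0.0, 100.0, 101)           # stand-in for Monte-Carlo draws
print(weighted_interval_score(forecast, y=50.0))
print(weighted_interval_score(forecast, y=100.0))
```

Lower is better: the score rewards narrow intervals and penalises observations that fall outside them, so an observation in the tail of the forecast scores worse than one at the median.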

Live surveillance

  • Live global and per-country figures from disease.sh, with data_as_of and staleness_hours provenance on every pull.
  • Historical calibration against the Our World in Data COVID-19 dataset.
  • Interactive world choropleth, country comparison, R(t) tracking, hospital / ICU capacity projection, and a composite country risk score.
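
The provenance fields could be derived along the following lines. This is a sketch, not the code in live_data.py: it assumes the upstream payload carries an `updated` timestamp in epoch milliseconds, and the helper name `with_provenance` is hypothetical.

```python
from datetime import datetime, timezone

def with_provenance(payload, now=None):
    """Attach data_as_of and staleness_hours to an upstream payload.

    Assumes payload["updated"] is an epoch timestamp in milliseconds.
    """
    now = now or datetime.now(timezone.utc)
    as_of = datetime.fromtimestamp(payload["updated"] / 1000.0, tz=timezone.utc)
    staleness = (now - as_of).total_seconds() / 3600.0
    return {
        **payload,
        "data_as_of": as_of.isoformat(),
        "staleness_hours": round(staleness, 2),
    }
```

Passing `now` explicitly keeps the function deterministic in tests; in a request handler it would normally be left as the default.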

Trust by design

  • Every displayed value is tagged Raw Data, Model Estimate, or Backtested Forecast so model output is never mistaken for measurement.
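
One way to enforce this kind of labelling is to make the tag part of the value's type, so that a number cannot be serialised without it. The sketch below is an assumption about how that might look; the class and field names are hypothetical and only the three label strings come from this README.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    RAW_DATA = "Raw Data"
    MODEL_ESTIMATE = "Model Estimate"
    BACKTESTED_FORECAST = "Backtested Forecast"

@dataclass(frozen=True)
class LabelledValue:
    name: str
    value: float
    provenance: Provenance

    def to_json(self):
        """Serialise the value together with its trust label."""
        return {"name": self.name, "value": self.value,
                "label": self.provenance.value}

peak = LabelledValue("peak_infected", 0.30, Provenance.MODEL_ESTIMATE)
print(peak.to_json())
```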

Tech stack

| Layer | Choice |
| --- | --- |
| API | Python 3.12 · Flask · NumPy (SciPy / scikit-learn for local training) |
| Frontend | Vanilla JS single-page app · Chart.js · Leaflet |
| Hosting | Vercel serverless (one Python function serves API + static site) |
| Data | disease.sh (live) · Our World in Data (historical) |

Quickstart

```bash
cd disease-simulator

# 1. Install dependencies
pip install -r requirements.txt

# 2. (Optional) train the ML peak predictor; writes models/peak_predictor.pkl
python backend/train_model.py

# 3. (Optional) cache the OWID historical dataset for backtesting
python backend/fetch_owid.py

# 4. Run: Flask serves both the API and the dashboard
python backend/app.py
```

Then open http://localhost:5000/.

The peak predictor is optional: without scikit-learn or the trained pickle, /predict_peak falls back to the exact peak from a direct RK4 simulation.
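
The fallback described above might be structured roughly as follows. This is a sketch under assumptions, not the code in ai_models.py: the function names are hypothetical, the loaded object is assumed to expose a scikit-learn style `predict`, and `simulate_peak` stands in for whatever routine runs the RK4 simulation.

```python
import pickle
from pathlib import Path

def load_peak_model(path="models/peak_predictor.pkl"):
    """Return the pickled surrogate, or None if it cannot be used."""
    try:
        import sklearn  # noqa: F401  (required to unpickle a scikit-learn model)
        with Path(path).open("rb") as fh:
            return pickle.load(fh)
    except (ImportError, OSError, pickle.UnpicklingError):
        return None

def predict_peak(beta, gamma, simulate_peak, model=None):
    """Use the surrogate when available, otherwise the direct simulation."""
    if model is None:
        return {"peak": simulate_peak(beta, gamma), "source": "simulation"}
    # Assumption: the surrogate takes (beta, gamma) as its feature vector.
    return {"peak": float(model.predict([[beta, gamma]])[0]), "source": "surrogate"}
```

Returning the `source` alongside the number lets the caller label the result appropriately instead of silently mixing surrogate and simulation output.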


Deployment

Hosted on Vercel as a single Python serverless function — api/index.py re-exports the Flask app and vercel.json routes all traffic to it. The Vercel project's root directory is disease-simulator/.

```bash
cd disease-simulator
vercel deploy --prod
```

Lean serverless bundle. api/requirements.txt intentionally omits SciPy and scikit-learn to stay well under Vercel's 250 MB unzipped limit:

  • scipy.optimize.curve_fit is replaced by a NumPy-only coarse-to-fine fit (ai_models._fit_two_param_lsq), so no SciPy is needed at runtime.
  • The scikit-learn peak surrogate is optional; /predict_peak computes the exact peak from simulation when it is absent. The root requirements.txt keeps both libraries for local training.
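
For readers unfamiliar with the idea, a coarse-to-fine least-squares fit can be sketched as a grid search that repeatedly zooms in around the best point found so far. The code below is an independent illustration using only NumPy; it is not ai_models._fit_two_param_lsq, and the function names, bounds, grid size, and the simple Euler forward model are all assumptions.

```python
import numpy as np

def sir_prevalence(beta, gamma, i0, n_days, substeps=10):
    """Daily I(t) from an Euler-integrated SIR model (population fractions)."""
    dt = 1.0 / substeps
    s, i = 1.0 - i0, i0
    out = np.empty(n_days)
    for day in range(n_days):
        out[day] = i
        for _ in range(substeps):
            new_inf = beta * s * i * dt
            s, i = s - new_inf, i + new_inf - gamma * i * dt
    return out

def fit_coarse_to_fine(observed, i0, bounds=((0.05, 1.0), (0.02, 0.5)),
                       grid=9, rounds=6):
    """Grid-search (beta, gamma), then shrink the grid around the best point."""
    observed = np.asarray(observed, dtype=float)
    (b_lo, b_hi), (g_lo, g_hi) = bounds
    best = None  # (sse, beta, gamma), kept across rounds
    for _ in range(rounds):
        betas = np.linspace(b_lo, b_hi, grid)
        gammas = np.linspace(g_lo, g_hi, grid)
        for b in betas:
            for g in gammas:
                resid = sir_prevalence(b, g, i0, observed.size) - observed
                sse = float(np.sum(resid ** 2))
                if best is None or sse < best[0]:
                    best = (sse, b, g)
        # Zoom: the next window spans two grid cells either side of the best point.
        db, dg = betas[1] - betas[0], gammas[1] - gammas[0]
        b_lo, b_hi = max(best[1] - 2 * db, 1e-6), best[1] + 2 * db
        g_lo, g_hi = max(best[2] - 2 * dg, 1e-6), best[2] + 2 * dg
    return best[1], best[2]

truth = sir_prevalence(0.3, 0.1, 1e-3, 80)
print(fit_coarse_to_fine(truth, i0=1e-3))
```

Keeping a margin of two cells when zooming is a deliberate choice: the error surface for β and γ tends to form a diagonal valley, and a tighter window can cut the true minimum out of the search region.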

API reference

Base URL: / (same origin as the dashboard).

Simulation

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /simulate | SIR simulation (RK4 + Euler), R₀, peak, herd threshold |
| POST | /simulate_seir | SEIR / SEIRD / SEIRDV simulation |
| POST | /simulate_spatial | Coupled multi-patch SIR (gravity coupling) |
| POST | /simulate_age | Age-structured SIR (POLYMOD contacts) |
| POST | /fit | Fit β/γ to an observed case series |
| POST | /intervene | Mid-simulation β change |
| POST | /sensitivity | Parameter sensitivity sweep |
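
The gravity coupling behind /simulate_spatial can be illustrated with a small multi-patch sketch. It is not the code in spatial_model.py: the exponents, the normalisation by the largest entry, the `scale` factor, and the Euler step are assumptions chosen to keep the example short.

```python
import numpy as np

def gravity_coupling(pop, dist, tau1=1.0, tau2=1.0, rho=2.0, scale=1e-3):
    """Coupling c[i, j] proportional to pop_i^tau1 * pop_j^tau2 / dist_ij^rho.

    The diagonal is zero (no self-coupling) and the largest entry equals `scale`.
    """
    pop = np.asarray(pop, dtype=float)
    d = np.asarray(dist, dtype=float).copy()
    np.fill_diagonal(d, np.inf)               # x / inf = 0 removes self-coupling
    c = np.outer(pop ** tau1, pop ** tau2) / d ** rho
    return scale * c / c.max()

def step_patches(s, i, r, coupling, beta, gamma, dt):
    """One Euler step of coupled SIR in per-patch population fractions."""
    # Force of infection: local prevalence plus prevalence imported via coupling.
    foi = beta * (i + coupling @ i)
    new_inf = foi * s * dt
    rec = gamma * i * dt
    return s - new_inf, i + new_inf - rec, r + rec

coupling = gravity_coupling([1e6, 5e5], [[0.0, 100.0], [100.0, 0.0]])
s, i, r = np.array([0.999, 1.0]), np.array([0.001, 0.0]), np.zeros(2)
for _ in range(1000):                          # 100 days at dt = 0.1
    s, i, r = step_patches(s, i, r, coupling, beta=0.3, gamma=0.1, dt=0.1)
print("infected fraction per patch:", i)
```

Only the first patch is seeded, so any infection in the second patch arrives through the coupling term.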

Forecasting & analytics

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /predict_peak | Peak estimate; with_ci=true returns a dual-band CI |
| POST | /forecast | Monte-Carlo fan chart (50% + 90% PI) |
| POST | /forecast_tv | Time-varying β(t) Monte-Carlo forecast |
| POST | /backtest | Rolling WIS scorecard vs. historical data |
| POST | /rt_tracker | R(t) from a case time-series |
| POST | /growth_metrics | Doubling time / effective R(t) |
| POST | /hospital | Hospital / ICU capacity projection |
| POST | /scenarios | Parallel scenario comparison (up to 6) |
| POST | /vaccination_impact | Vaccination-speed comparison |
| POST | /recommend | Rule-based intervention recommender |
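
As background for /rt_tracker and /growth_metrics, the sketch below shows two textbook estimators: doubling time from a log-linear fit, and R(t) as incidence divided by the serial-interval-weighted incidence of earlier days. The latter is the ratio at the heart of Cori et al. (2013), without their Bayesian smoothing window. Function names, the default window, and the example serial-interval weights are assumptions, not the repository's implementation.

```python
import numpy as np

def doubling_time(cases, window=7):
    """Doubling time in days from a log-linear fit to the last `window` days."""
    y = np.log(np.asarray(cases[-window:], dtype=float))
    slope = np.polyfit(np.arange(window), y, 1)[0]
    return np.log(2.0) / slope if slope > 0 else float("inf")

def rt_ratio(cases, serial_weights):
    """R(t) = I_t / sum_s w_s * I_{t-s}, with w_s the serial-interval weights
    for lags s = 1, 2, ... (normalised here to sum to 1)."""
    cases = np.asarray(cases, dtype=float)
    w = np.asarray(serial_weights, dtype=float)
    w = w / w.sum()
    rt = np.full(cases.size, np.nan)           # undefined until enough history
    for t in range(w.size, cases.size):
        # Reverse the slice so w[0] multiplies yesterday's cases.
        pressure = np.dot(w, cases[t - w.size:t][::-1])
        if pressure > 0:
            rt[t] = cases[t] / pressure
    return rt

cases = 100.0 * 2.0 ** (np.arange(20) / 5.0)   # synthetic series doubling every 5 days
print(doubling_time(cases))
print(rt_ratio(cases, [0.2, 0.5, 0.3])[-1])
```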

Live & reference data

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /live/global | Global summary (disease.sh) |
| GET | /live/countries | Per-country figures for the world map |
| GET | /live/country/<name> | Single-country live + history |
| GET | /live/compare | Multi-country historical comparison |
| GET | /live/continents | Continent breakdown |
| GET | /live/risk | Composite country risk scores |
| GET | /countries · /presets | Country list · disease parameter presets |
| GET | /data/status | Live + OWID cache provenance |
| POST | /data/owid/fetch | Trigger an OWID cache refresh |
| GET | /healthz | Health check |

Project structure

RaphaVision/
└── disease-simulator/
    ├── api/index.py              # Vercel serverless entrypoint (re-exports Flask app)
    ├── vercel.json               # Routes all traffic to the function
    ├── backend/
    │   ├── app.py                # Flask API + static frontend serving
    │   ├── sir_solver.py         # Numerical solvers (RK4, Euler, TV, analytical)
    │   ├── extended_models.py    # SEIR / SEIRD / SEIRDV
    │   ├── tv_param_estimator.py # Time-varying β pipeline
    │   ├── spatial_model.py      # Metapopulation SIR
    │   ├── age_model.py          # Age-structured SIR (POLYMOD + Verity)
    │   ├── ai_models.py          # Parameter fitting + peak predictor + dual-band CI
    │   ├── forecaster.py         # Monte-Carlo fan chart
    │   ├── backtester.py         # Rolling WIS backtest
    │   ├── rt_tracker.py         # R(t) estimator
    │   ├── hospital_model.py     # Hospital / ICU projections
    │   ├── risk_scorer.py        # Composite risk scoring
    │   ├── scenario_engine.py    # Scenario comparison
    │   ├── live_data.py          # disease.sh pipeline + staleness
    │   ├── fetch_owid.py         # OWID downloader + cache
    │   └── train_model.py        # ML training script (run once)
    ├── frontend/                 # index.html · main.js · style.css
    ├── models/                   # peak_predictor.pkl (bundle: model + MAE + version)
    └── data/                     # bundled fallback dataset + disease presets

Detailed module docs live in disease-simulator/README.md.


Modelling notes & limitations

  • Constant population — vital dynamics (non-disease births/deaths) and migration are not modelled in the base solvers.
  • Homogeneous mixing — the base SIR assumes uniform contact; the spatial and age models partially relax this.
  • Fitting target: /fit calibrates against a prevalent I(t) curve, not daily incidence, so real reported data (daily new cases) needs a conversion step first.
  • Data currency — the OWID dataset was frozen in August 2024 (ideal for backtesting historical windows); live figures come from disease.sh.
  • Backtest scope — rolling WIS windows test forecast skill on historical SIR-like dynamics and do not capture behavioural change, variants, or reporting artefacts.
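
The conversion step mentioned under the fitting target can be approximated in several ways. One simple option, sketched below, treats everyone reported in the last D days as currently infectious, where D is an assumed infectious period (roughly 1/γ). The function name is hypothetical, and the approach ignores under-reporting and reporting delays.

```python
import numpy as np

def incidence_to_prevalence(daily_new, infectious_days):
    """Approximate prevalent infections as a rolling sum of new cases
    over the last `infectious_days` days (including the current day)."""
    daily_new = np.asarray(daily_new, dtype=float)
    kernel = np.ones(int(infectious_days))
    # Full convolution, truncated so the output aligns with the input days.
    return np.convolve(daily_new, kernel)[:daily_new.size]

print(incidence_to_prevalence([1, 2, 3, 4], infectious_days=2))
# -> [1. 3. 5. 7.]
```

Dividing the result by the population gives a fraction comparable to the I(t) curve the fit expects.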

References

  • Bracher et al. (2021). Evaluating epidemic forecasts in an interval format. PLoS Comput Biol. 10.1371/journal.pcbi.1008618
  • Cori et al. (2013). A new framework to estimate time-varying reproduction numbers. Am J Epidemiol. 10.1093/aje/kwt133
  • Mossong et al. (2008). Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Med. 10.1371/journal.pmed.0050074
  • Verity et al. (2020). Estimates of the severity of coronavirus disease 2019. Lancet Infect Dis. 10.1016/S1473-3099(20)30243-7
  • Viboud et al. (2006). Synchrony, waves, and spatial hierarchies in the spread of influenza. Science. 10.1126/science.1125237

License

Released under the MIT License — see LICENSE.
