Julia implementation of reversible jump MCMC survival models for simulations and real-world analyses (German Breast Cancer and Primary Biliary Cirrhosis).
- Three RJMCMC Methods:
  - NonLinear1: Default uniform order statistics prior
  - NonLinear2: Dirichlet-Gamma prior for knot locations
  - CoxPH: Baseline Cox proportional hazards model
- Parallel Execution: Thread-based parallelism with ordered execution (by g_type, then n_values)
- Resume Support: Automatic checkpointing and resume from existing results
- Progress Tracking: Real-time progress bar for long-running simulations
- Comprehensive Outputs: Posterior summaries, IBS metrics, and publication-ready plots
- config.jl: Central defaults for simulations (demo/full) and real-data runs
- model.jl: RJMCMC algorithms and utilities shared by all scripts
- simulation.jl: End-to-end simulation pipeline with caching, resume support, progress bar, and thread-based parallelism
- real_data_GBC.jl / real_data_PBC.jl: Analyses for the GBC and PBC datasets (CSV files in the project root)
- install_packages.jl: Script to install all required Julia packages
- table_gen.py: Generate LaTeX simulation tables from simu_summary.csv
- boxplot.R: R script for generating IBS boxplots (requires ggplot2)
- results/: Generated outputs (gitignored)
- Julia >= 1.9: Download from julialang.org
- R (optional, for boxplot generation): Download from r-project.org
Recommended method (easiest):
julia install_packages.jl
Alternative methods:
- Windows PowerShell:
  julia -e "using Pkg; Pkg.add([\"DataFrames\", \"CSV\", \"Distributions\", \"ProgressMeter\", \"StatsBase\", \"VectorizedStatistics\", \"MLDataUtils\", \"Plots\", \"StatsPlots\", \"CategoricalArrays\", \"SpecialFunctions\", \"LaTeXStrings\"])"
- Interactive Julia REPL:
  julia
  ] add DataFrames CSV Distributions ProgressMeter StatsBase VectorizedStatistics MLDataUtils Plots StatsPlots CategoricalArrays SpecialFunctions LaTeXStrings
  (Press Ctrl+C to exit package mode, then type exit() to quit)
Rscript -e "install.packages('ggplot2', repos='https://cran.rstudio.com/')"
Or in R console:
install.packages("ggplot2")

# Quick demo run (all scenarios: n=200,400,800, 5 replications each)
julia simulation.jl --demo
# Full run (all scenarios: n=200,400,800, 1000 replications each)
julia simulation.jl --full
# Use specific number of threads
JULIA_NUM_THREADS=32 julia simulation.jl --full
# Windows PowerShell
$env:JULIA_NUM_THREADS=32; julia simulation.jl --full
- --demo: Run demo mode (all scenarios with 5 replications each)
- --full: Run full mode (all scenarios with 1000 replications each)
- --reset: Clear existing checkpoints and rerun from scratch
- --replot: Regenerate plots from existing results
- --plot-only: Only generate plots without running simulations
- --workers=N: Specify number of worker threads (default: all available threads)
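The flags listed above could be read from ARGS along the following lines. This is only a sketch: the function name parse_cli, the option keys, and the choice to reject unknown flags are assumptions made for illustration, and simulation.jl may process its arguments differently.

```julia
# Hypothetical parser for the documented flags; not taken from simulation.jl.
function parse_cli(args::AbstractVector{<:AbstractString})
    opts = Dict{Symbol, Any}(
        :mode => nothing,                # assumption: no mode until --demo or --full is given
        :reset => false,
        :replot => false,
        :plot_only => false,
        :workers => Threads.nthreads(),  # documented default: all available threads
    )
    for arg in args
        if arg == "--demo"
            opts[:mode] = "demo"
        elseif arg == "--full"
            opts[:mode] = "full"
        elseif arg == "--reset"
            opts[:reset] = true
        elseif arg == "--replot"
            opts[:replot] = true
        elseif arg == "--plot-only"
            opts[:plot_only] = true
        elseif startswith(arg, "--workers=")
            # Everything after the first '=' is the worker count.
            opts[:workers] = parse(Int, split(arg, "="; limit = 2)[2])
        else
            throw(ArgumentError("unknown option: $arg"))
        end
    end
    return opts
end

opts = parse_cli(["--full", "--workers=8"])
```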
The simulation executes tasks in a specific order:
- By g_type: linear -> quad -> sin
- By n_values: Within each g_type, tasks are processed by sample size (200 -> 400 -> 800)
- By replication: Within each (g_type, n) combination, replications are executed in parallel but tasks are sorted by replication index (1, 2, ..., N)
This ensures that all linear scenarios complete before any quad scenario starts, and all quad scenarios complete before any sin scenario starts.
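The ordering described above can be sketched as follows. The names build_tasks and run_ordered are hypothetical and not taken from simulation.jl. The sketch assumes that scenarios run one after another while the replications inside a scenario are spread over threads.

```julia
# Hypothetical task list in the documented order: g_type, then n, then replication index.
function build_tasks(n_reps::Integer;
                     g_types = ["linear", "quad", "sin"],
                     n_values = [200, 400, 800])
    # The first `for` is the outermost loop, so g_type varies slowest.
    return [(g_type = g, n = n, rep = rep)
            for g in g_types for n in n_values for rep in 1:n_reps]
end

# Run f(g_type, n, rep) for every task. Scenarios run sequentially;
# replications inside one (g_type, n) scenario run on the available threads.
function run_ordered(f, n_reps::Integer;
                     g_types = ["linear", "quad", "sin"],
                     n_values = [200, 400, 800])
    results = Dict{Tuple{String, Int, Int}, Any}()
    lk = ReentrantLock()  # guards the shared Dict against concurrent writes
    for g in g_types, n in n_values
        Threads.@threads for rep in 1:n_reps
            value = f(g, n, rep)
            lock(lk) do
                results[(g, n, rep)] = value
            end
        end
    end
    return results
end

tasks = build_tasks(5)
demo_results = run_ordered(2) do g, n, rep
    (g, n, rep)  # placeholder for the real work of one replication
end
```

Because the outer loop does not advance until `Threads.@threads` returns, every replication of one scenario finishes before the next scenario starts.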
Demo Mode (--demo):
- Sample sizes: [200, 400, 800]
- Replications: 5 per scenario
- Total tasks: 3 g_types x 3 n_values x 5 replications = 45 tasks
Full Mode (--full):
- Sample sizes: [200, 400, 800]
- Replications: 1000 per scenario
- Total tasks: 3 g_types x 3 n_values x 1000 replications = 9000 tasks
- Checkpoints are saved under results/simulation/<mode>/g=<type>/n=<size>/rep=<id>/results_dict.jls
- Rerunning will automatically skip finished tasks and continue from existing checkpoints
- Use --reset to clear all checkpoints and start fresh
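A minimal sketch of checkpoint-and-resume with this directory layout is shown below. It assumes the .jls files are written with Julia's Serialization standard library, which the extension suggests but the README does not confirm. The helper names checkpoint_path and run_or_resume are hypothetical.

```julia
using Serialization  # standard library; assumed because of the .jls extension

# Hypothetical helper: builds the checkpoint path for one task under `root` (e.g. "results").
checkpoint_path(root, mode, g_type, n, rep) =
    joinpath(root, "simulation", mode, "g=$(g_type)", "n=$(n)", "rep=$(rep)", "results_dict.jls")

# Load the checkpoint if it exists; otherwise call `compute()` and save its result.
function run_or_resume(compute, path::AbstractString)
    if isfile(path)
        return deserialize(path)  # finished task: skip the computation
    end
    result = compute()
    mkpath(dirname(path))
    tmp = path * ".tmp"
    serialize(tmp, result)        # write to a temporary file first
    mv(tmp, path; force = true)   # then rename, so a partial file is never mistaken for a checkpoint
    return result
end
```

Usage might look like `run_or_resume(() -> run_replication(g, n, rep), checkpoint_path("results", "demo", g, n, rep))`, where run_replication stands in for whatever function performs one replication.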
Results are saved in results/simulation/<mode>/:
- simu_summary.csv: Summary statistics for all methods
- df_IBS.csv: Integrated Brier Score (IBS) metrics
- plots/: Individual plots for each (g_type, n) combination
- plots_manuscript/: Publication-ready figures (Figures 1-4 style)
# German Breast Cancer dataset
julia real_data_GBC.jl
# Primary Biliary Cirrhosis dataset
julia real_data_PBC.jl
Outputs are written to results/real_data/gbc/ and results/real_data/pbc/:
- beta_*.csv: Posterior summaries for NonLinear1, NonLinear2, and CoxPH (columns: Method, Beta, Pos_Mean, CrI_LB, CrI_UB)
- g_*.csv: Posterior mean of g(z) for NonLinear1 and NonLinear2 (columns: z, g_non1, g_non2)
- tau_*.csv, zeta_*.csv, HK_*.csv: Trace storage for tau, zeta, H, and K (NonLinear1 and NonLinear2)
- IBS_*.csv: Cross-validated IBS comparisons across methods
- g_compare_*.pdf: NonLinear1 vs NonLinear2 g(z) plot with labeled axes
- z_hist_*.pdf: Histogram of Z values used in the nonlinear term
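To illustrate what the Pos_Mean, CrI_LB, and CrI_UB columns represent, the sketch below summarizes a vector of posterior draws for one coefficient. The function name summarize_draws is hypothetical, and the 95% equal-tailed interval is an assumption: the credible level and interval type used by the scripts are not stated here.

```julia
using Statistics  # standard library: mean, quantile

# Hypothetical summary of posterior draws for a single coefficient.
# Assumption: equal-tailed credible interval at the given level (default 95%).
function summarize_draws(draws::AbstractVector{<:Real}; level::Real = 0.95)
    alpha = (1 - level) / 2
    return (Pos_Mean = mean(draws),
            CrI_LB = quantile(draws, alpha),
            CrI_UB = quantile(draws, 1 - alpha))
end

# Toy draws for illustration only; real draws would come from the MCMC trace.
s = summarize_draws(collect(1.0:100.0))
```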
See requirements.txt for a complete list of required packages:
- Julia packages: DataFrames, CSV, Distributions, ProgressMeter, StatsBase, VectorizedStatistics, MLDataUtils, Plots, StatsPlots, CategoricalArrays, SpecialFunctions, LaTeXStrings
- R packages: ggplot2 (for boxplot generation)
- Python packages: pandas (for table_gen.py)
- The simulation uses thread-based parallelism. Set the JULIA_NUM_THREADS environment variable to control the number of threads.
- On Windows PowerShell, use $env:JULIA_NUM_THREADS=N to set the thread count.
- The code automatically detects available threads and warns if the requested thread count exceeds the number available.
- For best performance, set JULIA_NUM_THREADS to match your CPU core count.
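The thread check mentioned above might look roughly like this. The helper name effective_workers is hypothetical. Falling back to the available count after the warning is an assumption: the notes only say that a warning is issued.

```julia
# Hypothetical helper: compare a requested worker count with the threads Julia was started with.
function effective_workers(requested::Integer; available::Integer = Threads.nthreads())
    requested >= 1 || throw(ArgumentError("worker count must be at least 1"))
    if requested > available
        @warn "Requested $requested threads, but only $available are available"
        return available  # assumption: fall back to what is available
    end
    return requested
end

# Threads.nthreads() is fixed at startup by JULIA_NUM_THREADS (or the --threads flag).
# Sys.CPU_THREADS reports the number of logical CPU threads on the machine.
println("Julia threads: ", Threads.nthreads(), ", logical CPU threads: ", Sys.CPU_THREADS)
```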
If you have questions or run into issues, please contact zhanght@gdou.edu.cn.