Important
This repo is no longer maintained and its contents have been superseded by functions in https://github.com/mkoslovsky/MicroBVS
Bayesian variable selection for Dirichlet-Multinomial regression, accompanying the paper:
Wadsworth, W. D., Argiento, R., Guindani, M., Galloway-Pena, J., Samuel, S. A., & Vannucci, M. (2016). An Integrative Bayesian Dirichlet-Multinomial Regression Model for the Analysis of Taxonomic Abundances in Microbiome data.
The code was updated on 10/02/17 with an option to use a stochastic search algorithm for Bayesian variable selection instead of the Gibbs sampler. The stochastic search variable selection approach provides gains in computational speed.
Updated on 07/28/22: The code is no longer actively maintained and has been superseded by functions in https://github.com/mkoslovsky/MicroBVS (see vignettes).
The repository contains
- C code implementing an MCMC sampler for Dirichlet-Multinomial Bayesian Variable Selection (dmbvs) using spike-and-slab priors,
- R code to wrap and run the sampler from within R,
- a function for simulating data,
- and a "start-to-finish" script (`example_analysis_script.R`) demonstrating usage of the code. This script gives reasonable default settings for the hyperparameters and MCMC parameters for the example simulated data. Settings may change for other data.
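For orientation, the likelihood at the core of a Dirichlet-multinomial regression can be sketched in a few lines of R. This is a minimal illustration and not the code in `dmbvs.c`: the function names `dm_loglik` and `dm_regression_loglik` are hypothetical, and the log-linear link between covariates and concentration parameters is an assumption that this README does not spell out.

```r
# Log-probability of one count vector y under a Dirichlet-multinomial
# distribution with concentration parameters gamma (all > 0).
dm_loglik <- function(y, gamma) {
  n <- sum(y)      # total count for this sample
  A <- sum(gamma)  # total concentration
  lgamma(n + 1) - sum(lgamma(y + 1)) +
    lgamma(A) - lgamma(n + A) +
    sum(lgamma(y + gamma) - lgamma(gamma))
}

# ASSUMPTION (hypothetical helper): a log-linear link,
# gamma_j = exp(alpha_j + sum_p x_p * beta_pj), where x is a covariate
# vector of length P, alpha has length J and beta is a P x J matrix.
dm_regression_loglik <- function(y, x, alpha, beta) {
  gamma <- exp(alpha + as.vector(x %*% beta))
  dm_loglik(y, gamma)
}
```

Under this sketch, setting an entry of `beta` to exactly zero removes the corresponding covariate from that taxon's concentration parameter, which is the kind of inclusion or exclusion that a spike-and-slab prior encodes.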
- The R code requires the `dirmult` and `MASS` packages. Please ensure those are installed first.
- The C code relies on the GNU Scientific Library (GSL). GSL must be installed and modifications may need to be made to the 'library' and 'include' paths in the Makefile.
- The main C file (`dmbvs.c`), as well as the Makefile to be used for compilation of the code, can be found under the directory `code/`.
- On Linux and Mac the C code may be compiled with the Makefile from R using:
# must be in package's root directory
setwd("dmbvs")
system("cd code; make")
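If the build fails, a few checks from R can help narrow down the cause. The sketch below only inspects the environment and installs nothing. It assumes the working directory is the repository root and that `gsl-config`, the helper script distributed with GSL, is on the `PATH` when GSL is installed; the variable names are illustrative.

```r
# Check that the R packages required by the wrapper are installed.
required_pkgs <- c("dirmult", "MASS")
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing R packages: ", paste(missing_pkgs, collapse = ", "))
}

# gsl-config reports the compiler and linker flags for the local GSL install;
# these can be compared with the 'include' and 'library' paths in the Makefile.
gsl_config <- Sys.which("gsl-config")
if (nzchar(gsl_config)) {
  system2(gsl_config, c("--cflags", "--libs"))
} else {
  message("gsl-config not found on PATH; GSL may be missing or installed elsewhere.")
}

# Check whether the compiled sampler exists where the wrapper call expects it.
exec_path <- file.path("code", "dmbvs.x")
if (!file.exists(exec_path)) {
  message("Executable not found at ", exec_path)
}
```

Missing packages can then be installed with `install.packages()`, and any difference between the reported GSL flags and the Makefile paths points to the lines that need editing.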
- If compilation was successful there will be an executable called `dmbvs.x` in the `code/` directory. Data may be simulated and the MCMC code run using:
source(file.path("code", "wrapper.R"))
source(file.path("code", "helper_functions.R"))
simdata = simulate_dirichlet_multinomial_regression(n_obs = 100, n_vars = 50,
                                                    n_taxa = 50, n_relevant_vars = 5,
                                                    n_relevant_taxa = 5)
results = dmbvs(XX = simdata$XX[,-1], YY = simdata$YY,
                intercept_variance = 10, slab_variance = 10,
                bb_alpha = 0.02, bb_beta = 1.98, GG = 1100L, thin = 10L, burn = 100L,
                exec = file.path(".", "code", "dmbvs.x"), output_location = ".")
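To get a feel for the `bb_alpha` and `bb_beta` values in the call above, the following arithmetic may help. It rests on an assumption not stated in this README: that the two arguments are the shape parameters of a Beta prior on the probability that any one covariate-taxon coefficient is included. The counts of 50 covariates and 50 taxa are taken from the simulation call.

```r
# ASSUMPTION: bb_alpha and bb_beta are the shape parameters of a
# Beta(bb_alpha, bb_beta) prior on the per-coefficient inclusion probability.
bb_alpha <- 0.02
bb_beta  <- 1.98

# Mean of a Beta(a, b) distribution is a / (a + b).
prior_inclusion <- bb_alpha / (bb_alpha + bb_beta)

# Number of candidate coefficients if there are 50 covariates and 50 taxa.
n_vars <- 50
n_taxa <- 50
expected_included <- prior_inclusion * n_vars * n_taxa

print(prior_inclusion)    # prior mean inclusion probability
print(expected_included)  # prior expected number of selected coefficients
```

Under that reading, the example settings put about 1% prior probability on each coefficient, or roughly 25 of the 2500 candidates, which expresses a preference for sparse models. Data with a different number of covariates or taxa may call for different values.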
- Tested on:
  - Mac with R version 3.2.2 and GSL version 2.1 with the Apple LLVM version 7.0.2 compiler
  - Red Hat Linux 6.6 with R version 3.1.2 and GSL version 2.1 and gcc version 5.2.0