MCANOVA Package

The MC-ANOVA R package provides:

MC_ANOVA: A function to estimate the Relative Accuracy (RA) of cross-ancestry prediction for short chromosome segments.
Maps of the Relative Accuracy of European-derived local genomic scores for African, Caribbean, East Asian, and South Asian ancestry groups.
A Shiny App providing a graphical interface to the Relative Accuracy maps. (Note: a slower but URL-accessible version of the app can be found here.)
Tools that, together with MC_ANOVA(), can be used to develop Relative Accuracy maps.
Link to manuscript

Installation

To install the development version from GitHub, first install remotes:

 install.packages("remotes")
 library(remotes)

Then install the package:

 # If an older version of MCANOVA is already loaded, detach it first:
 # detach("package:MCANOVA", unload = TRUE)
 install_github("lupiA/MCANOVA")
 library(MCANOVA)

Installation, including all non-base dependencies, should take about one minute on a typical computer.

This package should be compatible with Windows, Mac, and Linux operating systems and has been tested on Windows 7 & 10, macOS Monterey, and Linux CentOS 7.

Examples

Loading Relative Accuracy maps in an R session.
Shiny App: Launches a Shiny App for the RA Maps we developed using UK Biobank data.
Segments: Finds disjoint chromosome segments.
MC-ANOVA: Estimates within- and cross-ancestry R-squared.

Loading the Relative Accuracy maps into an R session

 library(MCANOVA)

# Relative Accuracy Map (UK Biobank arrays)
 data(MAP_UKB)
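
For a first look at the map after loading it, something along these lines may help. This is only a sketch: it assumes MCANOVA is installed and that MAP_UKB is a rectangular object such as a data frame. The name map_available is illustrative, and no column names are assumed.

```r
# Check whether MCANOVA is installed before trying to load its data
map_available <- requireNamespace("MCANOVA", quietly = TRUE)

if (map_available) {
  # Load the map without attaching the package
  data("MAP_UKB", package = "MCANOVA")
  print(dim(MAP_UKB))   # number of rows and columns
  str(MAP_UKB)          # column names and types
  print(head(MAP_UKB))  # first few rows
}
```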


Launching the Shiny App

 # library(ggplot2)
 # library(shiny)

 library(MCANOVA)
 PGS_portability_app()


Creating chromosome segments with a minimum base pair length and a minimum size (# of SNPs).

 library(MCANOVA)

# Genotype map
 data(geno_map_example)

# Initialize MAP and define segments
 minSNPs <- 10
 minBP <- 10e3
 MAP_example <- geno_map_example
 MAP_example$segments <- getSegments(MAP_example$base_pair_position, chr = MAP_example$chromosome, minBPSize = minBP, minSize = minSNPs, verbose = TRUE)
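
To build intuition for what "disjoint segments with a minimum base pair length and a minimum number of SNPs" can mean, here is a minimal base-R sketch of one greedy way to form such segments. It is not the algorithm used by getSegments(); the function greedy_segments() and its arguments are hypothetical and exist only for illustration. It assumes SNPs are sorted by chromosome and position.

```r
# Hypothetical helper (not part of MCANOVA): greedy segmentation of sorted SNPs.
# A segment is closed once it spans at least min_bp base pairs and holds at
# least min_snps SNPs; segments never cross chromosome boundaries.
greedy_segments <- function(bp, chr, min_bp, min_snps) {
  stopifnot(length(bp) == length(chr))
  seg <- integer(length(bp))
  current <- 1L  # id of the segment being filled
  start <- 1L    # index of the first SNP in the current segment
  for (j in seq_along(bp)) {
    # A new chromosome always starts a new segment
    if (j > 1L && chr[j] != chr[j - 1L]) {
      current <- current + 1L
      start <- j
    }
    seg[j] <- current
    n_snps <- j - start + 1L
    span <- bp[j] - bp[start]
    # Close the segment if both minimums are met and the next SNP
    # is on the same chromosome
    if (n_snps >= min_snps && span >= min_bp &&
        j < length(bp) && chr[j + 1L] == chr[j]) {
      current <- current + 1L
      start <- j + 1L
    }
  }
  seg
}

# Toy map: six SNPs on chromosome 1 and two on chromosome 2
bp  <- c(100, 200, 300, 400, 500, 600, 100, 250)
chr <- c(1, 1, 1, 1, 1, 1, 2, 2)
toy_segments <- greedy_segments(bp, chr, min_bp = 150, min_snps = 2)
print(toy_segments)
```

In this sketch the last segment on a chromosome can fall short of the minimums; a production routine would need a rule for that case, such as merging it into the previous segment.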


Running MC-ANOVA

This example requires the R package BGData, which is installed along with the MCANOVA package:

# Load necessary packages
# install.packages("BGData")  # only needed if BGData is not already installed
 library(MCANOVA)
 library(BGData)

# Set seed
 set.seed(12345)

# Generate genotypes (100 subjects and 500 SNPs)
 n <- 100
 p <- 500
 X <- matrix(sample(0:2, n * p, replace = TRUE), ncol = p)
 data(geno_map_example)
 colnames(X) <- geno_map_example$SNPs
 minSNPs <- 10
 minBP <- 10e3
 MAP_example <- geno_map_example
 MAP_example$segments <- getSegments(MAP_example$base_pair_position, chr = MAP_example$chromosome, minBPSize = minBP, minSize = minSNPs, verbose = TRUE)

# Assign ancestry IDs (80% to ancestry 1, 20% to ancestry 2)
 n_1 <- round(0.8 * n)
 n_2 <- round(0.2 * n)
 ancestry <- rep(c("Group_1", "Group_2"), times = c(n_1, n_2))
 rownames(X) <- ancestry

# Initialize portability estimates
 MAP_example$correlation_within <- NA
 MAP_example$correlation_across <- NA

# Set parameters for MC-ANOVA

# a small constant added to the diagonals of X'X to avoid numerical errors when some SNPs are in perfect LD
 lambda <- 1e-8
# number of Monte Carlo simulations
 nRep <- 300
# number of causal variants
 nQTL <- 3

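# Number of flanking SNPs added on each side of a core segment
 flank_size <- 10

# Loop over segments and run MC-ANOVA
 for (i in min(MAP_example$segments):max(MAP_example$segments)) {
   # Row indices of the SNPs in segment i
   core <- which(MAP_example$segments == i)
   # Extend the core by flank_size SNPs on each side, clipped to the map bounds
   chunk_start <- max(min(core) - flank_size, 1)
   chunk_end <- min(max(core) + flank_size, nrow(MAP_example))
   chunk <- chunk_start:chunk_end
   # TRUE for the positions of the chunk that belong to the core
   isCore <- chunk %in% core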
  
   X_1 <- X[rownames(X) == "Group_1", chunk]
   X_2 <- X[rownames(X) == "Group_2", chunk]
  
   # Run MC-ANOVA
   out <- MC_ANOVA(X = X_1, X2 = X_2, core = which(isCore), lambda = lambda, nQTL = nQTL, nRep = nRep)
  
   # Extract portability estimates
   MAP_example$correlation_within[chunk[isCore]] <- out[1, 1]
   MAP_example$correlation_across[chunk[isCore]] <- out[2, 1]
 }

# Relative Accuracy: ratio of squared across- to within-ancestry correlations
 RA <- MAP_example$correlation_across^2 / MAP_example$correlation_within^2
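
The indexing in the loop above is easy to misread: chunk holds row positions in the full map, while the core argument passed to MC_ANOVA() holds positions within the chunk. The sketch below reproduces only that index arithmetic on a toy map of 25 SNPs. The helper chunk_for_core() is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of MCANOVA): pad a core segment with flanking
# SNPs and report where the core sits inside the padded chunk
chunk_for_core <- function(core, flank_size, n_snps) {
  chunk_start <- max(min(core) - flank_size, 1)      # clip at the first SNP
  chunk_end <- min(max(core) + flank_size, n_snps)   # clip at the last SNP
  chunk <- chunk_start:chunk_end
  list(chunk = chunk, core_in_chunk = which(chunk %in% core))
}

# Core at the start of the map: no left flank is available
first <- chunk_for_core(core = 1:5, flank_size = 10, n_snps = 25)
# Core at the end of the map: no right flank is available
last <- chunk_for_core(core = 21:25, flank_size = 10, n_snps = 25)

print(range(first$chunk))    # map positions covered by the first chunk
print(first$core_in_chunk)   # core positions relative to that chunk
print(range(last$chunk))
print(last$core_in_chunk)
```

For the last segment the core occupies map rows 21 to 25 but chunk columns 11 to 15, which is why the loop passes which(isCore) rather than core itself.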


Expected outputs

MAP_example: a 500 x 6 data frame. Columns 1-3 contain variant information (chromosome, RS ID, and base pair position), column 4 contains the numeric segment assigned to each SNP by getSegments(), column 5 contains the within-ancestry group MC_ANOVA() correlation estimates, and column 6 contains the across-ancestry group MC_ANOVA() correlation estimates.
RA: a length 500 vector of Relative Accuracy estimates.
Interactive Shiny app interface.
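
As a small worked example of how the RA vector relates to the two correlation columns, the sketch below applies the same ratio of squared correlations to made-up numbers and averages it per segment. The data frame toy and its values are invented for illustration; only the column names mirror the example above.

```r
# Made-up correlations for four SNPs in two segments (illustration only)
toy <- data.frame(
  segments = c(1, 1, 2, 2),
  correlation_within = c(0.9, 0.9, 0.8, 0.8),
  correlation_across = c(0.45, 0.45, 0.8, 0.8)
)

# Relative Accuracy: squared across-ancestry correlation divided by
# squared within-ancestry correlation
toy$RA <- toy$correlation_across^2 / toy$correlation_within^2

# The ratio is undefined when the within-ancestry correlation is 0 or missing
toy$RA[!is.finite(toy$RA)] <- NA

# One summary value per segment
ra_by_segment <- tapply(toy$RA, toy$segments, mean, na.rm = TRUE)
print(ra_by_segment)
```

Here segment 1 has an RA of 0.25, meaning the cross-ancestry R-squared is one quarter of the within-ancestry R-squared, while segment 2 has an RA of 1.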

All demos should take only a few seconds to run.

System Requirements

Depends: R (>= 3.5.0)
Imports: shiny, ggplot2, BGData
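
Before installing, it can be useful to check these requirements from an R session. The following is a minimal sketch using base R only; missing_packages() is a hypothetical helper, not something the package provides.

```r
# Packages listed under Imports above
required_pkgs <- c("shiny", "ggplot2", "BGData")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
# Note that requireNamespace() loads the namespace of each package it finds.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# TRUE if the running R version satisfies the Depends field
r_ok <- getRversion() >= "3.5.0"
missing_pkgs <- missing_packages(required_pkgs)

print(r_ok)
print(missing_pkgs)  # empty if all imports are already installed
```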

References

Wang et al. (Nat. Commun., 2020). Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations.
Lupi, A.S., Vazquez, A.I. & de los Campos, G. Mapping the relative accuracy of cross-ancestry prediction. Nat Commun 15, 10480 (2024). https://doi.org/10.1038/s41467-024-54727-8
