R package expanding integrative analysis capabilities of Seurat by providing seamless access to popular integration methods and to an integration benchmarking toolkit.

License: MIT (LICENSE.md); a second file, LICENSE, has an unrecognized license type.

cbib/Seurat-Integrate

SeuratIntegrate

R package expanding integrative analysis capabilities of Seurat by providing seamless access to popular integration methods. It also implements an integration benchmarking toolkit that gathers well-established performance metrics to help select the most appropriate integration.

Examples, documentation, memos, etc. are available on SeuratIntegrate's website.

SeuratIntegrate supports both R- and Python-based integration methods. The table below lists the methods compatible with SeuratIntegrate:

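Table 1: Supported integration methods

| Language | Package         | Method              | Function               |
|----------|-----------------|---------------------|------------------------|
| R        | SeuratIntegrate | ComBat              | CombatIntegration()    |
| R        | SeuratIntegrate | Harmony             | HarmonyIntegration()   |
| R        | SeuratIntegrate | MNN                 | MNNIntegration()       |
| R        | Seurat          | CCA                 | CCAIntegration()       |
| R        | Seurat          | RPCA                | RPCAIntegration()      |
| R        | SeuratWrappers  | FastMNN (batchelor) | FastMNNIntegration()   |
| Python   | SeuratIntegrate | BBKNN               | bbknnIntegration()     |
| Python   | SeuratIntegrate | scVI                | scVIIntegration()      |
| Python   | SeuratIntegrate | scANVI              | scANVIIntegration()    |
| Python   | SeuratIntegrate | Scanorama           | ScanoramaIntegration() |
| Python   | SeuratIntegrate | trVAE               | trVAEIntegration()     |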

Installation

Install SeuratIntegrate directly from GitHub:

if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
if (!require("remotes", quietly = TRUE))
  install.packages("remotes")
remotes::install_github("cbib/Seurat-Integrate", dependencies = NA, repos = BiocManager::repositories()) 
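
As a quick sanity check after installation, something like the following could confirm that the packages can be loaded. This is only a sketch using base R's requireNamespace(); the variable names pkgs and installed are illustrative, and Seurat is assumed to have been pulled in as a dependency.

```r
# Packages this README relies on ("Seurat" is assumed to be installed as a dependency)
pkgs <- c("Seurat", "SeuratIntegrate")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that is still missing
if (!all(installed)) {
  message("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
}
```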

Preparations

Set up Python environments

To use Python methods, run the following commands (once) to set up the necessary conda environments:

library(SeuratIntegrate)

# Create environments
UpdateEnvCache("bbknn")
UpdateEnvCache("scvi")     # also scANVI
UpdateEnvCache("scanorama")
UpdateEnvCache("trvae")

# Show cached environments
getCache()

Environments are persistently stored in the cache and the UpdateEnvCache() commands should not need to be executed again.

While these environments should work well in most cases, conda's dependencies occasionally encounter conflicts. Manual adjustment might be needed. You may find helpful information in this vignette.

Set up a SeuratObject

To integrate data with SeuratIntegrate, you need to preprocess your SeuratObject until you obtain at least a PCA. Importantly, the SeuratObject must have its layers split by batches.

Not familiar with Seurat?

Have a look at Seurat's website, especially the tutorials covering SCTransform and integrative analyses.


To fully benefit from the benchmarking toolkit, you'll need cell-type annotations of sufficient quality to be considered suitable as ground truth.
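
The preprocessing requirement described in this section (a PCA computed on an object whose layers are split by batch) might be sketched as follows. This is a minimal sketch, not the package's prescribed workflow: it assumes Seurat v5, an assay named "RNA", and a metadata column holding the batch of each cell. The function name prepare_for_integration and its arguments are hypothetical.

```r
# Sketch only: assumes Seurat v5 and a metadata column naming each cell's batch.
# `prepare_for_integration`, `batch.var`, `assay` and `npcs` are hypothetical names.
prepare_for_integration <- function(seu, batch.var = "batch", assay = "RNA", npcs = 30) {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # Split the assay layers so that each batch gets its own layer
  batches <- seu[[batch.var, drop = TRUE]]
  seu[[assay]] <- split(seu[[assay]], f = batches)

  # Standard Seurat preprocessing up to the PCA
  seu <- Seurat::NormalizeData(seu, verbose = FALSE)
  seu <- Seurat::FindVariableFeatures(seu, verbose = FALSE)
  seu <- Seurat::ScaleData(seu, verbose = FALSE)
  Seurat::RunPCA(seu, npcs = npcs, verbose = FALSE)
}
```

With an existing object, this would be called as seu <- prepare_for_integration(seu, batch.var = "batch"), replacing "batch" with the name of your own metadata column.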

Optional dependencies

The benchmarking toolkit can benefit from additional dependencies:

# required to test for k-nearest neighbour batch effects
remotes::install_github('theislab/kBET')

# fast distance computation
install.packages('distances')

# faster Local Inverse Simpson’s Index computation
remotes::install_github('immunogenomics/lisi')

SeuratIntegrate usage

Integrate datasets

When your SeuratObject is ready, you can launch multiple integrations (from Table 1) with a single command. DoIntegrate() provides a flexible interface to customise integration-specific parameters and to control the associated data and features.

seu <- DoIntegrate(seu,
  # ... integrations
  CombatIntegration(layers = "data"),                            # corrects normalised counts
  HarmonyIntegration(orig = "pca", dims = 1:30),
  ScanoramaIntegration(ncores = 4L, layers = "data"),
  scVIIntegration(layers = "counts", features = Features(seu)),  # trains on raw counts, all features
  # ...
  use.hvg = TRUE,                            # use `VariableFeatures()` as input
  use.future = c(FALSE, FALSE, TRUE, TRUE)   # one value per integration, in order
)

In this example, all integration methods will use the variable features as input, with the exception of scVIIntegration() which is set to use all features (features = Features(seu)). CombatIntegration() will correct the normalised counts (layers = "data"), while scVIIntegration() will train on raw counts (layers = "counts").

use.future must be TRUE for Python methods, and FALSE for R methods (see Table 1).
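
Rather than writing the use.future vector by hand, it could be derived from the function names, using the Python rows of Table 1. The snippet below is a base-R sketch of that idea; python_methods, methods and use_future are illustrative names, not part of the package.

```r
# Python-based integration functions, as listed in Table 1
python_methods <- c("bbknnIntegration", "scVIIntegration", "scANVIIntegration",
                    "ScanoramaIntegration", "trVAEIntegration")

# Integrations to run, in the order they are passed to DoIntegrate()
methods <- c("CombatIntegration", "HarmonyIntegration",
             "ScanoramaIntegration", "scVIIntegration")

# TRUE for Python methods, FALSE for R methods
use_future <- methods %in% python_methods
print(use_future)
```

For the four integrations of the example above, this yields FALSE, FALSE, TRUE, TRUE, which matches the use.future argument passed to DoIntegrate().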

Post-process integration outputs

Integration methods produce one or several outputs. Because they can be of different types, the following table indicates the post-processing steps to generate a UMAP.

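Table 2: Output types and processing

| Output type           | Object name | Processing                                  |
|-----------------------|-------------|---------------------------------------------|
| Corrected counts      | Assay       | ScaleData(), then RunPCA(), then RunUMAP()  |
| Dimensional reduction | DimReduc    | RunUMAP()                                   |
| KNN graph             | Graph       | RunUMAP(umap.method = "umap-learn")         |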

Output types are summarized for each method in the Memo vignette about integration methods.
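
The processing steps of Table 2 could be wrapped in a small helper that dispatches on the output type. The sketch below relies only on standard Seurat functions (ScaleData(), RunPCA(), RunUMAP()); the helper umap_from_output, its arguments and the names it gives to new reductions are hypothetical, and the names under which each integration stores its outputs are not assumed here and must be supplied by you.

```r
# Sketch only: `umap_from_output` and its naming scheme are hypothetical.
# `name` is the name of the integration output stored in the Seurat object.
umap_from_output <- function(seu, type = c("assay", "reduction", "graph"),
                             name, dims = 1:30) {
  type <- match.arg(type)  # fails early on an unknown output type
  umap_name <- paste0("umap.", name)

  switch(type,
    assay = {
      # Corrected counts: ScaleData(), then RunPCA(), then RunUMAP()
      feats <- rownames(seu[[name]])
      pca_name <- paste0("pca.", name)
      seu <- Seurat::ScaleData(seu, assay = name, features = feats, verbose = FALSE)
      seu <- Seurat::RunPCA(seu, assay = name, features = feats, npcs = max(dims),
                            reduction.name = pca_name, verbose = FALSE)
      Seurat::RunUMAP(seu, reduction = pca_name, dims = dims,
                      reduction.name = umap_name)
    },
    # Dimensional reduction: RunUMAP() directly on the corrected embedding
    reduction = Seurat::RunUMAP(seu, reduction = name, dims = dims,
                                reduction.name = umap_name),
    # KNN graph: RunUMAP() with umap-learn (requires the Python package)
    graph = Seurat::RunUMAP(seu, graph = name, umap.method = "umap-learn",
                            reduction.name = umap_name)
  )
}
```

The dims default assumes the embedding has at least 30 dimensions; adjust it to the output at hand.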

Compare integrations

SeuratIntegrate incorporates 11 scoring metrics: 6 quantify the degree of batch mixing (batch correction), while 5 assess the preservation of biological differences (bio-conservation) based on ground truth cell type labels.

To score your integrations, you must process their outputs as in the Processing column of Table 2. You'll also need to get a graph by running FindNeighbors(return.neighbor = TRUE) (this vignette provides further guidance).
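
The FindNeighbors(return.neighbor = TRUE) step mentioned above might look like the following for an integration that produced a dimensional reduction. This is a sketch built on Seurat's FindNeighbors(); the wrapper add_knn_for_scoring and the "knn." naming scheme are hypothetical, and the linked vignette remains the reference for what the scoring functions expect.

```r
# Sketch only: `add_knn_for_scoring` and the "knn.<reduction>" name are hypothetical.
add_knn_for_scoring <- function(seu, reduction, dims = 1:30, k = 20L) {
  Seurat::FindNeighbors(
    seu,
    reduction = reduction,     # e.g. "pca" for the unintegrated data
    dims = dims,               # must not exceed the dimensions of the reduction
    k.param = k,
    return.neighbor = TRUE,    # store a neighbor object rather than an SNN graph
    graph.name = paste0("knn.", reduction)
  )
}
```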

Scores can then be obtained with the Score[score_name]() functions, or saved directly in the Seurat object with the AddScore[score_name]() functions, as follows:

# save the score in a variable
rpca_score <- ScoreRegressPC(seu, reduction = "[dimension_reduction]")  # e.g. "pca"

# or save the score in the Seurat object
seu <- AddScoreRegressPC(seu, integration = "[name_of_integration]", reduction = "[dimension_reduction]")

It is worth noting that the unintegrated version must also be scored to perform a complete comparative analysis. When scores have been computed, they can be used to compare the integration outputs. See this vignette for a complete overview of available scores.
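
Since every integration, including the unintegrated data, has to be scored, the AddScoreRegressPC() call shown above can be repeated in a loop. The sketch below uses only the two arguments that appear in this README (integration and reduction); the helper score_all and the example names are hypothetical.

```r
# Sketch only: `score_all` is a hypothetical helper.
# `reductions` is a named character vector: names are integration names,
# values are the reductions to score for each of them.
score_all <- function(seu, reductions) {
  for (integration in names(reductions)) {
    seu <- SeuratIntegrate::AddScoreRegressPC(
      seu,
      integration = integration,
      reduction = reductions[[integration]]
    )
  }
  seu
}
```

A call could look like seu <- score_all(seu, c(unintegrated = "pca", harmony = "harmony")), where both the integration names and the reduction names are placeholders to replace with those present in your object.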

The advantage of the AddScore functions over the Score functions is that, because the scores are stored in the Seurat object, they can be scaled and plotted directly:

# scale
seu <- ScaleScores(seu)

# plot
PlotScores(seu)

Getting help and advice

Examples, documentation, memos, etc. are available on SeuratIntegrate's website.

If you encounter a bug, or have a specific comment or question not covered on the website, please create an issue on GitHub.

Citing

If you find SeuratIntegrate useful, please consider citing:

Specque, F., Barré, A., Nikolski, M., & Chalopin, D. (2025). SeuratIntegrate: an R package to facilitate the use of integration methods with Seurat. Bioinformatics. doi: 10.1093/bioinformatics/btaf358
