scGate is a computational framework that automates marker-based purification of specific cell populations from heterogeneous single-cell RNA-seq datasets. Unlike reference-based annotation methods, scGate requires no training data or reference gene expression profiles, so it can be applied across diverse experimental contexts.
📚 Documentation: https://zaoqu-liu.github.io/scGate/
scGate implements a three-stage computational pipeline:
- Signature Scoring via UCell: Robust rank-based quantification of gene signature enrichment
- k-Nearest Neighbor Smoothing: Noise reduction through local averaging in reduced dimensional space
- Hierarchical Decision Trees: Multi-level filtering analogous to flow cytometry gating strategies
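The second and third stages can be pictured with a small base-R sketch. It is only an illustration under stated assumptions, not scGate's implementation: the function names `knn_smooth` and `gate`, the weight form `(1 - decay)^(i - 1)`, the `decay` value, and the 0.2 threshold are all chosen here for the example.

```r
# Smooth per-cell scores over the k nearest neighbours in a low-dimensional embedding.
# emb   : numeric matrix, one row per cell (e.g. PCA coordinates)
# score : numeric vector of per-cell signature scores
# k     : number of neighbours (the cell itself counts as the nearest one)
# decay : hypothetical decay rate; the i-th nearest neighbour gets weight (1 - decay)^(i - 1)
knn_smooth <- function(emb, score, k = 5, decay = 0.1) {
  k <- min(k, nrow(emb))
  d <- as.matrix(dist(emb))            # pairwise Euclidean distances
  w <- (1 - decay)^(seq_len(k) - 1)    # exponentially decaying weights by neighbour order
  vapply(seq_len(nrow(emb)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]    # indices of the k closest cells (self first)
    sum(w * score[nn]) / sum(w)        # weighted average of neighbour scores
  }, numeric(1))
}

# Two sequential gates, loosely analogous to flow cytometry gating:
# gate 1 keeps cells whose positive-signature score exceeds thr,
# gate 2 additionally requires the negative-signature score to stay below thr.
# The 0.2 threshold is an assumed value for this example.
gate <- function(pos_score, neg_score, thr = 0.2) {
  level1 <- pos_score > thr
  level2 <- level1 & neg_score < thr
  ifelse(level2, "Pure", "Impure")
}

# Toy data: two well-separated clusters of two cells each
emb   <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
score <- c(1, 0, 0, 0)                 # only the first cell scores high before smoothing

smoothed <- knn_smooth(emb, score, k = 2, decay = 0.5)
labels   <- gate(smoothed, neg_score = rep(0, 4))
print(data.frame(raw = score, smoothed = smoothed, label = labels))
```

On this toy data the high score of the first cell is shared with its close neighbour, so both pass the gate, while the distant cluster stays at zero and is labelled "Impure".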
For a comprehensive description of the underlying mathematics, please refer to our Algorithm Documentation, which covers:
- UCell score computation: $U_c(S) = 1 - \frac{\sum_{g \in S} \min(R_c(g), R_{max})}{|S| \cdot R_{max}}$
- kNN-weighted smoothing with exponential decay
- Matthews Correlation Coefficient for performance evaluation
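As a rough illustration, the score formula and the Matthews Correlation Coefficient mentioned above can be written in a few lines of base R. This sketch follows the formula exactly as written here and is not the UCell or scGate implementation. It reads $R_c(g)$ as the rank of gene $g$ in cell $c$ (rank 1 = most highly expressed) and $R_{max}$ as a rank cap. The function names `ucell_score` and `mcc`, the default `r_max = 1500`, and the handling of genes missing from the data are assumptions made for the example.

```r
# Score one cell for a signature, following the formula as written above.
# expr  : named numeric vector of expression values for a single cell
# sig   : character vector of signature gene names
# r_max : rank cap (the default here is an assumed value)
ucell_score <- function(expr, sig, r_max = 1500) {
  # Rank genes so that the most highly expressed gene gets rank 1
  ranks <- rank(-expr, ties.method = "average")
  # Assumption: signature genes absent from the data are assigned rank r_max
  r <- ifelse(sig %in% names(ranks), ranks[sig], r_max)
  r <- pmin(r, r_max)                  # cap ranks at r_max
  1 - sum(r) / (length(sig) * r_max)
}

# Matthews Correlation Coefficient for two logical vectors
# (predicted = cells passing the gate, actual = ground-truth membership).
mcc <- function(predicted, actual) {
  tp <- as.numeric(sum(predicted & actual))
  tn <- as.numeric(sum(!predicted & !actual))
  fp <- as.numeric(sum(predicted & !actual))
  fn <- as.numeric(sum(!predicted & actual))
  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  if (denom == 0) return(0)            # common convention when a margin is empty
  (tp * tn - fp * fn) / denom
}

# Toy cell with four genes; the signature contains the two top-ranked genes
cell <- c(A = 10, B = 5, C = 0, D = 1)
u <- ucell_score(cell, sig = c("A", "B"), r_max = 4)
print(u)  # 0.625

# Toy evaluation: one false positive among four cells
m <- mcc(predicted = c(TRUE, TRUE, FALSE, FALSE),
         actual    = c(TRUE, FALSE, FALSE, FALSE))
print(m)
```

Higher scores mean the signature genes sit nearer the top of the cell's expression ranking. An MCC of 1 indicates perfect agreement between gated and true labels, and 0 indicates no better than chance.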
# Option 1: install from CRAN
install.packages("scGate")
# Option 2: install from R-universe
install.packages("scGate", repos = c("https://zaoqu-liu.r-universe.dev", "https://cloud.r-project.org"))
# Option 3: install from GitHub
# install.packages("remotes")
remotes::install_github("Zaoqu-Liu/scGate")
library(scGate)
library(Seurat)
# Load example dataset
data(query.seurat)
# Define a gating model for B cells
bcell_model <- gating_model(name = "Bcell", signature = c("MS4A1", "CD19"))
# Apply scGate
query.seurat <- scGate(
data = query.seurat,
model = bcell_model,
reduction = "pca"
)
# Examine results
table(query.seurat$is.pure)
DimPlot(query.seurat, group.by = "is.pure")

| Resource | Description |
|---|---|
| Quick Start Guide | Installation and basic usage |
| Algorithm & Methods | Mathematical framework and computational details |
| Visualization Guide | Plotting and figure generation |
| Advanced Usage | Hierarchical models, multi-class annotation, optimization |
A curated database of validated gating models is available:
# Download model database
models_db <- get_scGateDB()
# Available human models
names(models_db$human$generic)
#> [1] "Bcell" "CD4T" "CD8T" "Myeloid" "NK" "Tcell" ...
# Apply a pre-defined model
seurat_obj <- scGate(seurat_obj, model = models_db$human$generic$Tcell)

scGate supports simultaneous annotation of multiple cell populations:
# Define model list
models <- list(
"Bcell" = models_db$human$generic$Bcell,
"Tcell" = models_db$human$generic$Tcell,
"Myeloid" = models_db$human$generic$Myeloid
)
# Multi-class annotation
seurat_obj <- scGate(seurat_obj, model = models)
# Results in scGate_multi column
table(seurat_obj$scGate_multi)

If you use scGate in your research, please cite:
Andreatta M, Berenstein AJ, Carmona SJ. scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets. Bioinformatics. 2022;38(9):2642-2644. doi:10.1093/bioinformatics/btac141
@article{andreatta2022scgate,
title={scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets},
author={Andreatta, Massimo and Berenstein, Ariel J and Carmona, Santiago J},
journal={Bioinformatics},
volume={38},
number={9},
pages={2642--2644},
year={2022},
publisher={Oxford University Press},
doi={10.1093/bioinformatics/btac141}
}

This project is licensed under GPL-3.0.
Maintained by Zaoqu Liu | Original authors: Massimo Andreatta, Ariel Berenstein, Santiago Carmona

