scMetric is an R package that applies a metric learning algorithm to scRNA-seq data. Users provide a few weakly annotated samples to indicate the angle from which they want to analyze the data; the package learns a metric from these examples and applies it to downstream clustering and visualization. The package also outputs the genes that are weighted as most important in the learned metric.
For more information, please refer to the manuscript by Wenchang Chen and Xuegong Zhang.
The package is developed . Users should . To install the development version from GitHub:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("chenwenchang/scMetric", build_vignettes = TRUE)
To load the installed scMetric in R:
library(scMetric)
scMetric takes 7 inputs:
- X: a scRNA-seq gene expression matrix, cells for rows and genes for columns
- label: a vector specifying which group cells belong to, corresponding to rows in X
- constraints: weak supervision information, a few pairs of cells along with whether they are similar or not
- num_constraints: total number of similar and dissimilar pairs that are used
- thresh: threshold that decides when metric learning iteration stops. Default: 0.01
- max_iters: max iterations of metric learning. Default: 100000
- draw_tSNE: whether to draw tSNE plot or not
If users provide constraints themselves, the input label is used for visualization only. If users want scMetric to select constraints automatically, then label is used for selecting similar and dissimilar pairs. Cells that have the same label are similar. Otherwise, they are dissimilar.
The default value of num_constraints is 100. Users should set it to a number suited to their particular use.
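To make the label-based rule above concrete, here is a minimal base-R sketch of how similar and dissimilar pairs could be drawn from a label vector. It is illustrative only: the toy labels, the `pairs` data frame, and its column names are hypothetical, and this is neither scMetric's internal selection procedure nor necessarily the format that the constraints argument expects.

```r
# Illustrative sketch only: NOT scMetric's internal selection procedure.
set.seed(1)

# Hypothetical labels for 12 cells in three groups
label <- rep(c("A", "B", "C"), each = 4)
num_constraints <- 10

# Enumerate all unordered cell pairs (one pair per column), then sample some
all_pairs <- combn(length(label), 2)
picked <- all_pairs[, sample(ncol(all_pairs), num_constraints), drop = FALSE]

# A pair is similar when both cells carry the same label, dissimilar otherwise
pairs <- data.frame(
  cell_i  = picked[1, ],
  cell_j  = picked[2, ],
  similar = label[picked[1, ]] == label[picked[2, ]]
)
print(pairs)
```

Enumerating every pair with `combn` is only practical for small examples; for a large data set one would more likely sample pairs of cell indices directly.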
Users can load the test data in scMetric by
library(scMetric)
data(testData)
The toy data counts in testData is a scRNA-seq read count matrix with 1000 cells (rows) and 1000 genes (columns). The objects label1 and label2 are two vectors specifying two different groupings of the cells.
Here is an example to run scMetric with read counts matrix input:
# Load library and the test data for scMetric
library(scMetric)
data(testData)
# Learning metric using label1 as similarity
res <- scMetric(counts, label = label1, num_constraints = 50, thresh = 0.1, draw_tSNE = TRUE)
scMetric outputs 4 objects:
- newData: new data based on the new metric, which can be used for downstream analysis
- newMetric: the learned metric, a d by d matrix where d is the number of genes
- constraints: the constraints that scMetric uses
- sortGenes: genes sorted by importance score
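A d by d learned metric is commonly interpreted as a Mahalanobis-type matrix A, giving the distance sqrt((x - y)' A (x - y)) between two cells x and y. The sketch below uses random stand-in data (not scMetric output) to show why transformed data can then be analyzed with ordinary Euclidean tools. It assumes A is symmetric positive definite; whether scMetric derives newData from newMetric in exactly this way is an assumption, not something stated here.

```r
# Illustrative sketch with random stand-in data, not scMetric output.
set.seed(42)
n_cells <- 5
n_genes <- 4
X <- matrix(rnorm(n_cells * n_genes), nrow = n_cells)  # cells x genes

# Hypothetical stand-in for a learned metric: a random symmetric
# positive definite matrix
B <- matrix(rnorm(n_genes * n_genes), nrow = n_genes)
A <- crossprod(B) + diag(n_genes)

# Mahalanobis-type distance between two cells under metric A
metric_dist <- function(x, y, A) {
  d <- x - y
  sqrt(drop(t(d) %*% A %*% d))
}

# chol(A) returns an upper-triangular R with t(R) %*% R == A, so mapping
# each cell x to R %*% x turns the metric distance into Euclidean distance
R <- chol(A)
X_new <- X %*% t(R)

d_metric    <- metric_dist(X[1, ], X[2, ], A)
d_euclidean <- sqrt(sum((X_new[1, ] - X_new[2, ])^2))
print(c(metric = d_metric, euclidean = d_euclidean))
```

The two printed distances agree up to floating-point error, which is why Euclidean methods such as k-means or t-SNE applied to data transformed this way respect the learned metric.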
- Wenchang Chen - wrote scMetric and analyzed data
- Xuegong Zhang - planned the study
This work is supported by the CZI HCA pilot project, the National Key R&D Program of China grant 2018YFC0910400, and the NSFC grant 61721003.
Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit Sra, and Inderjit S. Dhillon. "Information-theoretic Metric Learning." Proc. 24th International Conference on Machine Learning (ICML), 2007.