
Subtypist is a computational toolkit for subtype identification of single-cell transcriptomic data without reference.


ZJUFanLab/Subtypist


Subtypist

R >= 4.0 Seurat >= 4.0.0

Reference-free identification of cell subtypes for single-cell transcriptomic data


Identifying novel cell subtypes is crucial for revealing the pathogenesis and heterogeneity of disease, and can inform the development of therapeutic strategies. Although numerous cell-type identification methods exist, they rely heavily on references with fixed cell labels and therefore fail to uncover new cell subtypes marked by phenotypic molecules within a specific disease context. To fill this gap, we propose Subtypist, a reference-free annotation method that uses an ensemble-based strategy to identify disease-associated cell subtypes expressing phenotypic features.

Install

  • Install the dependency devtools
> install.packages(pkgs = 'devtools')
  • Then install Subtypist
> devtools::install_github('ZJUFanLab/Subtypist')

# or download the repository as ZIP
> devtools::install_local("/path/to/Subtypist-main.zip")
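
The requirements stated above (R >= 4.0, Seurat >= 4.0.0) can be checked before installing. The following is a minimal sketch using only base R; the helper name `meets_requirement` is hypothetical and not part of Subtypist.

```r
# Hypothetical helper (not part of Subtypist): TRUE if `installed` >= `required`.
meets_requirement <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# R itself: the README asks for R >= 4.0.
r_ok <- meets_requirement(as.character(getRversion()), "4.0.0")

# Seurat: only query the version if the package is installed.
seurat_ok <- requireNamespace("Seurat", quietly = TRUE) &&
  meets_requirement(as.character(utils::packageVersion("Seurat")), "4.0.0")

message("R >= 4.0: ", r_ok, "; Seurat >= 4.0.0: ", seurat_ok)
```

If either check reports FALSE, updating R or Seurat first is likely to avoid installation errors.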

Usage

  1. Loading or subsetting a specific cell type
# Load a Seurat object from a full dataset and subset a specific cell type.
# load() restores objects under their saved names; this assumes the file contains an object named FullObject.
> load("FullObject.RData")
> Seu <- subset(FullObject, idents = c("T cells"))

# Alternatively, load a pre-processed Seurat object with annotated cell types, derived by subsetting the full dataset
> # Seu <- readRDS("Seu.rds")
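
The two file formats in this step behave differently in base R: `load()` restores objects under their saved names and returns only those names, whereas `readRDS()` returns the object itself. A small sketch with a toy list standing in for a Seurat object (the toy data and temporary files are purely illustrative):

```r
# Toy stand-in for a Seurat object; any R object behaves the same way here.
FullObject <- list(cells = c("T cells", "B cells"))

rdata_file <- tempfile(fileext = ".RData")
rds_file <- tempfile(fileext = ".rds")
save(FullObject, file = rdata_file)
saveRDS(FullObject, file = rds_file)

# load() restores objects under their saved names and returns those names.
env <- new.env()
loaded_names <- load(rdata_file, envir = env)
print(loaded_names)

# readRDS() returns the object itself, so the result must be assigned.
Seu <- readRDS(rds_file)
identical(Seu, env$FullObject)
```

In practice this means the result of `load()` should not be assigned to the variable that is later passed to `subset()`.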
  2. Cell subtype identification using an ensemble strategy without reference
# object: the Seurat object loaded or subset in step 1
# min.resolution: the minimum clustering resolution
# max.resolution: the maximum clustering resolution
# use.assay: name of the assay to use
# cluster_assay: name of the assay in the Seurat object to use for clustering
> result <- Subtypist_merge(object = Seu, min.resolution = 0.3, max.resolution = 1.5, by = 0.1, use.assay = "RNA", cluster_assay = "RNA")

# Show results
> print(result)
$Object
An object of class Seurat 
1000 features across 1000 samples within 1 assay 
Active assay: RNA (1000 features, 985 variable features)
 3 dimensional reductions calculated: pca, umap, tsne
$result.table
   resolution merge_cluster initial_cluster       molecular_phenotype     Score
1         0.4             0               0   Gene7, Gene679, Gene990 2.0322883
2         0.4             1               1 Gene570, Gene807, Gene470 0.6984284
3         0.4             2               2 Gene871, Gene559, Gene247 3.3095874
4         0.4             3               3 Gene474, Gene746, Gene790 3.5678922
5         0.4             4               4 Gene776, Gene507, Gene323 2.9614237
6         1.1             0               0   Gene7, Gene679, Gene990 2.0341814
7         1.1             1               1 Gene871, Gene559, Gene247 3.3095874
8         1.1             2             2,5 Gene470, Gene807, Gene243 0.0000000
9         1.1             3               3 Gene474, Gene746, Gene577 3.4186799
10        1.1             4             4,7 Gene807, Gene801, Gene566 0.0000000
11        1.1             5               6 Gene776, Gene507, Gene323 2.9614237
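
To make the output above easier to read, the sketch below does two things in base R. It first lists the candidate resolutions, assuming that `min.resolution`, `max.resolution` and `by` define a grid in the same way as `seq()` (an assumption, not confirmed by the source). It then retypes the resolution 1.1 rows of `result.table` and flags merged clusters whose `initial_cluster` entry lists more than one cluster, reading "2,5" as initial clusters 2 and 5 combined.

```r
# Candidate resolutions implied by min.resolution = 0.3, max.resolution = 1.5,
# by = 0.1 (assumption: the grid is built like seq()).
resolutions <- seq(0.3, 1.5, by = 0.1)
length(resolutions)

# The resolution = 1.1 rows of result.table, retyped from the output above.
tab <- data.frame(
  resolution = 1.1,
  merge_cluster = 0:5,
  initial_cluster = c("0", "1", "2,5", "3", "4,7", "6"),
  Score = c(2.0341814, 3.3095874, 0, 3.4186799, 0, 2.9614237),
  stringsAsFactors = FALSE
)

# Split the comma-separated initial clusters and count them per merged cluster.
initial <- strsplit(tab$initial_cluster, ",", fixed = TRUE)
tab$n_initial <- lengths(initial)

# Merged clusters that combine more than one initial cluster.
merged <- tab[tab$n_initial > 1, c("merge_cluster", "initial_cluster", "Score")]
print(merged)
```

In this example both merged clusters carry a Score of 0, which can be seen directly in the table above.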
  3. Evaluating clustering resolutions and annotating subtypes with specific phenotypic markers
# Evaluate and rank clustering resolutions based on their subtype identification results.
> sortScore(result$result.table) ## resolution = 0.4 scores highest
# A tibble: 5 × 2
  resolution value
       <dbl> <dbl>
1        0.1  1.62
2        0.2  1.85
3        0.4  2.51
4        1.1  1.95
5        1.2  1.81
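
The values reported for resolutions 0.4 and 1.1 are consistent with the mean of the `Score` column per resolution in the `result.table` printed in step 2. The base R sketch below reproduces that ranking under this assumption; it is not necessarily how `sortScore()` is implemented.

```r
# Score values for resolutions 0.4 and 1.1, retyped from result.table.
scores <- data.frame(
  resolution = c(rep(0.4, 5), rep(1.1, 6)),
  Score = c(2.0322883, 0.6984284, 3.3095874, 3.5678922, 2.9614237,
            2.0341814, 3.3095874, 0, 3.4186799, 0, 2.9614237)
)

# Assumption: a resolution is summarised by the mean Score of its clusters.
ranking <- aggregate(Score ~ resolution, data = scores, FUN = mean)

# Rank resolutions from highest to lowest mean score.
ranking <- ranking[order(ranking$Score, decreasing = TRUE), ]
ranking$Score <- round(ranking$Score, 2)
print(ranking)

best_resolution <- ranking$resolution[1]
```

Under this reading, clusters with a Score of 0 pull down the average of a resolution, which is why 1.1 ranks below 0.4 here.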

> # Add the result to the object
> Seu <- AddSubtypist(result$Object, result.table = result$result.table, prefix = 'Subtypist')
> # To assign a more specific phenotypic molecule to each subtype,
> # use the `select_index` parameter to specify which gene to select for each cluster
> Seu <- Subtypist::AddSubtypist(result$Object, resolution = c(0.4), result.table = result$result.table, prefix = 'Subtypist', meta.prefix = 'phenotypic molecules_', value.suffix = '+ B', select_index = c('0'=1, '1'=1, '2'=1, '3'=2, '4'=3))
> print(unique(Seu@meta.data[['phenotypic molecules_0.4']]))
[1] "Gene570+ B" "Gene7+ B"   "Gene746+ B" "Gene871+ B" "Gene323+ B"
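
The labels printed above match what one obtains by taking, for each cluster at resolution 0.4, the gene at the position given in `select_index` and appending the `value.suffix`. The following base R sketch illustrates that selection logic; `pick_marker` is a hypothetical helper, not Subtypist's code.

```r
# Marker triplets at resolution 0.4, retyped from result.table.
phenotype <- c(
  "0" = "Gene7, Gene679, Gene990",
  "1" = "Gene570, Gene807, Gene470",
  "2" = "Gene871, Gene559, Gene247",
  "3" = "Gene474, Gene746, Gene790",
  "4" = "Gene776, Gene507, Gene323"
)
select_index <- c("0" = 1, "1" = 1, "2" = 1, "3" = 2, "4" = 3)

# Hypothetical helper: take the i-th gene of a comma-separated list
# and append a suffix.
pick_marker <- function(genes, i, suffix = "+ B") {
  parts <- trimws(strsplit(genes, ",", fixed = TRUE)[[1]])
  paste0(parts[i], suffix)
}

# One label per cluster, named by cluster id.
subtype_labels <- mapply(pick_marker, phenotype[names(select_index)], select_index)
print(subtype_labels)
```

For cluster 3, for instance, index 2 selects Gene746 rather than the first-listed Gene474.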

Plot

  • Visualize subtype-level distributions in a dimensionality-reduction space (e.g., UMAP) using Subtypist_Dimplot(). This function overlays the phenotypic-molecule annotations derived at the selected clustering resolutions onto the Seurat object. For example:
> p <- Subtypist::Subtypist_Dimplot(Seu,result.table = result$result.table,resolution = c(0.4,1.1), show = "molecular_phenotype_",prefix = 'Subtypist')

About

Subtypist was developed by Yue Yao. Should you have any questions, please contact Yue Yao at yuey@zju.edu.cn
