🧬 A comprehensive end-to-end pipeline for single-cell RNA sequencing data analysis, from quality control to cell-cell communication inference.
This repository provides a complete workflow for analyzing single-cell RNA-seq data. The pipeline combines R/Seurat and Python/scVI frameworks and covers every step from raw count data to biological interpretation.
Key Features:
- Quality control and doublet detection
- Multiple normalization strategies (log1p, SCTransform)
- Data integration using Seurat V5/Harmony and scVI
- Automated and manual cell type annotation
- Differential abundance testing with miloR
- Multiple cell-cell communication methods
- Differential expression analysis
- Comprehensive visualization tools
Quality Control Scripts:
- `1.QC_DoubletF_Log1p.qmd` - QC pipeline with doublet filtering and log1p normalization
- `1.QC_DoubletF_SCT.qmd` - Alternative QC using SCTransform normalization and DoubletFinder
- `basic_analysis_steps_MISC.Rmd` - Additional QC utilities and helper functions
Integration Scripts:
- `2.Data intergration.Rmd` - Batch correction and sample integration (Seurat V5)
- `Single_cell_scVI.ipynb` - Deep learning-based integration using scVI
- `Subclustering_SCVI.ipynb` - Subclustering analysis with scVI
- `change_scvi_continuous.ipynb` - Handling continuous covariates in scVI
Annotation Scripts:
- `3.Annotation.Rmd` - Automated and manual cell type annotation
- `High_annotation_multicontrast.Rmd` - High-resolution annotation across conditions
- `High_level_multicontrast.Rmd` - Broad cell type classification
- `Low_annotation_multiniche_multicontrast.Rmd` - Fine-grained annotation analysis
Differential Abundance Scripts:
- `4. MILOR_dif_abud.Rmd` - miloR-based differential abundance testing
Subclustering Scripts:
- `5.Subclustering.Rmd` - Detailed subclustering of cell populations
- `Subclustering_SCVI.ipynb` - Python-based subclustering with scVI
Cell-Cell Communication Scripts:
- `5.cc_interactions_niche.Rmd` - Niche-based cell communication analysis
- `5.cellchat.Rmd` - CellChat communication inference
- `6.cc_interactions_niche.Rmd` - Extended niche interaction analysis
- `6.cellchat.Rmd` - Advanced CellChat workflows
- `6.cellphonedb.Rmd` - CellPhoneDB communication analysis
- `6.Multinichenet.Rmd` - MultiNicheNet analysis
- `6.1.Multinichenet_output_analysis.Rmd` - MultiNicheNet results interpretation
Differential Expression Scripts:
- `7.DEG_conditions_subtypes.Rmd` - Differential gene expression across conditions and cell types
Metabolic Communication Scripts:
- `mebocost_rasV.ipynb` - Metabolite-mediated cell-cell communication analysis using MEBOCOST
Visualization and Utility Scripts:
- `tSNE.Rmd` - t-SNE dimensionality reduction and plotting
- `UMAP_color_Change.Rmd` - UMAP visualization customization
- `relative_percentages_plots.Rmd` - Cell proportion visualization
- `object_format_convert.Rmd` - Format conversion utilities
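As a rough illustration of what a cell proportion summary involves (the kind of quantity `relative_percentages_plots.Rmd` is described as visualizing), the sketch below computes per-sample cell type percentages in plain Python. The sample names, cell type labels, and the `relative_percentages` helper are hypothetical and are not taken from the repository.

```python
from collections import Counter, defaultdict

def relative_percentages(cells):
    """Return {sample: {cell_type: percent}} from (sample, cell_type) pairs."""
    counts = defaultdict(Counter)
    for sample, cell_type in cells:
        counts[sample][cell_type] += 1
    result = {}
    for sample, type_counts in counts.items():
        total = sum(type_counts.values())
        result[sample] = {ct: 100.0 * n / total for ct, n in type_counts.items()}
    return result

# Hypothetical annotations: one (sample, cell_type) pair per cell.
cells = [
    ("ctrl", "T cell"), ("ctrl", "T cell"), ("ctrl", "B cell"), ("ctrl", "Macrophage"),
    ("treated", "T cell"), ("treated", "Macrophage"),
    ("treated", "Macrophage"), ("treated", "Macrophage"),
]
percentages = relative_percentages(cells)
for sample, row in percentages.items():
    print(sample, {ct: round(p, 1) for ct, p in row.items()})
```

In a real analysis the pairs would come from the metadata of the annotated object (sample ID and cell type columns) rather than a hand-written list.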

    # R packages (Seurat V5 required)
    install.packages(c("Seurat", "BiocManager", "remotes"))  # Seurat 5.0+
    # Bioconductor packages
    BiocManager::install(c("SingleCellExperiment", "scater", "miloR"))
    # GitHub-only packages (not available from CRAN or Bioconductor)
    remotes::install_github("jinworkslab/CellChat")
    remotes::install_github("saeyslab/multinichenetr")

    # Python packages (run in a shell)
    pip install scvi-tools scanpy pandas numpy matplotlib seaborn

- Start with QC: Run `1.QC_DoubletF_Log1p.qmd` or `1.QC_DoubletF_SCT.qmd`
- Integrate data: Use `2.Data intergration.Rmd` or `Single_cell_scVI.ipynb`
- Annotate cells: Apply `3.Annotation.Rmd`
- Analyze communications: Choose from CellChat, CellPhoneDB, or MultiNicheNet scripts
- Find DEGs: Run `7.DEG_conditions_subtypes.Rmd`
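
The order above can be captured in a small driver script. The sketch below only builds the shell commands; it assumes Quarto renders the `.qmd` files, `rmarkdown::render` (via `Rscript`) renders the `.Rmd` files, and `jupyter nbconvert --execute` runs the notebooks. The `render_command` helper and the choice of `6.cellchat.Rmd` for the communication step are illustrative and not part of the repository.

```python
from pathlib import Path

# One script per step listed above (alternatives exist for most steps).
PIPELINE = [
    "1.QC_DoubletF_Log1p.qmd",
    "2.Data intergration.Rmd",
    "3.Annotation.Rmd",
    "6.cellchat.Rmd",  # illustrative pick for the communication step
    "7.DEG_conditions_subtypes.Rmd",
]

def render_command(filename):
    """Return the argv list that would render or execute one pipeline file."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".qmd":
        return ["quarto", "render", filename]
    if suffix == ".rmd":
        return ["Rscript", "-e", f'rmarkdown::render("{filename}")']
    if suffix == ".ipynb":
        return ["jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", filename]
    raise ValueError(f"unsupported file type: {filename}")

commands = [render_command(name) for name in PIPELINE]
for argv in commands:
    print(" ".join(argv))
```

To actually run a step, an argv list can be passed to `subprocess.run(argv, check=True)`; keeping the commands as lists avoids quoting problems with file names that contain spaces.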
Quality Control:
- Doublet detection and filtering
- Mitochondrial gene filtering
- Low-quality cell removal
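
A minimal sketch of the mitochondrial and low-quality filters listed above, in plain Python. It assumes mouse gene symbols, where mitochondrial genes start with `mt-`; the thresholds (`min_genes=2`, `max_pct_mt=20.0`) and the toy counts are placeholders, not the values used in the pipeline's QC scripts. Doublet detection is not covered here.

```python
def qc_metrics(counts):
    """Compute per-cell QC metrics from a {gene: count} dict."""
    total = sum(counts.values())
    n_genes = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for gene, c in counts.items() if gene.lower().startswith("mt-"))
    pct_mt = 100.0 * mito / total if total else 0.0
    return {"total": total, "n_genes": n_genes, "pct_mt": pct_mt}

def passes_qc(counts, min_genes=2, max_pct_mt=20.0):
    """Keep cells with enough detected genes and a low mitochondrial fraction."""
    m = qc_metrics(counts)
    return m["n_genes"] >= min_genes and m["pct_mt"] <= max_pct_mt

# Toy cells (hypothetical counts).
cells = {
    "cell_a": {"Cd3e": 6, "Actb": 12, "mt-Co1": 2},  # 10% mitochondrial reads
    "cell_b": {"Actb": 1, "mt-Co1": 9},              # 90% mitochondrial reads
    "cell_c": {"Actb": 5},                           # only one gene detected
}
kept = [name for name, counts in cells.items() if passes_qc(counts)]
print(kept)
```

Real datasets typically need thresholds chosen from the distribution of these metrics per sample, so the cutoffs above should be treated as stand-ins.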
Normalization:
- Log1p normalization
- SCTransform
- scVI normalization
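
For reference, log1p normalization is commonly implemented as library-size scaling followed by log(1 + x). The sketch below assumes a scale factor of 10,000, which is the default in Seurat's `NormalizeData`; the `log1p_normalize` helper and the toy counts are illustrative. SCTransform and scVI rely on model-based approaches and are not shown.

```python
import math

def log1p_normalize(counts, scale_factor=10_000):
    """Scale one cell's counts to a fixed library size, then apply log(1 + x)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale_factor)
            for gene, c in counts.items()}

# Toy cell with 200 total counts (hypothetical values).
cell = {"Actb": 50, "Cd3e": 0, "Gapdh": 150}
normalized = log1p_normalize(cell)
for gene, value in normalized.items():
    print(gene, round(value, 3))
```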
Integration Methods:
- Seurat V5 CCA/RPCA/Harmony
- scVI deep learning integration
Cell-Cell Communication:
- CellChat
- CellPhoneDB
- MultiNicheNet
- Custom niche analysis
- MEBOCOST
Differential Analysis:
- miloR (differential abundance)
- Seurat V5 FindMarkers/FindAllMarkers
- edgeR/DESeq2 integration
Expected Outputs:
- Quality control reports and plots
- Integrated single-cell object
- Cell type annotations
- UMAP/t-SNE visualizations
- Cell-cell communication networks
- Differential expression results
- Abundance analysis results
Data Formats:
- Input: 10x Genomics H5/MTX, CSV, H5AD, RDS
- Output: Seurat objects (RDS), AnnData (H5AD), CSV results, HTML reports
Note: This pipeline was developed for a private murine dataset. Adjust parameters as needed for your dataset.