A comprehensive atlas of fetal splicing patterns in the brain of adult myotonic dystrophy type 1 patients
This repository stores R scripts and annotation files to reproduce the analysis of the publication "A comprehensive atlas of fetal splicing patterns in the brain of adult myotonic dystrophy type 1 patients".
Software:
- Git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- Conda: https://docs.conda.io/en/latest/miniconda.html
Data:
- Gene read counts for GTEx V8 can be downloaded from the public portal: https://gtexportal.org/home/datasets.
- Acquire raw RNA-Seq data for the following datasets:
  - Otero et al. (2021): GSE157428 (navigate to the SRA Run Selector)
  - BrainSpan Atlas of the Developing and Adult Human Brain: phs000731.v2.p1*
  - Genotype-Tissue Expression (GTEx) project: phs000424.v8.p2*
- We recommend using this pipeline for data preprocessing: https://github.com/MDegener/RNAseq-pipeline.
*Protected data that requires authorized access through dbGaP.
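For the public Otero et al. dataset, one possible way to fetch the raw reads is sketched below. It assumes the NCBI SRA Toolkit (`prefetch`, `fasterq-dump`) is installed, which this repository does not list as a requirement, and that the run accessions were exported from the SRA Run Selector as a text file with one accession per line. The function name `download_runs` is hypothetical, and the file and directory names in the usage line are placeholders. The dbGaP-protected datasets require authorized access and are not covered by this sketch.

```shell
# Hypothetical helper, not part of this repository.
# Assumption: the SRA Toolkit (prefetch, fasterq-dump) is installed and on PATH.
# $1: text file with one run accession per line
# $2: output directory for the FASTQ files
download_runs() {
  acc_file="$1"
  out_dir="$2"
  mkdir -p "${out_dir}" || return 1
  # The `|| [ -n ... ]` part also handles a last line without a trailing newline.
  while IFS= read -r acc || [ -n "${acc}" ]; do
    [ -n "${acc}" ] || continue   # skip empty lines
    # Download the run archive, then convert it to FASTQ.
    prefetch "${acc}" < /dev/null || return 1
    fasterq-dump "${acc}" --outdir "${out_dir}" < /dev/null || return 1
  done < "${acc_file}"
}
```

Example call: `download_runs SRR_Acc_List.txt fastq`.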
- Clone this Git repository:
git clone https://github.com/cmbi/BrainDM1.git
- Navigate to the repository:
cd BrainDM1
- Create and activate a conda environment with all R package dependencies:
conda create -p r-env -c r -c bioconda -c conda-forge git rstudio r-base r-tidyr r-dplyr r-stringr r-purrr r-ggplot2 r-corrplot bioconductor-rtracklayer bioconductor-biomaRt bioconductor-edgeR r-matrixTests r-psych r-data.table r-dunn.test r-cairo r-statmod r-gtools r-ppcor r-argparse r-r.utils r-venndiagram r-cowplot r-gghighlight r-ggrepel r-ggpubr
conda activate ./r-env
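Before running the analysis, it can help to confirm that the activated environment exposes the expected R packages. The sketch below is a hypothetical helper (`check_r_packages` is not part of this repository) that asks `Rscript` whether each named package is available. It assumes the environment is already active.

```shell
# Hypothetical helper: report whether each R package given as an argument is installed.
# Assumption: the conda environment is active, so `Rscript` is on PATH.
check_r_packages() {
  status=0
  for pkg in "$@"; do
    # requireNamespace() returns TRUE only if the package can be loaded.
    if Rscript -e "if (requireNamespace('${pkg}', quietly = TRUE)) quit(status = 0) else quit(status = 1)" >/dev/null 2>&1; then
      echo "ok: ${pkg}"
    else
      echo "missing: ${pkg}"
      status=1
    fi
  done
  return "${status}"   # non-zero if any package is missing
}
```

Example call, using R package names that correspond to entries in the `conda create` command: `check_r_packages edgeR rtracklayer biomaRt data.table`.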
- Unzip annotation files:
gzip -d lib/SE.hg38.annotated.gff3.gz lib/gencode.v26.annotation.collapsed.gtf.gz
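A quick check that both annotation files were decompressed can prevent a failed run later. This sketch assumes the file names that `gzip -d` produces from the two archives (the same names without `.gz`). `check_annotations` is a hypothetical helper that takes the directory as an optional argument.

```shell
# Hypothetical helper: confirm the decompressed annotation files exist and are non-empty.
# $1: directory holding the files (defaults to lib)
check_annotations() {
  lib_dir="${1:-lib}"
  missing=0
  for f in SE.hg38.annotated.gff3 gencode.v26.annotation.collapsed.gtf; do
    if [ -s "${lib_dir}/${f}" ]; then
      echo "found: ${lib_dir}/${f}"
    else
      echo "missing or empty: ${lib_dir}/${f}"
      missing=1
    fi
  done
  return "${missing}"   # non-zero if any file is missing or empty
}
```

Run `check_annotations` from the repository root. A non-zero exit status means at least one file still needs to be unzipped.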
- Run the following R scripts in this order:
  - scripts/prepareSampleMetadata.R: Combines metadata for all selected samples into one table
  - scripts/prepareSummarizedData.R: Creates a matrix of exon inclusion data for all selected samples
  - scripts/comparePSI.R: Performs a group comparison of exon inclusion for all exon-skipping events
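The order above can be scripted so that a failure in one step stops the later ones. This is a minimal sketch, assuming each script can be started with a plain `Rscript <path>` call from the repository root and needs no extra arguments. The environment includes `r-argparse`, so some scripts may expect options; check the script headers first. `run_pipeline` is a hypothetical name.

```shell
# Hypothetical helper: run the preparation scripts in the documented order.
# Assumption: called from the repository root with the conda environment active.
# $1: command used to launch each script (defaults to Rscript)
run_pipeline() {
  runner="${1:-Rscript}"
  for script in \
    scripts/prepareSampleMetadata.R \
    scripts/prepareSummarizedData.R \
    scripts/comparePSI.R
  do
    echo "Running ${script}"
    "${runner}" "${script}" || return 1   # stop at the first failure
  done
}
```

Calling `run_pipeline` without arguments uses `Rscript`. The optional argument exists mainly so the loop can be tried with a harmless command such as `echo` before starting the real analysis.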
- Now you can run any script in the scripts/figure or scripts/tables directory to reproduce the unedited content of the publication.
Directory | Content |
---|---|
lib | Misc files for annotation and selection of samples |
results | Unedited output of the analysis scripts that are contained in the /scripts directory |
scripts | Scripts to create summarized data and to reproduce all figures and tables of the publication |