A computational pipeline for Circular and Linear RNA Expression Analysis from Ribosomal-RNA depleted (Ribo–) RNA-seq (CLEAR/CIRCexplorer3)
- Software
  - CIRCexplorer2 (>=2.3.6)
  - HISAT2 (>=2.0.5)
  - StringTie (>1.3.6)
- Package (Python 2.7+)
  - pysam (>=0.8.4)
  - pybedtools
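
The version constraints in the list above can be checked before installing. The following is a minimal sketch and not part of CLEAR: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes plain dotted numeric version strings such as `2.3.6`. How each tool reports its version is left to the reader.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.3.6' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum, strict=False):
    """Return True if `installed` satisfies '>= minimum' ('> minimum' if strict)."""
    have, need = parse_version(installed), parse_version(minimum)
    return have > need if strict else have >= need


# Constraints copied from the requirement list; installed versions are made up.
print(meets_minimum("2.3.8", "2.3.6"))               # CIRCexplorer2 (>=2.3.6)
print(meets_minimum("1.3.6", "1.3.6", strict=True))  # StringTie (>1.3.6), strict
```
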
git clone https://github.com/YangLab/CLEAR
cd CLEAR
python ./setup.py install

Start from fastq file:
usage: clear_quant [-h] -1 M1 [-2 M2] -g GENOME -i HISAT -j BOWTIE1 -G GTF
[-o OUTPUT] [-p THREAD]
optional arguments:
-h, --help Show this help message and exit.
-1 M1 Comma-separated list of read sequence files in FASTQ
format. When running with pair-end read, this should
contain #1 mates.
-2 M2 Comma-separated list of read sequence files in FASTQ
format. -2 is only used when running with pair-end
read. This should contain #2 mates.
-g GENOME, --genome GENOME
Genome FASTA file.
-i HISAT, --hisat HISAT
Index files for HISAT2.
-j BOWTIE1, --bowtie1 BOWTIE1
Index files for TopHat-Fusion.
-G GTF, --gtf GTF Annotation GTF file.
-o OUTPUT, --output OUTPUT
The output directory.
-p THREAD, --thread THREAD
Running threads. [default: 5]
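
When `clear_quant` is driven from a script, the options documented above can be assembled programmatically. This is a minimal sketch, not part of CLEAR: `build_clear_quant_cmd` is a hypothetical helper, it uses only the flags listed in the help text, and the file names are placeholders. The resulting list can be passed to `subprocess.check_call` once `clear_quant` is on the `PATH`.

```python
def build_clear_quant_cmd(mate1, genome, hisat_index, bowtie_index, gtf,
                          mate2=None, output=None, threads=None):
    """Assemble a clear_quant argument list from the documented options."""
    # -1 and -2 take comma-separated lists of FASTQ files.
    cmd = ["clear_quant", "-1", ",".join(mate1)]
    if mate2:  # -2 is only given for paired-end data
        cmd += ["-2", ",".join(mate2)]
    cmd += ["-g", genome, "-i", hisat_index, "-j", bowtie_index, "-G", gtf]
    if output is not None:
        cmd += ["-o", output]
    if threads is not None:
        cmd += ["-p", str(threads)]
    return cmd


# Placeholder file names, for illustration only.
cmd = build_clear_quant_cmd(
    mate1=["mate_1.fastq"], mate2=["mate_2.fastq"],
    genome="hg38.fa", hisat_index="hg38.hisat_index",
    bowtie_index="hg38.bowtie_index", gtf="annotation.gtf",
    output="output_dir", threads=5,
)
print(" ".join(cmd))
```
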
Start from CIRCexplorer2 output file:
usage: circ_quant [-h] -c CIRC -b BAM -r REF [--threshold THRESHOLD]
[--ratio RATIO] [-l] [-t] [-o OUTPUT]
optional arguments:
-h, --help Show this help message and exit.
-c CIRC, --circ CIRC Input circular RNA file from CIRCexplorer2.
-b BAM, --bam BAM Input mapped reads from HISAT2 in BAM format.
-r REF, --ref REF The refFlat format gene annotation file.
--threshold THRESHOLD
Threshold of FPB for choose circRNAs to filter linear
SJ.[default: 1]
--ratio RATIO The ratio is used for adjust comparison between circ
and linear.[default: 1]
-l, --length Whether to consider all reads' length? [default: False]
-t, --tmp Keep tmp dir? [default: False]
-o OUTPUT, --output OUTPUT
Output file. [default: circRNA_quant.txt]
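
The `-r` option expects a refFlat annotation, so it can help to sanity-check that file before running `circ_quant`. The sketch below is not part of CLEAR. It assumes the standard 11-column, tab-separated UCSC refFlat layout; `RefFlatRecord` and `parse_refflat_line` are hypothetical names and the example record is made up.

```python
from collections import namedtuple

# Column order of the UCSC refFlat format.
RefFlatRecord = namedtuple("RefFlatRecord", [
    "geneName", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
])


def parse_refflat_line(line):
    """Parse one tab-separated refFlat line; raise ValueError if malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError("expected 11 refFlat columns, got %d" % len(fields))
    # Exon coordinate lists are comma-separated and usually end with a comma.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    exon_count = int(fields[8])
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match exonStarts/exonEnds")
    return RefFlatRecord(fields[0], fields[1], fields[2], fields[3],
                         int(fields[4]), int(fields[5]), int(fields[6]),
                         int(fields[7]), exon_count, starts, ends)


# Made-up record for illustration only (not a real annotation entry).
example = "GENE1\tTX1\tchr1\t+\t100\t900\t150\t850\t2\t100,600,\t300,900,"
record = parse_refflat_line(example)
print(record.geneName, record.exonStarts, record.exonEnds)
```
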
Start from fastq file:
clear_quant -1 mate_1.fastq -2 mate_2.fastq -g hg38.fa -i hg38.hisat_index -j hg38.bowtie_index -G annotation.gtf -o output_dir

Start from CIRCexplorer2 output file:
circ_quant -c CIRCexplorer2_output.txt -b hisat_aligned.bam -t -r annotation.refFlat -o quant.txt

Note: hisat_aligned.bam should not contain unmapped reads.
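
Unmapped reads are marked by bit `0x4` of the SAM FLAG field, so they can be removed before quantification. In practice this is usually done with `samtools view -b -F 4`. The sketch below only illustrates the same filter on SAM text lines; it is not part of CLEAR, `drop_unmapped` is a hypothetical helper, and the records are made up.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def drop_unmapped(sam_lines):
    """Yield header lines and mapped alignment records from SAM text lines."""
    for line in sam_lines:
        if line.startswith("@"):  # header lines pass through unchanged
            yield line
            continue
        flag = int(line.split("\t", 2)[1])  # FLAG is the second column
        if not flag & UNMAPPED_FLAG:
            yield line


# Tiny made-up SAM snippet: one mapped read (flag 0), one unmapped (flag 4).
sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
kept = list(drop_unmapped(sam))
print(len(kept))
```
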
- output_dir/quant/quant.txt
| Field | Description |
|---|---|
| chrom | Chromosome |
| start | Start of circular RNA |
| end | End of circular RNA |
| name | Circular RNA/Junction reads |
| score | Flag of fusion junction realignment |
| strand | + or - for strand |
| thickStart | No meaning |
| thickEnd | No meaning |
| itemRgb | 0,0,0 |
| exonCount | Number of exons |
| exonSizes | Exon sizes |
| exonOffsets | Exon offsets |
| readNumber | Number of junction reads |
| circType | Type of circular RNA |
| geneName | Name of gene |
| isoformName | Name of isoform |
| index | Index of exon or intron |
| flankIntron | Left intron/Right intron |
| FPBcirc | Expression of circRNA |
| FPBlinear | Expression of cognate linear RNA |
| CIRCscore | Relative expression of circRNA |
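
The fields in the table above can be loaded into named records for downstream filtering or ranking. This is a minimal sketch, not part of CLEAR. It assumes the output file is tab-separated, has no header row, and lists its columns in the same order as the table; `read_quant` is a hypothetical helper and the rows are made up.

```python
import csv
import io

# Column names taken from the field table; the order is assumed to match
# the order of columns in the output file.
FIELDS = [
    "chrom", "start", "end", "name", "score", "strand", "thickStart",
    "thickEnd", "itemRgb", "exonCount", "exonSizes", "exonOffsets",
    "readNumber", "circType", "geneName", "isoformName", "index",
    "flankIntron", "FPBcirc", "FPBlinear", "CIRCscore",
]
NUMERIC = {"start": int, "end": int, "readNumber": int,
           "FPBcirc": float, "FPBlinear": float, "CIRCscore": float}


def _cast(value, cast):
    """Convert a field, leaving non-numeric placeholders untouched."""
    try:
        return cast(value)
    except ValueError:
        return value


def read_quant(handle):
    """Read tab-separated quant rows into dicts keyed by field name."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != len(FIELDS):
            continue  # skip blank or unexpected lines
        rec = dict(zip(FIELDS, row))
        for key, cast in NUMERIC.items():
            rec[key] = _cast(rec[key], cast)
        records.append(rec)
    return records


# Two made-up rows for illustration only.
rows = [
    ["chr1", "1000", "2000", "circular_RNA/5", "0", "+", "1000", "1000",
     "0,0,0", "1", "1000", "0", "5", "circRNA", "GENE1", "TX1", "2",
     "chr1:500-1000|chr1:2000-2500", "1.5", "3.0", "0.5"],
    ["chr2", "3000", "4500", "circular_RNA/9", "0", "-", "3000", "3000",
     "0,0,0", "1", "1500", "0", "9", "circRNA", "GENE2", "TX2", "3",
     "chr2:2500-3000|chr2:4500-5000", "4.0", "2.0", "2.0"],
]
text = "\n".join("\t".join(r) for r in rows) + "\n"
records = read_quant(io.StringIO(text))

# Rank circRNAs by CIRCscore, highest first, ignoring non-numeric scores.
scored = [r for r in records if isinstance(r["CIRCscore"], float)]
ranked = sorted(scored, key=lambda r: r["CIRCscore"], reverse=True)
print([r["geneName"] for r in ranked])
```
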
Ma XK*, Wang MR, Liu CX, Dong R, Carmichael GG, Chen LL and Yang L#. A CLEAR pipeline for direct comparison of circular and linear RNA expression. 2019, bioRxiv doi: 10.1101/668657
Copyright (C) 2019 YangLab. Licensed GPLv3 for open source use or contact YangLab (yanglab@picb.ac.cn) for commercial use.
