Skip to content

6. Obtaining read count from alignment

naoto-hikawa edited this page Mar 17, 2022 · 5 revisions

Alignment to read count

1. Indexing the bam file

First, the output file Aligned.sortedByCoord.out.bam needs to be indexed using samtools. Let's check again if we have the software:

samtools --version

Which should come back with samtools 1.15 or something similar (+extra output of command lists).

Now, we will index and create Aligned.sortedByCoord.out.bam.bai file.

samtools index $SEQ_HOME/results/STAR_align/Aligned.sortedByCoord.out.bam

If you see $SEQ_HOME/results/STAR_align/Aligned.sortedByCoord.out.bam.bai, you're all set!

2. bam to read count

Let's re-check that you have the featureCounts software on Terminal:

$SEQ_HOME/tools/subread-2.0.3-macOS-x86_64/bin/featureCounts -v

You should see something like featureCounts v2.0.3.

Converting to read count is done by:

$SEQ_HOME/tools/subread-2.0.3-macOS-x86_64/bin/featureCounts -a $SEQ_HOME/annotations/dmel-all-r6.44.gtf -o $SEQ_HOME/results/readcount.txt $SEQ_HOME/results/STAR_align/Aligned.sortedByCoord.out.bam -T 4 

Now, the result file is difficult to analyze as it is. Let's cut unnecessary columns on Terminal:

cut -f 1,7 $SEQ_HOME/results/readcount.txt > $SEQ_HOME/results/readcount_clean.txt           

You will now have a cleaner looking text file ready to be imported on R.

Let's look at how to use Python to process more than 1 file automatically!

Clone this wiki locally