08 STAR alignment

The RNA-seq reads were aligned using STAR, which aligns RNA-seq reads to a reference genome. The difficulty is that the reference genome contains introns that are no longer present in the mature RNA, so a read that spans an exon junction has to be split to align properly to the reference. STAR detects these cases and performs the split. To some degree, STAR can also align a read to the reference even though part of an Illumina adapter is still present in the read or the read contains a mismatch to the reference.
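As an illustration of what a split read looks like in the output: in the SAM format, the CIGAR string of a spliced alignment contains an `N` operation for the skipped region (typically an intron), and unaligned read ends such as adapter remnants usually show up as soft clips (`S`). The following is a minimal sketch, not part of the original analysis, that turns a SAM `POS` and `CIGAR` into the reference blocks the read covers. The function name `aligned_blocks` is made up for this example.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 1-based inclusive (start, end) reference blocks of an alignment.

    pos   -- 1-based leftmost mapped position (the SAM POS field)
    cigar -- CIGAR string, e.g. "50M200N50M"
    """
    blocks = []
    ref = pos      # next reference position to be consumed
    start = None   # start of the block currently being built
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=XD":
            # These operations consume reference and belong to an aligned block.
            if start is None:
                start = ref
            ref += length
        elif op == "N":
            # Skipped region (intron): close the current block and jump over it.
            if start is not None:
                blocks.append((start, ref - 1))
                start = None
            ref += length
        # I, S, H and P do not consume reference positions.
    if start is not None:
        blocks.append((start, ref - 1))
    return blocks


# A 100 bp read split across a 200 bp intron:
print(aligned_blocks(100, "50M200N50M"))  # [(100, 149), (350, 399)]
```

A read with a soft-clipped end, for example `aligned_blocks(10, "5S20M")`, gives a single block `[(10, 29)]`, because the clipped bases are not placed on the reference.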

Questions:

What percentage of your reads map back to your contigs? Why do you think that is?

The percentage varied a bit between the runs and samples:

SRR6040092: 93.37%
SRR6040092: 90.12%
SRR6040094: 93.68%
SRR6040095: 94.21%
SRR6040096: 92.71%
SRR6040097: 91.68%
SRR6156066: 91.21%
SRR6156067: 90.29%
SRR6156069: 90.39%

Fewer than 100% of the reads map, probably because the genome assembly is not complete, so some reads have no matching sequence to map to. Some reads may also be too short to be aligned.
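Mapping percentages of this kind, and the share of reads reported as too short, can be read from the `Log.final.out` file that STAR writes for each run. The sketch below assumes the usual `key | value` layout of that file. The helper names and the numbers in `EXAMPLE` are invented for illustration and are not results from this project. Note that STAR's "too short" category refers to the aligned portion of a read falling below STAR's length threshold, so it is not limited to reads that were short to begin with.

```python
def parse_star_log(text):
    """Parse the 'key | value' lines of a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" have no separator
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Return a percentage field such as '90.00%' as a float."""
    return float(stats[key].rstrip("%"))


# Illustrative excerpt with made-up values, mimicking the layout of Log.final.out.
EXAMPLE = (
    "                          Number of input reads |\t1000000\n"
    "                                    UNIQUE READS:\n"
    "                        Uniquely mapped reads % |\t90.00%\n"
    "             % of reads mapped to multiple loci |\t1.50%\n"
    "                 % of reads unmapped: too short |\t8.00%\n"
)

stats = parse_star_log(EXAMPLE)
print(percent(stats, "Uniquely mapped reads %"))
print(percent(stats, "% of reads unmapped: too short"))
```

For a real run, the text would come from the log file itself, for example `parse_star_log(open("Log.final.out").read())`, where the path is a placeholder for wherever STAR wrote its output.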

What potential issues can cause mRNA reads not to map properly to genes in the chromosome? Do you expect this to differ between prokaryotic and eukaryotic projects?

A gene can have multiple splice variants, so mRNA from the same gene can map in different ways, and some reads may therefore map improperly. This differs between prokaryotic and eukaryotic projects, because most prokaryotic genes lack introns while eukaryotic genes typically contain them.

What percentage of reads map to genes?

This percentage cannot be determined at this point, since we do not yet know which regions are genes and which are something else. Reads can also map to sequences that encode ribosomal RNA, tRNA and other RNA molecules.

How many reads do not map to genes? What does that mean? How does that relate to the type of sequencing data you are mapping?

This cannot be answered either, because we do not yet know what is a gene and what is not.
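Once gene coordinates are available from an annotation, the fraction of reads that map to genes could be estimated by checking each aligned read against the gene intervals on its contig. The following is a minimal sketch of that idea under simplifying assumptions: reads and genes are given as hypothetical `(contig, start, end)` tuples with 1-based inclusive coordinates, strand is ignored, and a read counts as mapping to a gene if it overlaps one by at least one base. Dedicated read-counting tools handle these details more carefully.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two 1-based inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def fraction_in_genes(reads, genes):
    """Fraction of reads overlapping any gene on the same contig.

    reads, genes -- lists of (contig, start, end) tuples
    """
    if not reads:
        return 0.0
    # Group gene intervals by contig so each read is only compared locally.
    by_contig = {}
    for contig, start, end in genes:
        by_contig.setdefault(contig, []).append((start, end))
    hits = 0
    for contig, start, end in reads:
        candidates = by_contig.get(contig, [])
        if any(overlaps(start, end, g_start, g_end) for g_start, g_end in candidates):
            hits += 1
    return hits / len(reads)


# Toy data, invented for illustration only.
genes = [("contig_1", 100, 200)]
reads = [
    ("contig_1", 150, 250),  # overlaps the gene
    ("contig_1", 300, 350),  # outside the gene
    ("contig_2", 100, 150),  # contig without annotated genes
    ("contig_1", 50, 100),   # touches the first base of the gene
]
print(fraction_in_genes(reads, genes))  # 0.5
```

The linear scan over the genes of a contig keeps the sketch short. With many genes, a sorted interval structure would be the more practical choice.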
