A repository containing all the file projects that I have for the Bioinformatics 6 (COMP 4312e) course.

MikoRabago/Bioinformatics-6-COMP-4312e-

Bioinformatics 6 (COMP 4312e)

Link to this card: https://trello.com/c/HF32d968

Finding Mutations in DNA and Proteins (Bioinformatics VI)

About this Course

In previous courses in the Specialization, we have discussed how to sequence and compare genomes. This course will cover advanced topics in finding mutations lurking within DNA and proteins.

In the first half of the course, we would like to ask how an individual's genome differs from the "reference genome" of the species. Our goal is to take small fragments of DNA from the individual and "map" them to the reference genome. We will see that the combinatorial pattern matching algorithms solving this problem are elegant and extremely efficient, requiring a surprisingly small amount of runtime and memory.

In the second half of the course, we will learn how to identify the function of a protein even if it has been bombarded by so many mutations compared to similar proteins with known functions that it has become barely recognizable. This is the case, for example, in HIV studies, since the virus often mutates so quickly that researchers can struggle to study it. The approach we will use is based on a powerful machine learning tool called a hidden Markov model.

Finally, you will learn how to use popular bioinformatics software tools that apply hidden Markov models to compare a protein against a related family of proteins.

SYLLABUS Finding Mutations in DNA and Proteins (Bioinformatics VI)

Introduction to Read Mapping

Welcome to our class! We are glad that you decided to join us.

In this class, we will consider the following two central biological questions (the computational approaches needed to solve them are shown in parentheses):

  1. How Do We Locate Disease-Causing Mutations? (Combinatorial Pattern Matching)
  2. Why Have Biologists Still Not Developed an HIV Vaccine? (Hidden Markov Models)

As in previous courses, each of these two chapters is accompanied by a Bioinformatics Cartoon, created by talented artist Randall Christopher, that serves as a chapter header in the Specialization's bestselling print companion. You can find the first chapter's cartoon at the bottom of this message.

The Burrows-Wheeler Transform

Welcome to week 2 of the class!

This week, we will introduce a paradigm called the Burrows-Wheeler transform; after seeing how it can be used in string compression, we will demonstrate that it is also the foundation of modern read-mapping algorithms.
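As a rough illustration of the transform named above, here is a minimal Python sketch. It is not taken from the course materials: the function names `bwt` and `inverse_bwt` are hypothetical, the text is assumed to end with a unique `$` sentinel, and the construction sorts all cyclic rotations, which is only practical for short strings.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted cyclic rotations.

    Assumes `text` ends with a unique '$' sentinel. This naive version
    uses quadratic memory and is meant for illustration only.
    """
    if not text.endswith("$") or text.count("$") != 1:
        raise ValueError("text must end with a single '$' sentinel")
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str) -> str:
    """Invert the transform using the first-last property."""
    n = len(last)
    first = sorted(last)
    # Stable sort: order[j] is the row whose last symbol is the same
    # occurrence as the first symbol of row j.
    order = sorted(range(n), key=lambda i: last[i])
    row = order[0]  # row 0 starts with '$'; this row starts with text[0]
    out = []
    for _ in range(n - 1):
        out.append(first[row])
        row = order[row]
    return "".join(out) + "$"


if __name__ == "__main__":
    transformed = bwt("banana$")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana$
```

The transform tends to group equal symbols into runs (for example the trailing `aa` above), which is what makes it useful for compression, and the inverse shows that no information is lost.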

Speeding Up Burrows-Wheeler Read Mapping

Welcome to week 3 of class!

Last week, we saw how the Burrows-Wheeler transform could be applied to multiple pattern matching. This week, we will speed up our algorithm and generalize it to the case in which patterns contain errors, which models the biological problem of mapping reads with errors to a reference genome.
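One common way to speed up pattern matching on the Burrows-Wheeler transform is to precompute symbol counts so that each pattern symbol is processed in constant time. The sketch below is an assumption about how that idea can look in Python, not the course's reference implementation. The function name `bw_count` is hypothetical, and it handles exact matches only; matching with errors is not shown.

```python
def bw_count(last: str, pattern: str) -> int:
    """Count occurrences of `pattern` in the text whose BWT is `last`.

    Uses backward search with two precomputed tables:
      first_occurrence[c] = row of the first c in the sorted first column
      count[c][i]         = number of c among the first i symbols of `last`
    """
    n = len(last)
    alphabet = sorted(set(last))

    first_occurrence = {}
    total = 0
    for c in alphabet:
        first_occurrence[c] = total
        total += last.count(c)

    count = {c: [0] * (n + 1) for c in alphabet}
    for i, symbol in enumerate(last):
        for c in alphabet:
            count[c][i + 1] = count[c][i] + (1 if c == symbol else 0)

    # [top, bottom] is the range of rows whose rotations start with the
    # suffix of the pattern processed so far.
    top, bottom = 0, n - 1
    for symbol in reversed(pattern):
        if symbol not in first_occurrence:
            return 0
        top = first_occurrence[symbol] + count[symbol][top]
        bottom = first_occurrence[symbol] + count[symbol][bottom + 1] - 1
        if top > bottom:
            return 0
    return bottom - top + 1


if __name__ == "__main__":
    last_column = "annb$aa"  # BWT of "banana$"
    print(bw_count(last_column, "ana"))  # 2
    print(bw_count(last_column, "nab"))  # 0
```

The full count table used here takes memory proportional to the text length times the alphabet size. Practical read mappers usually store only sampled checkpoints of it, trading a little time for much less memory.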

Introduction to Hidden Markov Models

Welcome to week 4 of class!

This week, we will start examining the case of aligning sequences with many mutations -- such as related genes from different HIV strains -- and see that our problem formulation for sequence alignment is not adequate for highly diverged sequences.

To improve our algorithms, we will introduce a machine-learning paradigm called a hidden Markov model and see how dynamic programming helps us answer questions about these models.
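To make the link between hidden Markov models and dynamic programming concrete, here is a minimal sketch of the Viterbi algorithm, which finds a most likely sequence of hidden states for a sequence of observations. The function name, the dictionary-based model layout, and the two-coin example parameters are all assumptions made for illustration. The code also assumes a non-empty observation sequence and strictly positive probabilities, since it works in log space.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden path, probability of that path).

    Assumes `observations` is non-empty and that every probability in
    the model is strictly positive.
    """
    # score[i][s]: best log-probability of any path ending in state s
    # after emitting the first i + 1 observations.
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
              for s in states}]
    back = [{}]
    for obs in observations[1:]:
        prev = score[-1]
        current, pointers = {}, {}
        for s in states:
            best_prev = max(states,
                            key=lambda p: prev[p] + math.log(trans_p[p][s]))
            current[s] = (prev[best_prev]
                          + math.log(trans_p[best_prev][s])
                          + math.log(emit_p[s][obs]))
            pointers[s] = best_prev
        score.append(current)
        back.append(pointers)

    # Trace back from the best final state.
    last = max(states, key=lambda s: score[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, math.exp(score[-1][last])


if __name__ == "__main__":
    # Hypothetical model: a fair coin (F) and a biased coin (B).
    states = ["F", "B"]
    start_p = {"F": 0.5, "B": 0.5}
    trans_p = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
    emit_p = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}
    path, prob = viterbi("HHHH", states, start_p, trans_p, emit_p)
    print("".join(path), prob)
```

Each step reuses the best scores from the previous position, so the running time grows linearly with the number of observations and quadratically with the number of states.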

Profile HMMs for Sequence Alignment

Welcome to week 5 of class!

Last week, we introduced hidden Markov models. This week, we will see how hidden Markov models can be applied to sequence alignment with a profile HMM. We will then consider some advanced topics in this area, which are related to the clustering methods that we covered in a previous course.

Bioinformatics Application Challenge

Welcome to the sixth and final week of class!

This week brings our Application Challenge, in which we apply the HMM sequence alignment algorithms that we have developed.
