Description
Motivation: Many experimental evolution studies compare which genes are being mutated in one or more treatments to understand which ones are driving adaptation to a specific environment.
Implementation: A post-processing script that is separate from breseq. Use Python or R to calculate similarity statistics and produce relevant graphs. The input is a set of GenomeDiff files that have #=TREATMENT metadata lines added. The first version should implement the calculations from Figure 2 of Deatherage et al. Specificity of genome evolution in experimental populations of Escherichia coli evolved at different temperatures. Proc. Natl. Acad. Sci. U. S. A. 114: E1904–E1912. [PubMed]. @danieldeatherage can assist by sending his existing code, which will need to be adapted.
- Input a set of GenomeDiff files (use a relevant Python/R package to read this information in; don't re-implement this step!)
- Use Biopython/Bioconductor to associate each mutation with a gene using the rules from the publication.
- Replicate shuffling analysis to give p-values for Dice similarity differences between all samples in different treatments (Figure 2A).
- Replicate analysis that predicts which genes are significantly associated with one or more treatments (Figure 2B)
- Apply code to new experiments and come up with expected results and tests.
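
As a starting point for the shuffling step, here is a minimal Python sketch of a label-permutation test on Dice similarity. It is not the code from the publication. It assumes each sample has already been reduced to a set of mutated gene names, and it assumes the test statistic is the mean within-treatment Dice similarity minus the mean between-treatment similarity; the exact statistic and shuffling scheme used for Figure 2A should be checked against the paper and the existing code. The function names (`dice`, `within_minus_between`, `shuffle_p_value`) are hypothetical.

```python
import itertools
import random
from statistics import mean


def dice(a, b):
    """Dice coefficient between two sets of mutated genes."""
    if not a and not b:
        return 1.0  # assumption: two empty gene sets count as identical
    return 2 * len(a & b) / (len(a) + len(b))


def within_minus_between(genes, labels):
    """Mean Dice similarity of same-treatment pairs minus different-treatment pairs.

    Assumes at least one pair of each kind exists (e.g. two or more
    treatments with two or more samples each).
    """
    within, between = [], []
    for i, j in itertools.combinations(range(len(genes)), 2):
        similarity = dice(genes[i], genes[j])
        (within if labels[i] == labels[j] else between).append(similarity)
    return mean(within) - mean(between)


def shuffle_p_value(genes, labels, n_shuffles=10000, seed=0):
    """One-sided p-value from shuffling treatment labels across samples."""
    rng = random.Random(seed)
    observed = within_minus_between(genes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if within_minus_between(genes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Toy data: gene names and treatment labels are made up for illustration.
samples = [{"geneA", "geneB"}, {"geneA", "geneB"}, {"geneC", "geneD"}, {"geneC", "geneD"}]
treatments = ["hot", "hot", "cold", "cold"]
observed, p_value = shuffle_p_value(samples, treatments, n_shuffles=2000)
print(observed, p_value)
```

In a real run, the gene sets would come from the GenomeDiff reader and gene-annotation steps listed above, and the treatment labels from the #=TREATMENT metadata lines.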