Use-case: whole genome duplication / polyploidy #15
Comments
@hyanwong Cases in which it is relevant to "identify the four chromosomes in a tetraploid as consisting of two separate pairs" arise in plant genetics. For allotetraploidy, which is the case you're considering, you'd have homeologous pairs (the e is not a typo), something like AABB. Their segregation, assortment and crossover propensities have added complexities in which the homeology matters. You could think of it as more elaborate phases within a genome. The ploidy shenanigans you describe in this thread are rampant amongst plants; often (iterated) hybridization is the culprit. For some reason botanists don't follow Mayr's species concept. Here's a fresh case I just found: https://www.nature.com/articles/s41467-023-38829-3 In my experience, cases with exotic ploidy are never what they seem, i.e. a hexaploid is never really 6n, but something crazy like AABBDD (this is the case in one kind of wheat): https://www.nature.com/articles/nature11997 Hope that's of some help!
Once we fix #103, I think we should be fine to simulate evolution under different ploidies. For each node we will need to define (potentially different) chromosome IDs, but I think that's fine. The chromosome IDs need not correspond to the chromosome "numbers" used by cytologists. This would be fine to model allopolyploidy, and also for tackling things like #12, where all the chromosomes for a given "node" (i.e. that originate from one of the gametes, e.g. the "paternal chromosomes") are different from each other, and do not pair with each other in meiosis, but only pair with genetic material that came from the "other" gamete (i.e. the "maternal chromosomes"). What we might struggle to do, in the current framework, is allow recombination between two chromosomes associated with the same node (i.e. two duplicated maternal chromosomes). This is the sort of thing that can happen in autopolyploidy. Reworking the simulation framework to allow for this would require some messing around with the find_mrca_regions routine. In particular, we would need to pass in chromosomes to
@szhan suggests another use-case for a GIG: to simulate whole genome duplication (WGD) followed by gene loss, so that we can look at the concept of orthology / paralogy. This would require some element of selection to remove duplicates at random from one of the duplicated genomes. I can imagine a simple simulator, with only one (haploid) chromosome, initialised using an msprime simulation (see #14), where the chromosome gets duplicated somehow. The simulator would then gradually delete regions at random. Crossovers would occur between non-deleted regions.

A more sophisticated simulation would reduce crossover between chromosomes that became "too different" somehow, either due to accumulation of mutations or due to synteny being lost. This could be the hardest part of a simulation to implement.
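To make the duplicate-then-delete idea concrete, here is a minimal toy sketch in plain Python. It is not GIG or msprime code: the function names (`duplicate_genome`, `lose_genes`, `summarise`) are hypothetical, genes are treated as discrete units rather than genomic regions, there is no msprime initialisation and no crossover, and "selection" is reduced to the assumption that the last remaining copy of a gene can never be deleted.

```python
import random


def duplicate_genome(n_genes):
    """Model a WGD: return two identical copies of a haploid chromosome.

    Each copy is a list of booleans; True means the gene is still present.
    """
    return [[True] * n_genes, [True] * n_genes]


def lose_genes(copies, n_losses, rng):
    """Delete up to n_losses genes at random from one copy or the other.

    Assumption (a crude stand-in for selection): a gene is only eligible
    for loss while it is still present in both copies, so at least one
    copy of every gene always survives.
    """
    n_genes = len(copies[0])
    candidates = [g for g in range(n_genes) if copies[0][g] and copies[1][g]]
    rng.shuffle(candidates)
    for gene in candidates[:n_losses]:
        # Pick which of the two duplicated chromosomes loses this gene.
        copies[rng.randrange(2)][gene] = False
    return copies


def summarise(copies):
    """Count genes retained in duplicate versus reduced to a single copy."""
    duplicated = sum(a and b for a, b in zip(*copies))
    single_copy = sum(a != b for a, b in zip(*copies))
    return {"duplicated": duplicated, "single_copy": single_copy}


rng = random.Random(42)
copies = lose_genes(duplicate_genome(100), n_losses=60, rng=rng)
summary = summarise(copies)
print(summary)
```

Genes still present in both copies are the retained duplicates (paralogues within this genome); which copy survives for the single-copy genes is what would make orthology assignment between descendant lineages non-trivial. Restricting crossover to non-deleted regions would need an explicit coordinate system, which this sketch deliberately leaves out.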
The "classic" scenario, according to @szhan, is to "have diploids and tetraploids with gene flow between them via triploids", i.e. "a single species with multiple cytotypes segregating".
It is unclear to me when it would become necessary to identify the four chromosomes in a tetraploid as consisting of two separate pairs, or what this means for using chromosome IDs in the GIG format (see #11).