Inference from real data: useful references #119

Open
hyanwong opened this issue Mar 31, 2024 · 2 comments

Comments

@hyanwong
Owner

hyanwong commented Mar 31, 2024

Inferring a GIG from real data is probably going to be the most difficult part of the entire GIG project. This issue is to collect ideas and references.

For a start, I've just come across the paper/software below, which references various approaches for constructing simple trees from k-mers. It strikes me that we might have to use a k-mer approach for GIG inference, as this is the only way we will be robust to different coordinate systems, so I wonder if there is anything we can use from these ideas. A web search for "alignment-free phylogeny" will probably go a long way here:

https://pubmed.ncbi.nlm.nih.gov/38547397/
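
To make the k-mer idea concrete, here is a minimal sketch of one common alignment-free measure, the Jaccard distance between k-mer sets. It is not taken from the linked paper or from any GIG code: the function names (`kmer_set`, `jaccard_distance`, `pairwise_kmer_distances`) and the default `k=4` are illustrative assumptions. The point is only that the comparison never uses positions, so it does not depend on a shared coordinate system.

```python
from itertools import combinations


def kmer_set(seq, k):
    """Return the set of all length-k substrings of seq (hypothetical helper)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_kmer_distances(seqs, k=4):
    """Map each unordered pair of sample names to its k-mer Jaccard distance.

    seqs is a dict of name -> sequence string. No alignment or coordinates
    are used, only k-mer content.
    """
    sets = {name: kmer_set(s, k) for name, s in seqs.items()}
    return {
        (x, y): jaccard_distance(sets[x], sets[y])
        for x, y in combinations(sorted(sets), 2)
    }


if __name__ == "__main__":
    # Toy example: "b" is "a" with its two halves swapped, "c" is unrelated.
    toy = {"a": "AAAACCCC", "b": "CCCCAAAA", "c": "GTGTGTGT"}
    for pair, dist in pairwise_kmer_distances(toy, k=3).items():
        print(pair, round(dist, 3))
```

A distance matrix like this could be fed to a standard tree-building method such as neighbour joining to get a simple tree, but that only gives a tree, not a GIG; how to go from k-mer content to a graph with structural variation is the open question here.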

Also, PanMAN gives a nice example of running an algorithm to produce an ancestry with structural variation, for limited recombinant ancestries such as those of SARS-CoV-2:

https://www.biorxiv.org/content/10.1101/2024.07.02.601807v1.full.pdf+html

@hyanwong
Owner Author

Raw data is available for 1000G at https://www.biorxiv.org/content/10.1101/2024.04.18.590093v1

@hyanwong
Owner Author

Richard Durbin also has a new preprint out about TE insertions in real data: https://www.biorxiv.org/content/10.1101/2024.04.05.588311v1.full
