DonFreed
Contributor

Switch from a fixed distance to an adaptive distance metric for clustering variants. Fixes #5.

- Remove `--maxdist` and replace it with a distance function that depends on the SV size
- Update the grouper to use a more sophisticated variant queue
- Add a tandem repeat (TR) BED input file. Variant groups that enter a TR region are automatically extended to the end of the tandem repeat.
The TR file was originally downloaded from the Tandem Repeats Database for hg38, "Homo sapiens HG38 (2,5,7,50, centr. excluded) Full Genome", and sorted by chromosome and position. Basic filtering removed repeats with size < 200, match_perc < 80.5, or score < 300. The file was then converted to BED format, and overlapping regions were merged.
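
For reviewers who want to see the TR handling in concrete terms, here is a minimal Python sketch of the filtering, merging, and group-extension steps described above. It is not the code in this PR: the `Repeat` record and the functions `keep_repeat`, `merge_intervals`, and `extend_group_end` are hypothetical names, and it assumes that `size` means the span of the repeat in bases and that intervals are half-open as in BED.

```python
from typing import Iterable, List, NamedTuple, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open as in BED


class Repeat(NamedTuple):
    """Hypothetical record for one tandem repeat entry."""
    chrom: str
    start: int
    end: int
    match_perc: float
    score: float


def keep_repeat(r: Repeat) -> bool:
    """Apply the thresholds from the description.

    Assumption: 'size' is the repeat span (end - start).
    """
    size = r.end - r.start
    return not (size < 200 or r.match_perc < 80.5 or r.score < 300)


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def extend_group_end(chrom: str, group_end: int,
                     tr_regions: Iterable[Interval]) -> int:
    """Extend a group's end to the end of any TR region it has entered."""
    for c, start, end in tr_regions:
        if c == chrom and start <= group_end < end:
            return end
    return group_end


repeats = [
    Repeat("chr1", 100, 400, 90.0, 500),
    Repeat("chr1", 350, 700, 85.0, 400),
    Repeat("chr1", 1000, 1100, 95.0, 900),  # dropped: size < 200
    Repeat("chr2", 10, 500, 70.0, 800),     # dropped: match_perc < 80.5
]
tr_regions = merge_intervals(
    (r.chrom, r.start, r.end) for r in repeats if keep_repeat(r)
)
```

The linear scan in `extend_group_end` keeps the sketch short; for a genome-wide TR file, a per-chromosome sorted list with binary search would be a more realistic choice.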