
Calling Structural Variants

Michael Alonge edited this page Feb 22, 2019 · 4 revisions

For your convenience, RaGOO can optionally align pseudomolecules back to the provided reference genome and call structural variants with an integrated version of Assemblytics. Although we use the Assemblytics source code, the underlying alignments are generated by Minimap2 instead of nucmer. To enable structural variant calling, use the -s flag when running ragoo.py. Please note that calling structural variants requires full Minimap2 alignments rather than mappings alone, so it adds noticeably to the overall run time.

Output

The output files for structural variant calling can be found in ragoo_output/pm_alignments. The Assemblytics calls are in assemblytics_out.Assemblytics_structural_variants.bed. This Assemblytics BED file can be converted to a VCF file using SURVIVOR (convertAssemblytics), but the last two columns must be removed first.

The last two columns represent the percentage of a variant that overlaps with a gap in either the reference or the query. I recommend first using Pandas or R to filter variants by these two columns. After filtering, one can remove those two columns and convert to VCF. Of course, if one chooses to keep all variants, regardless of their overlap with gaps, one can just remove those two columns.
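As a minimal sketch of the filter-then-trim step described above, the following uses only the Python standard library instead of Pandas, so it has no dependencies. The function names, the `max_gap_overlap` parameter and its default of 0 are hypothetical choices for illustration, not part of RaGOO. It assumes the file is tab-separated, that the two gap-overlap values are the final two fields of each variant line, and that any line whose final two fields are not numeric is a header to be kept (with the same two columns trimmed).

```python
def filter_by_gap_overlap(lines, max_gap_overlap=0.0):
    """Yield tab-separated lines with the last two columns removed.

    A variant line is kept only if both of its last two fields (the
    gap-overlap values) are <= max_gap_overlap. The threshold is a
    hypothetical parameter; pick a value that suits your data.
    """
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank or too short to be a variant record
        try:
            overlaps = (float(fields[-2]), float(fields[-1]))
        except ValueError:
            # Assumption: non-numeric trailing fields mean a header line.
            yield "\t".join(fields[:-2])
            continue
        if max(overlaps) <= max_gap_overlap:
            yield "\t".join(fields[:-2])


def filter_file(in_path, out_path, max_gap_overlap=0.0):
    """Filter the BED file at in_path and write the trimmed result to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for trimmed in filter_by_gap_overlap(src, max_gap_overlap):
            dst.write(trimmed + "\n")
```

For example, calling `filter_file` on assemblytics_out.Assemblytics_structural_variants.bed with an output path of your choosing should give a file ready for SURVIVOR. To keep all variants regardless of gap overlap, pass a threshold at least as large as the largest value in those two columns.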

Using Other Variant Callers

RaGOO provides whole genome alignments of newly created pseudomolecules against the reference in both SAM and delta format. The alignments are generated using the following Minimap2 parameters:

minimap2 -ax asm5 --cs

This produces the SAM file pm_against_ref.sam, which is then converted to a delta file (pm_against_ref.sam.delta).

This means that one can use these alignments with other downstream tools if desired. For example, one can run Assemblytics directly on the delta file with custom parameters, or use the SAM file to call variants with paftools.

Calling variants with paftools

paftools is a great utility that accompanies Minimap2. Amongst other things, it can use our alignments to call structural variants. The following should do the trick.

paftools.js sam2paf pm_against_ref.sam > asm.paf
sort -k6,6 -k8,8n asm.paf > asm.srt.paf
paftools.js call asm.srt.paf > asm.var.txt

paftools can also write the variants in VCF format; see the paftools documentation for the required options.
