norm doesn't normalize symbolic variants #1919

@davmlaw

Hi, thanks for bcftools!

The variant NC_000003.11:g.128204049_128206714del can be written in a VCF either with an explicit REF or with a symbolic ALT.

bcftools norm left-aligns the former (to position 128204042) but leaves the symbolic representation unchanged. I think it should normalize both.

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
NC_000003.11	128204048	.	GAGGAGATCAGGGAGCCATCGAAATCCCAAGATCAGACTGATTGAGTTAGAGACCCAGATTCCTTAATGGTCTGGAACCTCCGGAGTGCCTGAAACATGCACACACAACATGCACACAGACACACGTATACACATGCACACACGCTCCCAAACACATGCACATACAGAGTCACTTCCCTCGCTTCACATACTAAGTCCTGAGGTGGCTTGAATTTTCACTTAAGATTCAGGGAGGGAAAGGGGGAATTCCTGTTCTATGTTCTGGTCAGGCAGTGACCAACCCTGGGGCAAGGAACTGAACTTTGGGGGTACACTGGAAGCACTTAAGAAAATGGCAAAAGTTTTAGAGTCTCCTCTCCCTGACCCGGGGGTCTCAAACATCTGCTGGGGGCTATTAGAGCGAGACATCACCCATCCCCAGATCTGGGAAACCAACACTGCCACCTCTCCCAAGTCACAGCTCCCCACCACAAAAACGCAAATGCTCCCCTCTTCCACGAAGTCCCCAGCACCTGCCTTTACCTGAACAGGAACGAGCCTTGCTGCGCTGCTTAGGGGTGAAGCTGGAGGCCGGTCCCCCCAGGAAGCCTCCGGGGTGGAAGAGTCCGCTGCTGTAGTCGTGGGCAGCCGCCGGCACATAGGAGGGGTAGGTGGGGATGGGGTGGTGTGTAGCAGGCTGGGTGCCCATAGTAGCTAGGCCTGGGCGCAGGGGACTGCCACTTTCCATCTTCATGCTCTCCGTCAGTGACACCTGGTACTTGACGCCGTCCTTGTCCTCTCCTCGGGCTGCACTACCCCCCGCGGAAGATGAGGCTGGAGACGCAGCCCCCGTGGTGCTAGGGTCAGGAGACACTTCTTTGGGTGGCGTGGGTGGGAAGCCGAAAAGGTGGGAGCCAGAGTGGGCTGCTGTAGGGGTGAGGGAGGCCACTGAGCTCCCGCTGCCTCCCCCGCTCCCACCCCCAGCCCCTGGGTACACAGAGAGTGGGCCTCCAGGGCCTCCAGCAGCTGAGGGGTGCAGTGGCGTCTTGGAGAAGGGGCTCACGGTCCAGGGGTTGTGGTGGTGGGCCGCAGCGGCAGAGAGGGCTGCTTTGCCCCCGTCCAGCCAGGGCAAACCCGGGCTGTGCAACAAGTGTGGGCGGCACATCTGGCCTCCGGTCAGGCGGGCTGCGGGCAAAGAGAGAGAGGATCAGGGTGGGCAGAAAGATCAGGGTAGGCAGAGCTAGGGACGCCCCTGACAGACATTGAGATCACGACTCCCAGAACCAGCAGTCATCCCCTCCCCAAAGAAAGCCAGAAACATAATACCCCACCGGTAATAATCAGGAATGTCAGTCCAAGCTGAAGGACAAGTGGCATAGAAGGAACCCCACCGGACAGACCCTACAGGGAACCCTCACAGGCCAGCTGGAAGTGGGCAGAAACCCTGTGGGTCCCAGACCCTCCCCAATCGGCCGCTGCTCCCACCTCTCCCGCCCCAATTTTTCAGCAGCTCGATTCCTGCGGATCCTACATCCGGGAAGCAAGCAGACGGGCCCTCCTCCCCTCCCTCGCCTGGCGCGCGGCGCCTGGGTTCTCATCACCACGGGCCCAGTGCTCACCGTGCGCGGGGCTGTAGGAGACGCGCGCCCGCGCGTGAGCGGGGTTGGCATAGTAGGGGTTGCCCTGCGAGTCGAGGTGATTGAAGAAGACGTCCACCTCGTCTGGAGGCAGCAGCTGCGCGGGTTCCATGTAGTTGTGCGCCAGGCCCGGGTGGTGTGAGTCGGGGTGCTGCGCATTCAGCACGGCCGGGTGCGCCATCCAGCGCGGCTGCTCGGGCGCCACCTCCATGGCCGGCGGCGGCGGCTCAGGGTCTGGGTGCAGACGGCAACGGCCCTGCGCGAGGAAGGGGGAGTGAGGCGTGCCGCCAGCGCCTGACACCCCCCAAAGTCCCACCACGAGGTGTCCCGCACGCCACGGAGCCCCAGCCCAGATCCGG
CGAGAAAGAGCACCAGTCCCGGGTGGGAGGAAAGCCCAAGGCTCAAAACGAAAGGAAGGCGGGGGAGGGGGTTCAGCCACGCACACTCACGTGGTGACCCGCGGCTCCAGAATCACACACCCGTGCACATGGGGTCACGCCCGGGGACGGGTCCCGACACCAGTGACCCCAACAAACGCACAGAGCAGCACTTCAGTCAGACACTCACACTGAGCCCCCCCGCCCGGTAGACAAACACATGAACACAGACTCAAAAGTTGGAGACAGGCGCCCGGGCACCCAGTGTGGCACTTGATCCCAGCGACACGCACACACCCACACTTGGCGCCAGATACACATACTGATCTCAACCCCGAAAACATGCACACGCAGCCCCCTGAGCGCAGTACTAAGCGGCACAATCAGGACCTCTCAACAAAGCACACCAAAGCAGTCGCCCGCAGCCTGGCCCCCCGCCCTAAGTCCCCCCAGAGTCCCCTCAAAGCTAGGAGCGCCCCAGGCCCCCAGCCGGCTCTCAAACCCCAAACTTACACACGCAGCCGTGGGGAGGGGAGGGACTCGGCCTCTGAGAGTGAAGGAGTTCCGGCGGGAGCCCCGAGGGCGACGGGCCCAGGGACAGCACGTCCGGAGGCTGGCGGGGCTTACAGGGTAGGAGCTGGGGGTAGAGTGCGCCTCGGCCTCGGGCCCTCCCG	G	.	PASS	.
NC_000003.11	128204048	.	G	<DEL>	.	PASS	END=128206714;SVTYPE=DEL;SVLEN=-2666
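
For reference, the POS, END and SVLEN values in the symbolic record above follow from the HGVS coordinates by simple arithmetic. A minimal Python sketch (the helper name `hgvs_del_to_symbolic` is hypothetical and is not part of bcftools):

```python
def hgvs_del_to_symbolic(start, end):
    """Map an HGVS g.START_ENDdel interval (1-based, inclusive) to the
    POS / END / SVLEN fields of a symbolic <DEL> record."""
    pos = start - 1              # VCF anchors a deletion on the preceding base
    svlen = -(end - start + 1)   # number of deleted bases, negative for a deletion
    return {"POS": pos, "END": end, "SVLEN": svlen}


fields = hgvs_del_to_symbolic(128204049, 128206714)
print(fields)  # {'POS': 128204048, 'END': 128206714, 'SVLEN': -2666}
```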

I have attached this as a vcf (with .txt extension)
indel_normalise_test.GRCh37.txt
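
To illustrate what I mean by normalizing both forms, here is a rough Python sketch of left-aligning a deletion interval against a reference string and then emitting both representations from the shifted interval. This is only an illustration on a toy sequence, not how bcftools implements `norm`; the function names and the toy reference are made up for the example.

```python
def left_align_deletion(ref_seq, start, end):
    """Shift the deleted interval [start, end] (1-based, inclusive) left while
    the resulting allele stays the same. Moving one base left is safe when the
    base just before the interval equals the last deleted base."""
    # start > 2 keeps at least one reference base before the deletion
    # to serve as the VCF anchor base.
    while start > 2 and ref_seq[start - 2] == ref_seq[end - 1]:
        start -= 1
        end -= 1
    return start, end


def normalized_records(ref_seq, start, end):
    """Return (explicit, symbolic) record fields for the left-aligned deletion."""
    start, end = left_align_deletion(ref_seq, start, end)
    pos = start - 1                      # anchor base position
    anchor = ref_seq[pos - 1]
    explicit = (pos, ref_seq[pos - 1:end], anchor)
    info = f"END={end};SVTYPE=DEL;SVLEN={-(end - start + 1)}"
    symbolic = (pos, anchor, "<DEL>", info)
    return explicit, symbolic


toy_ref = "TTGCACACAGG"                  # hypothetical reference sequence
explicit, symbolic = normalized_records(toy_ref, 8, 9)   # delete "CA" at 8-9
print(explicit)   # (3, 'GCA', 'G')
print(symbolic)   # (3, 'G', '<DEL>', 'END=5;SVTYPE=DEL;SVLEN=-2')
```

Both records end up at the same POS because the shift is computed once on the interval, which only requires access to the reference sequence, not an explicit REF allele in the record.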
