Closed
Description
I have discovered that bcftools merge does not consider the END position of symbolic variants, so records that share a start position but have different END values are combined into a single record (a vertical merge).
The documentation states that the command is not intended for vertical merges, which I took to mean that it would never perform one, but in some cases it does.
Example
A.vcf
##fileformat=VCFv4.1
##contig=<ID=chr1,length=248956422>
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of structural variation">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample
chr1 147022730 SV1 N <DEL> . PASS SVLEN=-570334;END=147593064 GT 0/1
B.vcf
##fileformat=VCFv4.1
##contig=<ID=chr1,length=248956422>
##FILTER=<ID=PASS,Description="All filters passed">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of structural variation">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Other
chr1 147022730 SV2 N <DEL> . PASS SVLEN=-990414;END=148013144 GT 1/1
bcftools merge --no-index -m none A.vcf B.vcf
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1,length=248956422>
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of structural variation">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##bcftools_mergeVersion=1.21+htslib-1.21
##bcftools_mergeCommand=merge --no-index -m none A.vcf B.vcf; Date=Tue Jan 28 15:57:35 2025
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample Other
chr1 147022730 SV1;SV2 N <DEL> . PASS SVLEN=-570334;END=147593064 GT 0/1 1/1
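To make the problem concrete, here is a minimal Python sketch that compares the two input records. It assumes, based only on the output above, that the records are matched on CHROM, POS, REF and ALT; this is not taken from the bcftools source. The helper names `parse_info` and `sv_key` are hypothetical.

```python
def parse_info(info):
    """Turn 'K1=V1;K2=V2;FLAG' into a dict (flags map to True)."""
    out = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def sv_key(line):
    """Return (assumed_match_key, end) for one VCF data line.

    Splits on any whitespace, which is enough for the simple records
    in this report.
    """
    fields = line.split()
    chrom, pos, _id, ref, alt = fields[:5]
    info = parse_info(fields[7])
    end = int(info["END"]) if "END" in info else None
    # Assumption: matching uses CHROM, POS, REF and ALT only.
    return (chrom, int(pos), ref, alt), end


a = "chr1 147022730 SV1 N <DEL> . PASS SVLEN=-570334;END=147593064 GT 0/1"
b = "chr1 147022730 SV2 N <DEL> . PASS SVLEN=-990414;END=148013144 GT 1/1"

key_a, end_a = sv_key(a)
key_b, end_b = sv_key(b)

same_site = key_a == key_b       # identical CHROM/POS/REF/ALT
different_span = end_a != end_b  # but the deletions end at different positions
print(same_site, different_span)
```

Both checks are true for SV1 and SV2: the records look identical under the assumed key even though the deletions differ in length by roughly 420 kb, and the merged record keeps only the SVLEN and END of SV1.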
A temporary workaround is to use -m id:
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1,length=248956422>
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of structural variation">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##bcftools_mergeVersion=1.21+htslib-1.21
##bcftools_mergeCommand=merge --no-index -m id A.vcf B.vcf; Date=Tue Jan 28 15:58:26 2025
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample Other
chr1 147022730 SV1 N <DEL> . PASS SVLEN=-570334;END=147593064 GT 0/1 ./.
chr1 147022730 SV2 N <DEL> . PASS SVLEN=-990414;END=148013144 GT ./. 1/1
However, assigning unique IDs to variants across files/experiments is non-trivial.
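One possible way to get IDs that work with -m id without coordinating across experiments is to derive the ID from the record's own coordinates. The sketch below is only an illustration under assumptions: the ID scheme CHROM_POS_END_TYPE and the helper names `coordinate_id` and `set_id` are hypothetical, and the approach has not been tested against bcftools here. It also ignores anything beyond the symbolic ALT type, so it would not distinguish sequence-resolved alleles.

```python
def coordinate_id(chrom, pos, alt, info):
    """Build a deterministic ID from CHROM, POS, END and the ALT type."""
    # partition("=")[::2] yields (key, value); flags get an empty value.
    fields = dict(item.partition("=")[::2] for item in info.split(";"))
    end = fields.get("END", pos)   # fall back to POS when END is absent
    svtype = alt.strip("<>")       # "<DEL>" -> "DEL"
    return f"{chrom}_{pos}_{end}_{svtype}"


def set_id(line):
    """Return the VCF data line with its ID column replaced."""
    fields = line.split()          # assumes no whitespace inside fields
    fields[2] = coordinate_id(fields[0], fields[1], fields[4], fields[7])
    return "\t".join(fields)


a = "chr1 147022730 SV1 N <DEL> . PASS SVLEN=-570334;END=147593064 GT 0/1"
b = "chr1 147022730 SV2 N <DEL> . PASS SVLEN=-990414;END=148013144 GT 1/1"

id_a = set_id(a).split("\t")[2]
id_b = set_id(b).split("\t")[2]
print(id_a)
print(id_b)
```

With IDs built this way, the two deletions above get different IDs, while an identical call appearing in two files would get the same ID in both. Header lines (those starting with #) would need to be passed through unchanged.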
Note that the vertical merge happens with or without --no-index.
Original reporter: ACEnglish/truvari#256