Issue with Bcftools Merge Commands Before Truvari Collapse #256

@yueyaog

Hi @ACEnglish,

Thank you for creating Truvari; it has been incredibly helpful for SV analysis! However, I ran into an issue with the bcftools command that the wiki page recommends running before truvari collapse.

To start, we merge multiple VCFs (each with their own sample) and ensure there are no multi-allelic entries via:

bcftools merge -m none one.vcf.gz two.vcf.gz | bgzip > merge.vcf.gz

However, this command can merge non-identical SV events into a single record if they share the same chromosome and start position and, as here, the same REF and symbolic ALT (<DEL>). Two different variants then end up on one line, and the merged record keeps only the first file's INFO fields (SVLEN, END), which could lead to interpretation issues with truvari collapse later. For example:

>> one.vcf:
chr1	147022730	DRAGEN:LOSS:chr1:147022731-147593064	N	<DEL>	84	PASS	SVLEN=-570334;SVTYPE=CNV;END=147593064;REFLEN=570334;RIGHT_BND=DRAGEN:DEL:6916:0:1:0:0:0;OrigCnvEnd=147593080;SVCLAIM=DJ	GT:SM:CN:BC:GC:CT:AC:PE	0/1:0.506255:1:456:0.403456:0.498899:0.503649:3,12
>> two.vcf:
chr1	147022730	DRAGEN:LOSS:chr1:147022731-148013144	N	<DEL>	93	PASS	SVLEN=-990414;SVTYPE=CNV;END=148013144;REFLEN=990414;SVCLAIM=D	GT:SM:CN:BC:GC:CT:AC:PE	0/1:0.513739:1:751:0.411265:0.499147:0.502668:1,1

>> merged.vcf:
chr1	147022730	DRAGEN:LOSS:chr1:147022731-147593064;DRAGEN:LOSS:chr1:147022731-148013144	N	<DEL>	93	PASS	SVLEN=-570334;SVTYPE=CNV;END=147593064;REFLEN=570334;RIGHT_BND=DRAGEN:DEL:6916:0:1:0:0:0;OrigCnvEnd=147593080;SVCLAIM=DJ	GT:SM:CN:BC:GC:CT:AC:PE	0/1:0.506255:1:456:0.403456:0.498899:0.503649:3,12	0/1:0.513739:1:751:0.411265:0.499147:0.502668:1,1
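
To see ahead of time which records are at risk, the two inputs can be scanned for sites that share CHROM, POS, REF, and ALT but carry different IDs. The snippet below is only a rough sketch: `find_collisions` is a hypothetical helper name, it assumes `gzip -cdf` can read the bgzipped inputs (and passes plain-text VCFs through unchanged, as GNU gzip does), and it remembers only the last record per site from the first file.

```shell
# Hypothetical helper: report sites where two VCFs share CHROM, POS, REF and ALT
# but have different IDs, i.e. records that resemble the collision shown above.
# Usage (in a comment so the block has no side effects):
#   find_collisions one.vcf.gz two.vcf.gz
find_collisions() {
    {
        # Tag each data line with its source (A or B) and keep only the key columns.
        gzip -cdf "$1" | awk -F'\t' -v OFS='\t' '!/^#/ { print "A", $1, $2, $4, $5, $3 }'
        gzip -cdf "$2" | awk -F'\t' -v OFS='\t' '!/^#/ { print "B", $1, $2, $4, $5, $3 }'
    } | awk -F'\t' -v OFS='\t' '
        { key = $2 OFS $3 OFS $4 OFS $5 }        # CHROM, POS, REF, ALT
        $1 == "A" { first[key] = $6; next }      # remember the ID seen in file A
        (key in first) && first[key] != $6 {     # same site in file B, different ID
            print key, first[key], $6
        }'
}
```

Each output line lists CHROM, POS, REF, ALT, and then the two differing IDs. For the one.vcf and two.vcf records above it would report the chr1:147022730 site with both DRAGEN IDs.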

To resolve this, I used the -m id flag instead, which merges records only if they share the same ID:

bcftools merge -m id one.vcf.gz two.vcf.gz | bgzip > merge.vcf.gz
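
A quick way to check an already merged file for this problem is to look for records whose ID column holds several semicolon-joined IDs, as in the merged.vcf example above. This is a minimal sketch, not a definitive test: `joined_ids` is a hypothetical helper name, `gzip -cdf` is assumed to read the bgzipped file, and because VCF allows a single record to list several IDs legitimately, any hits are candidates to inspect rather than proof of a bad merge.

```shell
# Hypothetical helper: print CHROM, POS and ID for merged records whose ID
# column contains a semicolon (several input IDs joined into one record).
# Usage (in a comment so the block has no side effects):
#   joined_ids merge.vcf.gz
joined_ids() {
    gzip -cdf "$1" | awk -F'\t' -v OFS='\t' '!/^#/ && $3 ~ /;/ { print $1, $2, $3 }'
}
```

If a file merged with -m none reports such records while the same inputs merged with -m id do not, the distinct events were kept on separate lines.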

Would it be possible to update the wiki to reflect this alternative command? I believe it could save future users some time and help ensure accurate results from truvari collapse.

Thanks again for your great work!