bcftools norm prints summary stats, e.g.:
Lines total/split/joined/realigned/skipped: 2/0/0/0/0
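It would be nice if this also included a count of removed duplicates (named "removed_dupes", "rmdup", or similar) when --remove-duplicates is used.
Example: a sample file with duplicate records (dupes.vcf):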
##fileformat=VCFv4.0
##contig=<ID=1,length=249250621,assembly=hg19>
#CHROM POS ID REF ALT QUAL FILTER INFO
1 99167215 rs5776431 CA C 4675.23 PASS .
1 99167215 rs5776431 CA C 4675.23 PASS .
Command:
bcftools norm -f /data/annotation/fasta/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz --remove-duplicates dupes.vcf
Output:
Lines total/split/joined/realigned/skipped: 2/0/0/0/0
Suggested output:
Lines total/split/joined/realigned/skipped/removed_dupes: 2/0/0/0/0/1
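
To make the expected count concrete, here is a rough Python sketch that tallies how many records in the sample would be dropped as duplicates. It is only an illustration, not how bcftools works internally: the function name `count_removed_dupes` is hypothetical, and it assumes a "duplicate" means a later record with the same CHROM, POS, REF and ALT as an earlier one, which may not match the exact rule bcftools applies.

```python
def count_removed_dupes(vcf_lines):
    """Return (total_records, removed_duplicates) for an iterable of VCF lines.

    Assumption: a record is a duplicate if an earlier record has the same
    CHROM, POS, REF and ALT. bcftools may use a different rule.
    """
    seen = set()
    total = 0
    removed = 0
    for line in vcf_lines:
        line = line.strip()
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line or line.startswith("#"):
            continue
        total += 1
        # Split on any whitespace so both tab- and space-separated samples work.
        chrom, pos, _id, ref, alt = line.split()[:5]
        key = (chrom, pos, ref, alt)
        if key in seen:
            removed += 1  # only the first instance would be kept
        else:
            seen.add(key)
    return total, removed


sample = """\
##fileformat=VCFv4.0
##contig=<ID=1,length=249250621,assembly=hg19>
#CHROM POS ID REF ALT QUAL FILTER INFO
1 99167215 rs5776431 CA C 4675.23 PASS .
1 99167215 rs5776431 CA C 4675.23 PASS .
"""

total, removed = count_removed_dupes(sample.splitlines())
print(f"Lines total/removed_dupes: {total}/{removed}")
```

For the sample file above this reports 2 total records and 1 removed duplicate, matching the trailing 1 in the suggested output.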