Commit 4782f89

Fix samtools#963, update documentation.
1 parent f1ea770 commit 4782f89

9 files changed, 76 additions and 18 deletions

doc/bcftools.1

Lines changed: 15 additions & 5 deletions
@@ -2,12 +2,12 @@
 .\" Title: bcftools
 .\" Author: [see the "AUTHORS" section]
 .\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
-.\" Date: 2019-02-24 15:39 GMT
+.\" Date: 2019-03-10 16:00 GMT
 .\" Manual: \ \&
 .\" Source: \ \&
 .\" Language: English
 .\"
-.TH "BCFTOOLS" "1" "2019\-02\-24 15:39 GMT" "\ \&" "\ \&"
+.TH "BCFTOOLS" "1" "2019\-03\-10 16:00 GMT" "\ \&" "\ \&"
 .\" -----------------------------------------------------------------
 .\" * Define some portability stuff
 .\" -----------------------------------------------------------------
@@ -41,7 +41,7 @@ Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatica
 BCFtools is designed to work on a stream\&. It regards an input file "\-" as the standard input (stdin) and outputs to the standard output (stdout)\&. Several commands can thus be combined with Unix pipes\&.
 .SS "VERSION"
 .sp
-This manual page was last updated \fB2019\-02\-24 15:39 GMT\fR and refers to bcftools git version \fB1\&.9\-104\-g22f4a3a+\fR\&.
+This manual page was last updated \fB2019\-03\-10 16:00 GMT\fR and refers to bcftools git version \fB1\&.9\-111\-gf1ea770+\fR\&.
 .SS "BCF1"
 .sp
 The BCF1 format output by versions of samtools <= 0\&.1\&.19 is \fBnot\fR compatible with this version of bcftools\&. To read BCF1 files one can use the view command from old versions of bcftools packaged with samtools versions <= 0\&.1\&.19 to convert to VCF, which can then be read by this version of bcftools\&.
@@ -967,8 +967,8 @@ take advantage of prior knowledge of population allele frequencies\&. The workfl
 # Extract AN,AC values from an existing VCF, such 1000Genomes
 bcftools query \-f\*(Aq%CHROM\et%POS\et%REF\et%ALT\et%AN\et%AC\en\*(Aq 1000Genomes\&.bcf | bgzip \-c > AFs\&.tab\&.gz

-# If the tags AN,AC are not already present, use the +fill\-AN\-AC plugin
-bcftools +fill\-AN\-AC 1000Genomes\&.bcf | bcftools query \-f\*(Aq%CHROM\et%POS\et%REF\et%ALT\et%AN\et%AC\en\*(Aq | bgzip \-c > AFs\&.tab\&.gz
+# If the tags AN,AC are not already present, use the +fill\-tags plugin
+bcftools +fill\-tags 1000Genomes\&.bcf | bcftools query \-f\*(Aq%CHROM\et%POS\et%REF\et%ALT\et%AN\et%AC\en\*(Aq | bgzip \-c > AFs\&.tab\&.gz
 tabix \-s1 \-b2 \-e2 AFs\&.tab\&.gz

 # Create a VCF header description, here we name the tags REF_AN,REF_AC
@@ -3420,6 +3420,11 @@ sets missing genotypes ("\&./\&.") to ref allele ("0/0" or "0|0")
 prune sites by missingness or linkage disequilibrium
 .RE
 .PP
+\fBremove\-overlaps\fR
+.RS 4
+remove overlapping variants and duplicate sites
+.RE
+.PP
 \fBsetGT\fR
 .RS 4
 general tool to set genotypes according to rules requested by the user
@@ -3430,6 +3435,11 @@ general tool to set genotypes according to rules requested by the user
 calculates basic per\-sample stats
 .RE
 .PP
+\fBsplit\fR
+.RS 4
+split VCF by sample, creating single\-sample VCFs
+.RE
+.PP
 \fBtag2tag\fR
 .RS 4
 convert between similar tags, such as GL and GP

doc/bcftools.html

Lines changed: 12 additions & 4 deletions
@@ -1,14 +1,14 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>bcftools</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.76.1" /></head><body><div xml:lang="en" class="refentry" title="bcftools" lang="en"><a id="idp25137360"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>bcftools — utilities for variant calling and manipulating VCFs and BCFs.</p></div><div class="refsynopsisdiv" title="Synopsis"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>bcftools</strong></span> [--version|--version-only] [--help] [<span class="emphasis"><em>COMMAND</em></span>] [<span class="emphasis"><em>OPTIONS</em></span>]</p></div><div class="refsect1" title="DESCRIPTION"><a id="_description"></a><h2>DESCRIPTION</h2><p>BCFtools is a set of utilities that manipulate variant calls in the Variant
+<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>bcftools</title><link rel="stylesheet" type="text/css" href="docbook-xsl.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.76.1" /></head><body><div xml:lang="en" class="refentry" title="bcftools" lang="en"><a id="idp25135776"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>bcftools — utilities for variant calling and manipulating VCFs and BCFs.</p></div><div class="refsynopsisdiv" title="Synopsis"><a id="_synopsis"></a><h2>Synopsis</h2><p><span class="strong"><strong>bcftools</strong></span> [--version|--version-only] [--help] [<span class="emphasis"><em>COMMAND</em></span>] [<span class="emphasis"><em>OPTIONS</em></span>]</p></div><div class="refsect1" title="DESCRIPTION"><a id="_description"></a><h2>DESCRIPTION</h2><p>BCFtools is a set of utilities that manipulate variant calls in the Variant
 Call Format (VCF) and its binary counterpart BCF. All commands work
 transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.</p><p>Most commands accept VCF, bgzipped VCF and BCF with filetype detected
 automatically even when streaming from a pipe. Indexed VCF and BCF
 will work in all situations. Un-indexed VCF and BCF and streams will
 work in most, but not all situations. In general, whenever multiple VCFs are
 read simultaneously, they must be indexed and therefore also compressed.</p><p>BCFtools is designed to work on a stream. It regards an input file "-" as the
 standard input (stdin) and outputs to the standard output (stdout). Several
-commands can thus be combined with Unix pipes.</p><div class="refsect2" title="VERSION"><a id="_version"></a><h3>VERSION</h3><p>This manual page was last updated <span class="strong"><strong>2019-02-24 15:39 GMT</strong></span> and refers to bcftools git version <span class="strong"><strong>1.9-104-g22f4a3a+</strong></span>.</p></div><div class="refsect2" title="BCF1"><a id="_bcf1"></a><h3>BCF1</h3><p>The BCF1 format output by versions of samtools &lt;= 0.1.19 is <span class="strong"><strong>not</strong></span>
+commands can thus be combined with Unix pipes.</p><div class="refsect2" title="VERSION"><a id="_version"></a><h3>VERSION</h3><p>This manual page was last updated <span class="strong"><strong>2019-03-10 16:00 GMT</strong></span> and refers to bcftools git version <span class="strong"><strong>1.9-111-gf1ea770+</strong></span>.</p></div><div class="refsect2" title="BCF1"><a id="_bcf1"></a><h3>BCF1</h3><p>The BCF1 format output by versions of samtools &lt;= 0.1.19 is <span class="strong"><strong>not</strong></span>
 compatible with this version of bcftools. To read BCF1 files one can use
 the view command from old versions of bcftools packaged with samtools
 versions &lt;= 0.1.19 to convert to VCF, which can then be read by
@@ -478,8 +478,8 @@
 </dd></dl></div><pre class="screen"> # Extract AN,AC values from an existing VCF, such 1000Genomes
 bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' 1000Genomes.bcf | bgzip -c &gt; AFs.tab.gz

-# If the tags AN,AC are not already present, use the +fill-AN-AC plugin
-bcftools +fill-AN-AC 1000Genomes.bcf | bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' | bgzip -c &gt; AFs.tab.gz
+# If the tags AN,AC are not already present, use the +fill-tags plugin
+bcftools +fill-tags 1000Genomes.bcf | bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' | bgzip -c &gt; AFs.tab.gz
 tabix -s1 -b2 -e2 AFs.tab.gz

 # Create a VCF header description, here we name the tags REF_AN,REF_AC
@@ -2075,6 +2075,10 @@
 </span></dt><dd>
 prune sites by missingness or linkage disequilibrium
 </dd><dt><span class="term">
+<span class="strong"><strong>remove-overlaps</strong></span>
+</span></dt><dd>
+remove overlapping variants and duplicate sites
+</dd><dt><span class="term">
 <span class="strong"><strong>setGT</strong></span>
 </span></dt><dd>
 general tool to set genotypes according to rules requested by the user
@@ -2083,6 +2087,10 @@
 </span></dt><dd>
 calculates basic per-sample stats
 </dd><dt><span class="term">
+<span class="strong"><strong>split</strong></span>
+</span></dt><dd>
+split VCF by sample, creating single-sample VCFs
+</dd><dt><span class="term">
 <span class="strong"><strong>tag2tag</strong></span>
 </span></dt><dd>
 convert between similar tags, such as GL and GP

doc/bcftools.txt

Lines changed: 5 additions & 2 deletions
@@ -528,8 +528,8 @@ demand. The original calling model can be invoked with the *-c* option.
 # Extract AN,AC values from an existing VCF, such 1000Genomes
 bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' 1000Genomes.bcf | bgzip -c > AFs.tab.gz

-# If the tags AN,AC are not already present, use the +fill-AN-AC plugin
-bcftools +fill-AN-AC 1000Genomes.bcf | bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' | bgzip -c > AFs.tab.gz
+# If the tags AN,AC are not already present, use the +fill-tags plugin
+bcftools +fill-tags 1000Genomes.bcf | bcftools query -f'%CHROM\t%POS\t%REF\t%ALT\t%AN\t%AC\n' | bgzip -c > AFs.tab.gz
 tabix -s1 -b2 -e2 AFs.tab.gz

 # Create a VCF header description, here we name the tags REF_AN,REF_AC
@@ -2092,6 +2092,9 @@ By default, appropriate system directories are searched for installed plugins.
 *smpl-stats*::
 calculates basic per-sample stats

+*split*::
+split VCF by sample, creating single-sample VCFs
+
 *tag2tag*::
 convert between similar tags, such as GL and GP

plugins/fill-AN-AC.c

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ int *arr = NULL, marr = 0;

 const char *about(void)
 {
-    return "Fill INFO fields AN and AC.\n";
+    return "Fill INFO fields AN and AC. This plugin is DEPRECATED, use fill-tags instead.\n";
 }

 int init(int argc, char **argv, bcf_hdr_t *in, bcf_hdr_t *out)

plugins/fill-tags.c

Lines changed: 9 additions & 5 deletions
@@ -509,8 +509,10 @@ bcf1_t *process(bcf1_t *rec)
                 args->farr[j-1] += pop->counts[j].nhet + pop->counts[j].nhom + pop->counts[j].nhemi + pop->counts[j].nac;
             an = pop->counts[0].nhet + pop->counts[0].nhom + pop->counts[0].nhemi + pop->counts[0].nac;
             for (j=1; j<rec->n_allele; j++) an += args->farr[j-1];
-            if ( !an ) continue;
-            for (j=1; j<rec->n_allele; j++) args->farr[j-1] /= an;
+            if ( an )
+                for (j=1; j<rec->n_allele; j++) args->farr[j-1] /= an;
+            else
+                for (j=1; j<rec->n_allele; j++) bcf_float_set_missing(args->farr[j-1]);
         }
         if ( args->tags & SET_AF )
         {
@@ -521,9 +523,11 @@ bcf1_t *process(bcf1_t *rec)
         }
         if ( args->tags & SET_MAF )
         {
-            if ( !an ) continue;
-            for (j=1; j<rec->n_allele; j++)
-                if ( args->farr[j-1] > 0.5 ) args->farr[j-1] = 1 - args->farr[j-1]; // todo: this is incorrect for multiallelic sites
+            if ( an )
+            {
+                for (j=1; j<rec->n_allele; j++)
+                    if ( args->farr[j-1] > 0.5 ) args->farr[j-1] = 1 - args->farr[j-1]; // todo: this is incorrect for multiallelic sites
+            }
             args->str.l = 0;
             ksprintf(&args->str, "MAF%s", args->pop[i].suffix);
             if ( bcf_update_info_float(args->out_hdr,rec,args->str.s,args->farr,rec->n_allele-1)!=0 )
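
The behavioural change in the two hunks above can be read as: when no alleles were observed (AN=0), the frequencies are marked as missing instead of the tags being skipped via `continue`. The following is a minimal standalone sketch of that idea, not the plugin's code. The names `allele_freqs`, `fold_to_maf` and `set_missing` are hypothetical, and NaN is used here as a stand-in for the htslib missing-value marker so that the sketch compiles without htslib.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for htslib's bcf_float_set_missing(); this sketch
 * uses NaN as the "missing" marker so that it builds without htslib. */
static void set_missing(float *x) { *x = NAN; }

/*
 * counts[0] is the number of observed REF alleles, counts[j] the number of
 * observed copies of ALT allele j. On return af[j-1] holds the frequency of
 * ALT allele j, or the missing marker when no alleles were observed.
 * Returns AN, the total number of observed alleles.
 */
int allele_freqs(const int *counts, int n_allele, float *af)
{
    assert(n_allele >= 1);
    int an = 0;
    for (int j = 0; j < n_allele; j++) an += counts[j];
    for (int j = 1; j < n_allele; j++) {
        if (an)
            af[j-1] = (float)counts[j] / an;
        else
            set_missing(&af[j-1]);  /* AN=0: report a missing value instead of skipping the tag */
    }
    return an;
}

/*
 * Fold frequencies above 0.5, as the MAF branch does. Nothing is folded when
 * an == 0, so missing values pass through unchanged. As the original TODO
 * notes, this folding is not correct for multiallelic sites.
 */
void fold_to_maf(float *af, int n_alt, int an)
{
    if (!an) return;
    for (int j = 0; j < n_alt; j++)
        if (af[j] > 0.5f) af[j] = 1 - af[j];
}
```

With this shape, a site where every genotype is missing still gets AF and MAF entries, written as missing values, which matches the `AF=.;MAF=.` seen in the updated test expectations.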

test/fill-tags-AN0.out

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+##fileformat=VCFv4.2
+##FILTER=<ID=PASS,Description="All filters passed">
+##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
+##contig=<ID=chr1,length=249250621,assembly=hg19>
+##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
+##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes">
+##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of samples with data">
+##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">
+##INFO=<ID=AC_Hom,Number=A,Type=Integer,Description="Allele counts in homozygous genotypes">
+##INFO=<ID=AC_Het,Number=A,Type=Integer,Description="Allele counts in heterozygous genotypes">
+##INFO=<ID=AC_Hemi,Number=A,Type=Integer,Description="Allele counts in hemizygous genotypes">
+##INFO=<ID=MAF,Number=A,Type=Float,Description="Minor Allele frequency">
+##INFO=<ID=HWE,Number=A,Type=Float,Description="HWE test (PMID:15789306); 1=good, 0=bad">
+##INFO=<ID=ExcHet,Number=A,Type=Float,Description="Test excess heterozygosity; 1=good, 0=bad">
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMP_A SAMP_B SAMP_C SAMP_E
+chr1 10146 . AC A . PASS NS=2;AN=4;AF=0.5;AC=2;MAF=0.5;AC_Het=2;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=0.666667 GT ./. 0/1 ./. 0/1
+chr1 10153 . A C . PASS NS=0;AN=0;AF=.;AC=0;MAF=.;AC_Het=0;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1 GT ./. ./. ./. ./.
+chr1 10154 . C G . PASS NS=0;AN=0;AF=.;AC=0;MAF=.;AC_Het=0;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1 GT ./. ./. ./. ./.
+chr1 10172 . C A . PASS NS=3;AN=6;AF=0;AC=0;MAF=0;AC_Het=0;AC_Hom=0;AC_Hemi=0;HWE=1;ExcHet=1 GT 0/0 0/0 0/0 ./.
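
The NS, AN and AC values in the expected output above can be checked by hand from the GT columns. The sketch below does that counting for a biallelic site. It is a simplified illustration under stated assumptions, not how the plugin parses genotypes: the `site_counts_t` type and `count_biallelic` function are hypothetical, genotypes are taken as plain strings such as `0/1` or `./.`, and a sample is counted towards NS as soon as it has at least one called allele.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for per-site counts at a biallelic site. */
typedef struct { int ns, an, ac; } site_counts_t;

/*
 * Count NS (samples with at least one called allele), AN (called alleles)
 * and AC (called non-reference alleles) from GT strings such as "0/1",
 * "1|1" or "./.". Missing alleles (".") contribute nothing.
 */
site_counts_t count_biallelic(const char *const *gts, int n_samples)
{
    assert(gts != NULL);
    site_counts_t c = {0, 0, 0};
    for (int i = 0; i < n_samples; i++) {
        int called = 0;
        for (const char *p = gts[i]; *p; ) {
            if (isdigit((unsigned char)*p)) {
                char *end;
                long allele = strtol(p, &end, 10);  /* allele index, 0 = REF */
                c.an++;
                if (allele > 0) c.ac++;
                called = 1;
                p = end;
            } else {
                p++;  /* skip '.', '/' and '|' */
            }
        }
        c.ns += called;
    }
    return c;
}
```

For the first record (`./. 0/1 ./. 0/1`) this gives NS=2, AN=4, AC=2, and for the all-missing records it gives NS=0, AN=0, AC=0, in line with the expected output.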

test/fill-tags-AN0.vcf

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+##fileformat=VCFv4.2
+##FILTER=<ID=PASS,Description="All filters passed">
+##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
+##contig=<ID=chr1,length=249250621,assembly=hg19>
+##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
+##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes">
+##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of samples with data">
+##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">
+#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMP_A SAMP_B SAMP_C SAMP_E
+chr1 10146 . AC A . PASS NS=3;AN=6;AF=0.5;AC=3 GT ./. 0/1 ./. 0/1
+chr1 10153 . A C . PASS NS=1;AN=2;AF=0.5;AC=2 GT ./. ./. ./. ./.
+chr1 10154 . C G . PASS NS=2;AN=4;AF=0.25;AC=1 GT ./. ./. ./. ./.
+chr1 10172 . C A . PASS NS=3;AN=6;AF=0;AC=0 GT 0/0 0/0 0/0 ./.

test/fill-tags.2.out

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@
 ##INFO=<ID=MAF,Number=A,Type=Float,Description="Minor Allele frequency">
 #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003
 11 2343543 . A . 999 PASS DP=100223;NS=3;AN=6 GT:PL:DP:GQ 0/0:0:193:99 0/0:0:211:99 0/0:0:182:99
-11 5464562 . C T 999 PASS DP=0;NS=0;AN=0;AC=0 GT:PL:DP:GQ ./.:0,0,0:.:. ./.:0,0,0:.:. ./.:0,0,0:.:.
+11 5464562 . C T 999 PASS DP=0;NS=0;AN=0;AF=.;MAF=.;AC=0 GT:PL:DP:GQ ./.:0,0,0:.:. ./.:0,0,0:.:. ./.:0,0,0:.:.
 20 76962 rs6111385 T C 999 PASS DP4=110138,70822,421911,262673;DP=911531;Dels=0;FS=21.447;HWE=0.491006;ICF=-0.01062;MQ0=1;MQ=46;PV4=2.5e-09,0,0,1;QD=22.31;NS=3;AN=6;AF=0.833333;MAF=0.166667;AC=5 GT:PL:DP:GQ 0/1:255,0,255:193:99 1/1:255,255,0:211:99 1/1:255,255,0:182:99
 20 126310 . ACC A 999 StrandBias;EndDistBias DP4=125718,95950,113812,80890;DP=461867;HWE=0.24036;ICF=0.01738;INDEL;IS=374,0.937343;MQ=49;PV4=9e-30,1,0,3.8e-13;QD=0.0172;AN=6;AC=4;NS=3;AF=0.666667;MAF=0.333333 GT:DP:GQ:PL 0/1:117:99:255,0,132 0/1:111:99:255,0,139 1/1:78:99:255,213,0
 20 138125 rs2298108 G T 999 PASS DP4=174391,20849,82080,4950;DP=286107;Dels=0;FS=3200;HWE=0.199462;ICF=0.01858;MQ0=0;MQ=46;PV4=0,0,0,1;QD=17.22;AN=6;AC=4;NS=3;AF=0.666667;MAF=0.333333 GT:PL:DP:GQ 0/1:135,0,163:66:99 0/1:140,0,255:71:99 1/1:255,199,0:66:99

test/test.pl

Lines changed: 1 addition & 0 deletions
@@ -318,6 +318,7 @@
 test_vcf_plugin($opts,in=>'fill-tags-hemi',out=>'fill-tags-hemi.1.out',cmd=>'+fill-tags --no-version');
 test_vcf_plugin($opts,in=>'fill-tags-hemi',out=>'fill-tags-hemi.2.out',cmd=>'+fill-tags --no-version',args=>'-- -d');
 test_vcf_plugin($opts,in=>'fill-tags-hwe',out=>'fill-tags-hwe.out',cmd=>'+fill-tags --no-version');
+test_vcf_plugin($opts,in=>'fill-tags-AN0',out=>'fill-tags-AN0.out',cmd=>'+fill-tags --no-version');
 test_vcf_plugin($opts,in=>'view',out=>'view.GTisec.out',cmd=>'+GTisec',args=>' | grep -v bcftools');
 test_vcf_plugin($opts,in=>'view',out=>'view.GTisec.H.out',cmd=>'+GTisec',args=>'-- -H | grep -v bcftools');
 test_vcf_plugin($opts,in=>'view',out=>'view.GTisec.Hm.out',cmd=>'+GTisec',args=>'-- -Hm | grep -v bcftools');
