------------------------------------------------------------------------------------
The Genome Analysis Toolkit (GATK) v3.8-1-0-gf15c1c3ef, Compiled 2018/02/19 05:43:50
Copyright (c) 2010-2016 The Broad Institute
For support and documentation go to https://software.broadinstitute.org/gatk
[Fri Oct 08 13:34:56 BST 2021] Executing on Linux 3.10.0-693.5.2.el7.x86_64 amd64
OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12
------------------------------------------------------------------------------------
------------------------------------------------------------------------------------
usage: java -jar GenomeAnalysisTK.jar -T <analysis_type> [-args <arg_file>] [-I <input_file>] [--showFullBamList] [-rbs
<read_buffer_size>] [-rf <read_filter>] [-drf <disable_read_filter>] [-L <intervals>] [-XL <excludeIntervals>] [-isr
<interval_set_rule>] [-im <interval_merging>] [-ip <interval_padding>] [-R <reference_sequence>] [-ndrs] [-maxRuntime
<maxRuntime>] [-maxRuntimeUnits <maxRuntimeUnits>] [-dt <downsampling_type>] [-dfrac <downsample_to_fraction>] [-dcov
<downsample_to_coverage>] [-baq <baq>] [-baqGOP <baqGapOpenPenalty>] [-fixNDN] [-fixMisencodedQuals]
[-allowPotentiallyMisencodedQuals] [-OQ] [-DBQ <defaultBaseQualities>] [-PF <performanceLog>] [-BQSR <BQSR>] [-qq
<quantize_quals>] [-SQQ <static_quantized_quals>] [-DIQ] [-EOQ] [-preserveQ <preserve_qscores_less_than>]
[-globalQScorePrior <globalQScorePrior>] [-secondsBetweenProgressUpdates <secondsBetweenProgressUpdates>] [-S
<validation_strictness>] [-rpr] [-kpr] [-sample_rename_mapping_file <sample_rename_mapping_file>] [-U <unsafe>]
[-jdk_deflater] [-jdk_inflater] [-disable_auto_index_creation_and_locking_when_reading_rods] [-no_cmdline_in_header]
[-sites_only] [-writeFullFormat] [-compress <bam_compression>] [-simplifyBAM] [--disable_bam_indexing] [--generate_md5]
[-nt <num_threads>] [-nct <num_cpu_threads_per_data_thread>] [-mte] [-rgbl <read_group_black_list>] [-ped <pedigree>]
[-pedString <pedigreeString>] [-pedValidationType <pedigreeValidationType>] [-variant_index_type <variant_index_type>]
[-variant_index_parameter <variant_index_parameter>] [-ref_win_stop <reference_window_stop>] [-l <logging_level>] [-log
<log_to_file>] [-h] [-version] [-filterRNC] [-filterMBQ] [-filterNoBases] [-glm <genotype_likelihoods_model>]
[-pcr_error <pcr_error_rate>] [-slod] [-pairHMM <pair_hmm_implementation>] [-mbq <min_base_quality_score>] [-deletions
<max_deletion_fraction>] [-minIndelCnt <min_indel_count_for_genotyping>] [-minIndelFrac <min_indel_fraction_per_sample>]
[-indelGCP <indelGapContinuationPenalty>] [-indelGOP <indelGapOpenPenalty>] [-nda] [-newQual] [-hets <heterozygosity>]
[-indelHeterozygosity <indel_heterozygosity>] [-heterozygosityStandardDeviation <heterozygosity_stdev>]
[-stand_call_conf <standard_min_confidence_threshold_for_calling>] [-maxAltAlleles <max_alternate_alleles>] [-maxGT
<max_genotype_count>] [-maxNumPLValues <max_num_PL_values>] [-inputPrior <input_prior>] [-ploidy <sample_ploidy>]
[-gt_mode <genotyping_mode>] [-alleles <alleles>] [-contamination <contamination_fraction_to_filter>]
[-contaminationFile <contamination_fraction_per_sample_file>] [-out_mode <output_mode>] [-allSitePLs] [-D <dbsnp>]
[-comp <comp>] [-o <out>] [-onlyEmitSamples <onlyEmitSamples>] [-A <annotation>] [-XA <excludeAnnotation>] [-G <group>]
-T,--analysis_type <analysis_type> Name of the tool to run
-args,--arg_file <arg_file> Reads arguments from the
specified file
-I,--input_file <input_file> Input file containing sequence
data (BAM or CRAM)
--showFullBamList Emit list of input BAM/CRAM
files to log
-rbs,--read_buffer_size <read_buffer_size> Number of reads per SAM file
to buffer in memory
-rf,--read_filter <read_filter> Filters to apply to reads
before analysis
-drf,--disable_read_filter <disable_read_filter> Read filters to disable
-L,--intervals <intervals> One or more genomic intervals
over which to operate
-XL,--excludeIntervals <excludeIntervals> One or more genomic intervals
to exclude from processing
-isr,--interval_set_rule <interval_set_rule> Set merging approach to use
for combining interval inputs
(UNION|INTERSECTION)
-im,--interval_merging <interval_merging> Interval merging rule for
abutting intervals (ALL|
OVERLAPPING_ONLY)
-ip,--interval_padding <interval_padding> Amount of padding (in bp) to
add to each interval
-R,--reference_sequence <reference_sequence> Reference sequence file
-ndrs,--nonDeterministicRandomSeed Use a non-deterministic random
seed
-maxRuntime,--maxRuntime <maxRuntime> Stop execution cleanly as soon
as maxRuntime has been reached
-maxRuntimeUnits,--maxRuntimeUnits <maxRuntimeUnits> Unit of time used by
maxRuntime (NANOSECONDS|
MICROSECONDS|MILLISECONDS|
SECONDS|MINUTES|HOURS|DAYS)
-dt,--downsampling_type <downsampling_type> Type of read downsampling to
employ at a given locus (NONE|
ALL_READS|BY_SAMPLE)
-dfrac,--downsample_to_fraction <downsample_to_fraction> Fraction of reads to
downsample to
-dcov,--downsample_to_coverage <downsample_to_coverage> Target coverage threshold for
downsampling to coverage
-baq,--baq <baq> Type of BAQ calculation to
apply in the engine (OFF|
CALCULATE_AS_NECESSARY|
RECALCULATE)
-baqGOP,--baqGapOpenPenalty <baqGapOpenPenalty> BAQ gap open penalty
-fixNDN,--refactor_NDN_cigar_string Reduce NDN elements in CIGAR
string
-fixMisencodedQuals,--fix_misencoded_quality_scores Fix mis-encoded base quality
scores
-allowPotentiallyMisencodedQuals,--allow_potentially_misencoded_quality_scores Ignore warnings about base
quality score encoding
-OQ,--useOriginalQualities Use the base quality scores
from the OQ tag
-DBQ,--defaultBaseQualities <defaultBaseQualities> Assign a default base quality
-PF,--performanceLog <performanceLog> Write GATK runtime performance
log to this file
-BQSR,--BQSR <BQSR> Input covariates table file
for on-the-fly base quality
score recalibration
-qq,--quantize_quals <quantize_quals> Quantize quality scores to a
given number of levels (with
-BQSR)
-SQQ,--static_quantized_quals <static_quantized_quals> Use static quantized quality
scores to a given number of
levels (with -BQSR)
-DIQ,--disable_indel_quals Disable printing of base
insertion and deletion tags
(with -BQSR)
-EOQ,--emit_original_quals Emit the OQ tag with the
original base qualities (with
-BQSR)
-preserveQ,--preserve_qscores_less_than <preserve_qscores_less_than> Don't recalibrate bases with
quality scores less than this
threshold (with -BQSR)
-globalQScorePrior,--globalQScorePrior <globalQScorePrior> Global Qscore Bayesian prior
to use for BQSR
-secondsBetweenProgressUpdates,--secondsBetweenProgressUpdates <secondsBetweenProgressUpdates> Time interval for process
meter information output (in
seconds)
-S,--validation_strictness <validation_strictness> How strict should we be with
validation (STRICT|LENIENT|
SILENT)
-rpr,--remove_program_records Remove program records from
the SAM header
-kpr,--keep_program_records Keep program records in the
SAM header
-sample_rename_mapping_file,--sample_rename_mapping_file <sample_rename_mapping_file> Rename sample IDs on-the-fly
at runtime using the provided
mapping file
-U,--unsafe <unsafe> Enable unsafe operations:
nothing will be checked at
runtime (ALLOW_N_CIGAR_READS|
ALLOW_UNINDEXED_BAM|
ALLOW_UNSET_BAM_SORT_ORDER|
NO_READ_ORDER_VERIFICATION|
ALLOW_SEQ_DICT_INCOMPATIBILITY|
LENIENT_VCF_PROCESSING|ALL)
-jdk_deflater,--use_jdk_deflater Use the JDK Deflater instead
of the IntelDeflater for
writing BAMs
-jdk_inflater,--use_jdk_inflater Use the JDK Inflater instead
of the IntelInflater for
reading BAMs
-disable_auto_index_creation_and_locking_when_reading_rods,--disable_auto_index_creation_and_locking_when_reading_rods Disable both auto-generation
of index files and index file
locking
-no_cmdline_in_header,--no_cmdline_in_header Don't include the command line
in output VCF headers
-sites_only,--sites_only Output sites-only VCF
-writeFullFormat,--never_trim_vcf_format_field Always output all the records
in VCF FORMAT fields, even if
some are missing
-compress,--bam_compression <bam_compression> Compression level to use for
writing BAM files (0 - 9,
higher is more compressed)
-simplifyBAM,--simplifyBAM Strip down read content and
tags
--disable_bam_indexing Turn off on-the-fly creation
of indices for output BAM/CRAM
files
--generate_md5 Enable on-the-fly creation of
md5s for output BAM files.
-nt,--num_threads <num_threads> Number of data threads to
allocate to this analysis
-nct,--num_cpu_threads_per_data_thread <num_cpu_threads_per_data_thread> Number of CPU threads to
allocate per data thread
-mte,--monitorThreadEfficiency Enable threading efficiency
monitoring
-rgbl,--read_group_black_list <read_group_black_list> Exclude read groups based on
tags
-ped,--pedigree <pedigree> Pedigree files for samples
-pedString,--pedigreeString <pedigreeString> Pedigree string for samples
-pedValidationType,--pedigreeValidationType <pedigreeValidationType> Validation strictness for
pedigree (STRICT|SILENT)
-variant_index_type,--variant_index_type <variant_index_type> Type of IndexCreator to use
for VCF/BCF indices
(DYNAMIC_SEEK|DYNAMIC_SIZE|
LINEAR|INTERVAL)
-variant_index_parameter,--variant_index_parameter <variant_index_parameter> Parameter to pass to the
VCF/BCF IndexCreator
-ref_win_stop,--reference_window_stop <reference_window_stop> Reference window stop
-l,--logging_level <logging_level> Set the minimum level of
logging
-log,--log_to_file <log_to_file> Set the logging location
-h,--help Generate the help message
-version,--version Output version information
Arguments for MalformedReadFilter:
-filterRNC,--filter_reads_with_N_cigar Filter out reads with CIGAR containing the N operator, instead of
failing with an error
-filterMBQ,--filter_mismatching_base_and_quals Filter out reads with mismatching numbers of bases and base qualities,
instead of failing with an error
-filterNoBases,--filter_bases_not_stored Filter out reads with no stored bases (i.e. '*' where the sequence
should be), instead of failing with an error
Arguments for UnifiedGenotyper:
-glm,--genotype_likelihoods_model <genotype_likelihoods_model> Genotype likelihoods
calculation model to employ --
SNP is the default option,
while INDEL is also available
for calling indels and BOTH is
available for calling both
together (SNP|INDEL|
GENERALPLOIDYSNP|
GENERALPLOIDYINDEL|BOTH)
-pcr_error,--pcr_error_rate <pcr_error_rate> The PCR error rate to be used
for computing fragment-based
likelihoods
-slod,--computeSLOD If provided, we will calculate
the SLOD (SB annotation)
-pairHMM,--pair_hmm_implementation <pair_hmm_implementation> The PairHMM implementation to
use for -glm INDEL genotype
likelihood calculations (EXACT|
ORIGINAL|LOGLESS_CACHING|
VECTOR_LOGLESS_CACHING|
VECTOR_LOGLESS_CACHING_OMP|
LESS_CACHING_FPGA_EXPERIMENTAL|
FASTEST_AVAILABLE|
DEBUG_VECTOR_LOGLESS_CACHING|
ARRAY_LOGLESS)
-mbq,--min_base_quality_score <min_base_quality_score> Minimum base quality required
to consider a base for calling
-deletions,--max_deletion_fraction <max_deletion_fraction> Maximum fraction of reads with
deletions spanning this locus
for it to be callable
-minIndelCnt,--min_indel_count_for_genotyping <min_indel_count_for_genotyping> Minimum number of consensus
indels required to trigger
genotyping run
-minIndelFrac,--min_indel_fraction_per_sample <min_indel_fraction_per_sample> Minimum fraction of all reads
at a locus that must contain
an indel (of any allele) for
that sample to contribute to
the indel count for alleles
-indelGCP,--indelGapContinuationPenalty <indelGapContinuationPenalty> Indel gap continuation
penalty, as Phred-scaled
probability. I.e., 30 =>
10^(-30/10)
-indelGOP,--indelGapOpenPenalty <indelGapOpenPenalty> Indel gap open penalty, as
Phred-scaled probability.
I.e., 30 => 10^(-30/10)
-nda,--annotateNDA Annotate number of alleles
observed
-newQual,--useNewAFCalculator Use new AF model instead of
the so-called exact model
-hets,--heterozygosity <heterozygosity> Heterozygosity value used to
compute prior likelihoods for
any locus
-indelHeterozygosity,--indel_heterozygosity <indel_heterozygosity> Heterozygosity for indel
calling
-heterozygosityStandardDeviation,--heterozygosity_stdev <heterozygosity_stdev> Standard deviation of
heterozygosity for SNP and
indel calling.
-stand_call_conf,--standard_min_confidence_threshold_for_calling <standard_min_confidence_threshold_for_calling> The minimum phred-scaled
confidence threshold at which
variants should be called
-maxAltAlleles,--max_alternate_alleles <max_alternate_alleles> Maximum number of alternate
alleles to genotype
-maxGT,--max_genotype_count <max_genotype_count> Maximum number of genotypes to
consider at any site
-maxNumPLValues,--max_num_PL_values <max_num_PL_values> Maximum number of PL values to
output
-inputPrior,--input_prior <input_prior> Input prior for calls
-ploidy,--sample_ploidy <sample_ploidy> Ploidy per sample. For pooled
data, set to (Number of
samples in each pool * Sample
Ploidy).
-gt_mode,--genotyping_mode <genotyping_mode> Specifies how to determine the
alternate alleles to use for
genotyping (DISCOVERY|
GENOTYPE_GIVEN_ALLELES)
-alleles,--alleles <alleles> Set of alleles to use in
genotyping
-contamination,--contamination_fraction_to_filter <contamination_fraction_to_filter> Fraction of contamination to
aggressively remove
-contaminationFile,--contamination_fraction_per_sample_file <contamination_fraction_per_sample_file> Contamination per sample
-out_mode,--output_mode <output_mode> Which type of calls we should
output (EMIT_VARIANTS_ONLY|
EMIT_ALL_CONFIDENT_SITES|
EMIT_ALL_SITES)
-allSitePLs,--allSitePLs Annotate all sites with PLs
-D,--dbsnp <dbsnp> dbSNP file
-comp,--comp <comp> Comparison VCF file
-o,--out <out> File to which variants should
be written
-onlyEmitSamples,--onlyEmitSamples <onlyEmitSamples> If provided, only these
samples will be emitted into
the VCF, regardless of which
samples are present in the BAM
file
-A,--annotation <annotation> One or more specific
annotations to apply to
variant calls
-XA,--excludeAnnotation <excludeAnnotation> One or more specific
annotations to exclude
-G,--group <group> One or more classes/groups of
annotations to apply to
variant calls. The single
value 'none' removes the
default group
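
Two of the UnifiedGenotyper arguments above rest on simple arithmetic: the indel gap penalties are Phred-scaled probabilities, and -ploidy for pooled data is the number of samples in each pool multiplied by the per-sample ploidy. The Python sketch below only illustrates those two conventions as the help text states them. The function names are hypothetical and are not part of GATK.

```python
def phred_to_probability(penalty: float) -> float:
    """Convert a Phred-scaled penalty Q to a probability, 10^(-Q/10)."""
    return 10.0 ** (-penalty / 10.0)


def pooled_ploidy(samples_per_pool: int, sample_ploidy: int) -> int:
    """Value to pass to -ploidy for pooled data.

    Follows the help text: (number of samples in each pool * sample ploidy).
    """
    if samples_per_pool < 1 or sample_ploidy < 1:
        raise ValueError("both counts must be positive integers")
    return samples_per_pool * sample_ploidy


if __name__ == "__main__":
    # A Phred-scaled penalty of 30 corresponds to a probability of about 0.001.
    print(phred_to_probability(30))
    # Hypothetical pool of 10 diploid samples.
    print(pooled_ploidy(10, 2))
```

Under these assumptions, a pool of 10 diploid samples would be run with -ploidy 20.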
Available Reference Ordered Data types:
Name FeatureType Documentation
BCF2 VariantContext (this is an external codec and is not documented within GATK)
BEAGLE BeagleFeature (this is an external codec and is not documented within GATK)
BED BEDFeature (this is an external codec and is not documented within GATK)
BEDTABLE TableFeature (this is an external codec and is not documented within GATK)
EXAMPLEBINARY Feature (this is an external codec and is not documented within GATK)
RAWHAPMAP RawHapMapFeature (this is an external codec and is not documented within GATK)
REFSEQ RefSeqFeature (this is an external codec and is not documented within GATK)
SAMPILEUP SAMPileupFeature (this is an external codec and is not documented within GATK)
SAMREAD SAMReadFeature (this is an external codec and is not documented within GATK)
TABLE TableFeature (this is an external codec and is not documented within GATK)
VCF VariantContext (this is an external codec and is not documented within GATK)
VCF3 VariantContext (this is an external codec and is not documented within GATK)
For a full description of this walker, see its GATKdocs at:
https://software.broadinstitute.org/gatk/documentation/tooldocs/org_broadinstitute_gatk_tools_walkers_genotyper_UnifiedGenotyper.php
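
As a rough illustration of how the usage synopsis above maps onto an actual invocation, the Python sketch below assembles an argument list for UnifiedGenotyper using only flags that appear in this help output (-T, -R, -I, -o, -glm, -L, -stand_call_conf). The helper name, the file names, and the interval are hypothetical placeholders, and the sketch builds and prints the command without running GATK.

```python
import shlex

# Values listed for -glm in the help text above.
GLM_CHOICES = ("SNP", "INDEL", "GENERALPLOIDYSNP", "GENERALPLOIDYINDEL", "BOTH")


def build_unified_genotyper_command(reference, bam, output_vcf, glm="SNP",
                                    intervals=None, stand_call_conf=None,
                                    jar="GenomeAnalysisTK.jar"):
    """Return the command as a list of arguments (hypothetical helper)."""
    if glm not in GLM_CHOICES:
        raise ValueError("unknown -glm value: %r" % (glm,))
    cmd = ["java", "-jar", jar,
           "-T", "UnifiedGenotyper",
           "-R", reference,
           "-I", bam,
           "-o", output_vcf,
           "-glm", glm]
    if intervals is not None:
        # -L restricts processing to the given genomic intervals.
        cmd += ["-L", intervals]
    if stand_call_conf is not None:
        # Minimum phred-scaled confidence threshold for calling variants.
        cmd += ["-stand_call_conf", str(stand_call_conf)]
    return cmd


if __name__ == "__main__":
    # All paths and the interval below are placeholders.
    command = build_unified_genotyper_command(
        "reference.fasta", "sample.bam", "calls.vcf",
        glm="BOTH", intervals="20", stand_call_conf=30)
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the command as a list rather than a single string means it could be passed to subprocess.run without shell quoting issues, should one choose to execute it.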