CHANGELOG
v2.1.5:
Bugfix/new feature?:
Make cutadapt multicore
v2.1.4:
Bugfix:
Fix Gene sums bug in MutationFreqFromVCF.py
v2.1.3:
Bugfix:
Add additional checks to the 'sample' column of the config file, to exclude characters that aren't allowed in file names.
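A check of this kind could be sketched as follows. This is only an illustration, not the pipeline's actual code: the function name `invalid_sample_chars` and the allowed character set (letters, digits, underscore, hyphen, period) are assumptions.

```python
import re

# Assumed whitelist: letters, digits, underscore, hyphen, and period.
# The rule the pipeline really applies may differ.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]")


def invalid_sample_chars(sample):
    """Return the sorted characters in `sample` that fall outside the assumed whitelist."""
    return sorted({ch for ch in sample if not _ALLOWED.fullmatch(ch)})
```

An empty result means the sample name is safe to use in file names under the assumed rule; a non-empty result lists the characters to report to the user.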
v2.1.2:
Bugfix:
Re-add pre variant calling checkpoint to fix non-functionality of rerun mode 1.
Remove "ancient" keyword and snakemake version restriction to fix run problems on macOS X
Fix issue in VCF generation where the same variant could appear multiple times
Internal Changes:
Remove families whose UMIs contain Ns or mononucleotide repeats before any consensus making happens, instead of after all consensus making has finished. The tagstats output remains unchanged at this time; the only difference is in the SSCS output, which no longer contains reads with UMIs that contain Ns or mononucleotide repeats (it did previously).
Track number of N-containing UMIs in CM_stats file.
Modify rerun to check for file existence (depends on OS module from Python standard library; may not work identically on all systems).
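The UMI filtering described in the v2.1.2 internal changes could look roughly like the sketch below. It assumes that a "mononucleotide repeat" means a UMI made of a single repeated base; the pipeline's real criterion may be different, and the names `umi_is_usable` and `filter_families` are hypothetical.

```python
def umi_is_usable(umi):
    """Return False for UMIs that contain N or consist of one repeated base."""
    umi = umi.upper()
    if "N" in umi:
        return False
    # Assumption: a mononucleotide repeat is a UMI such as "AAAAAAAA".
    if len(set(umi)) == 1:
        return False
    return True


def filter_families(families):
    """Drop unusable families before consensus making.

    `families` maps a UMI string to its list of reads. Returns the kept
    families and the number of N-containing UMIs that were dropped.
    """
    kept = {}
    n_containing = 0
    for umi, reads in families.items():
        if "N" in umi.upper():
            n_containing += 1
        if umi_is_usable(umi):
            kept[umi] = reads
    return kept, n_containing
```

Filtering before consensus making means the dropped families never reach the SSCS output, and the returned count is the kind of value that could be written to a stats file.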
v2.1.1:
Bugfix:
Fix issue where masking in countmuts file was not applied correctly
v2.1.0:
Bugfix:
Fix spelling on some rule names
New Features:
Populate log files, add missing log files, and normalize log file names
Add '--keep-going' to snakemake command in run script, which allows
the pipeline to continue running independent steps if one step fails;
rerun setup script to take advantage of this.
v2.0.1:
Bugfix:
Fix a bug where providing a masking bed caused the pipeline to crash at DepthSummaryCsv.py
v2.0.0:
Bugfixes:
Add definition of negative taxIDs to report.
Fix bed blocks issue where terminal comma would cause crashes
Fix table formatting for countmuts and depth tables in report
Change clustering to avoid using SNPs for clustering analysis.
Explicitly convert report to HTML
Fix fastq output third line from consensusMaker
Set quality of N bases from consensusMaker to 0
Fix crash on non-CATG bases in the reference genome
New Features:
Change recovery script format
Create new test data set that can better demonstrate the BLAST filter
Create new test reference / blastDB to match new test dataset
Add extra tests to the test config file
Make retrieveSummary.py work from the whole-pipeline config file
Reorganize BLAST control to make running without BLAST more explicit.
Add unlock script to setup script
Implement new depth script
Implement VarDict
Move the PostBlastRecovery to its own environment, allowing custom user programs without affecting the base environment.
Create script to summarize depth based on a provided bed file
Add the ability to select which filters the mutation frequency program applies.
Add % mapped raw reads to the summary CSV
Add RawOnTarget to summaryCSV
Add masking functionality
Change countMutsPerCycle to allow filtering out mutations based on VCF filters, and to allow filtering of near-indel variants. Also add an "include" mode that includes only variants in the VCF file. Modify Snakefile to allow this method to draw from the countmuts filtering parameters.
Add adapter clipping
Add Mamba frontend to setup script.
Add readout for % on target SSCS and DCS to summaryCSV, report.
Internal Changes:
Add gitignore rule to ignore user-generated recovery scripts
Remove chrM_recovery, which is a custom recovery script from our lab
Remove testConfig.csv, since it is created by the setup script
Remove GATK3 from setup and Snakefile
Add vardict-java to environment
Modify MakeDepthPlot.R and retrieveSummary.py to point to new files.
Add VarDict-based Muts by Read Position program (not used)
Give final read length to MutsPerCycle, instead of initial read length.
Remove extra environment setup rules
Move BedParser to separate file
Add pre-variant calling BAM filter to BED coordinates
Make prevar file temporary.
Add error checking to enforce number of blockStarts and blockSizes
Add str and repr methods for Bed_Line
Add Bed_Writer functionality
Add DepthSummaryCsv to Snakefile
Add filter definitions to mutation frequency output and report
Verify that a variant consists entirely of ACGTN bases
Change r versioning in DS_env_full
Add a bed buffering step pre-vardict
Add bedtools to run environment
Change BLAST database setup and application
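Two of the v2.0.0 entries above concern BED block handling: tolerating a terminal comma in the block fields, and enforcing the number of blockStarts and blockSizes. A minimal sketch of both checks is given here; `parse_bed_blocks` is a hypothetical name and is not taken from the pipeline's BedParser.

```python
def parse_bed_blocks(block_count, block_sizes, block_starts):
    """Parse BED12 blockSizes/blockStarts strings into (start, size) pairs.

    A terminal comma (e.g. "10,20,") is tolerated. A ValueError is raised
    if either list does not contain exactly `block_count` entries.
    """
    sizes = [int(x) for x in block_sizes.rstrip(",").split(",") if x]
    starts = [int(x) for x in block_starts.rstrip(",").split(",") if x]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError(
            "expected %d blocks, got %d blockSizes and %d blockStarts"
            % (block_count, len(sizes), len(starts))
        )
    return list(zip(starts, sizes))
```

Raising an explicit error on a count mismatch gives the user a clear message instead of a crash further down the pipeline.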
v1.1.6:
January 12, 2021
Bugfix:
Fix non-working non-unique mode for countmuts files
v1.1.5:
November 20, 2020
Bugfix:
Fix bug on line 375 with 0-position starts in bed files
v1.1.4:
September 24, 2020
Bugfixes:
Explicitly convert report to html for compatibility with nbconvert 6.0.0+
v1.1.3:
Bugfix:
Fix crash on 'N' ref bases in muts_per_cycle
v1.1.2:
July 29, 2020
Fixes a few bugs:
Misnamed defaults for maxClonal, maxClonal,
Misnamed error checker for rgpl
Fix symlinking in clipBam step when no clipping is requested (replaced with copying).
v1.1.1:
May 14, 2020
Bugfix:
SNPs VCF file wasn't being preserved. This fixes that issue.
v1.1.0:
May 5, 2020
Bugfixes:
Add testConfig.csv file, which was accidentally omitted in the 1.0.0 release
Fix bug with depth plots where zero-depth positions could accidentally be labeled as having non-zero depth
Fix crashes from running samples that produce no DCS data
Change the default recovery script to avoid symlinks
New Features:
Additional output options for the bamToCountmuts program
Allow summing total and genes by gene or by block
Allow outputting all, overall + genes, overall + blocks, or just overall
Add VERSION file, and specify pipeline version in the report output.
Get mutation counts from the VCF file, instead of directly from the BAM file
Add ability to set maximum number of cores to use during setup.
Internal changes:
Simplify bed file column naming in depth plotting script
Separate mutation analysis steps into different snakemake rules
General code cleanup
v1.0.0:
March 31, 2020
New release of the Duplex Sequencing Bioinformatics Pipeline, based on Snakemake instead of Bash.