CHANGELOG

v2.1.5:
Bugfix/new feature?:
    Make cutadapt multicore

v2.1.4:
Bugfix:
    Fix Gene sums bug in MutationFreqFromVCF.py

v2.1.3:
Bugfix: 
    Add additional checks to the 'sample' column of the config file, to exclude characters that aren't allowed in file names. 
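
    The kind of check described above could look roughly like the sketch below. This is an illustration only, not the pipeline's actual code: the function name is_safe_sample_name and the set of allowed characters (letters, digits, '.', '_' and '-') are assumptions.

```python
import re

# Assumption: only these characters are considered safe in file names.
# The pipeline's real allow-list may differ.
_SAFE_SAMPLE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_sample_name(name: str) -> bool:
    """Return True if `name` can be used as part of a file name."""
    # Reject empty names and the special directory entries.
    if name in ("", ".", ".."):
        return False
    # fullmatch requires every character to be in the allowed set.
    return _SAFE_SAMPLE_NAME.fullmatch(name) is not None
```

    With this allow-list, a value such as "sample_01" passes, while values containing spaces or path separators are rejected.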

v2.1.2: 
Bugfix:
    Re-add pre variant calling checkpoint to fix non-functionality of rerun mode 1.
    Remove "ancient" keyword and snakemake version restriction to fix run problems on macOS X
    Fix issue in VCF generation where the same variant could appear multiple times
    
Internal Changes:
    Remove families whose UMIs contain Ns or mononucleotide repeats before any consensus making happens, instead of after all consensus making has finished.  The tagstats output remains unchanged at this time; the only difference is in the SSCS output, which no longer contains reads with UMIs that contain Ns or mononucleotide repeats (it did previously).  
    Track number of N-containing UMIs in CM_stats file.  
    Modify rerun to check for file existence (depends on the os module from the Python standard library; may not work identically on all systems).
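
    As a rough illustration of the UMI filtering described in the internal changes above, the sketch below drops UMIs that contain an N or a long mononucleotide run before any consensus step, and counts how many were dropped for each reason. The names filter_umis and rep_limit, and the default threshold of 9, are assumptions made for this example and are not taken from the pipeline.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def longest_run(seq: str) -> int:
    """Length of the longest run of one repeated base in `seq`."""
    return max((sum(1 for _ in group) for _, group in groupby(seq)), default=0)


def filter_umis(umis: Iterable[str], rep_limit: int = 9) -> Tuple[List[str], int, int]:
    """Split UMIs into kept ones and counts of rejected ones.

    Returns (kept, n_containing, repeat_containing).
    rep_limit is a hypothetical threshold: a run of at least this many
    identical bases counts as a mononucleotide repeat.
    """
    kept = []
    n_containing = 0
    repeat_containing = 0
    for umi in umis:
        umi = umi.upper()
        if "N" in umi:
            n_containing += 1
        elif longest_run(umi) >= rep_limit:
            repeat_containing += 1
        else:
            kept.append(umi)
    return kept, n_containing, repeat_containing
```

    Filtering up front means rejected UMIs never reach the SSCS output, and the N-containing count is available for a stats file.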

v2.1.1: 
Bugfix:
    Fix issue where masking in countmuts file was not applied correctly

v2.1.0:
Bugfix:
    Fix spelling on some rule names
New Features:
    Populate log files, add missing log files, and normalize log file names
    Add '--keep-going' to snakemake command in run script, which allows 
        the pipeline to continue running independent steps if one step fails; 
        rerun the setup script to take advantage of this.  

v2.0.1:
Bugfix: 
    Fix a bug where providing a masking bed caused the pipeline to crash at DepthSummaryCsv.py

v2.0.0:
Bugfixes:
    Add definition of negative taxIDs to report.
    Fix bed blocks issue where terminal comma would cause crashes
    Fix table formatting for countmuts and depth tables in report
    Change clustering to avoid using SNPs for clustering analysis. 
    Explicitly convert report to HTML 
    Fix fastq output third line from consensusMaker
    Set quality of N bases from consensusMaker to 0
    Fix crash on non-CATG bases in the reference genome

New Features:
    Change recovery script format
    Create new test data set that can better demonstrate the BLAST filter
    Create new test reference / blastDB to match new test dataset
    Add extra tests to the test config file
    Make retrieveSummary.py work from the whole-pipeline config file 
    Reorganize BLAST control to make running without BLAST more explicit.
    Add unlock script to setup script
    Implement new depth script
    Implement VarDict
    Move the PostBlastRecovery to its own environment, allowing custom user programs without affecting the base environment.
    Create script to summarize depth based on a provided bed file
    Add the ability to select which filters the mutation frequency program applies.
    Add % mapped raw read to the summary CSV
    Add RawOnTarget to summaryCSV
    Add masking functionality 
    Change countMutsPerCycle to allow filtering out mutations based on VCF filters, and to allow for filtering of near-indel variants.  Also allows for an "include" mode that includes only variants in the VCF file.  Modify Snakefile to allow this method to draw from the countmuts filtering parameters.
    Add adapter clipping 
    Add Mamba frontend to setup script.
    Add readout for % on target SSCS and DCS to summaryCSV, report.

Internal Changes: 
    Add gitignore rule to ignore user-generated recovery scripts
    Remove chrM_recovery, which is a custom recovery script from our lab
    Remove testConfig.csv, since it is created by the setup script
    Remove GATK3 from setup and Snakefile
    Add vardict-java to environment
    Modify MakeDepthPlot.R and retrieveSummary.py to point to new files.
    Add VarDict-based Muts by Read Position program (not used)
    Give final read length to MutsPerCycle, instead of initial read length.
    Remove extra environment setup rules
    Move BedParser to separate file
    Add pre-variant calling BAM filter to BED coordinates
    Make prevar file temporary.
    Add error checking to enforce number of blockStarts and blockSizes
    Add str and repr methods for Bed_Line
    Add Bed_Writer functionality
    Add DepthSummaryCsv to Snakefile
    Add filter definitions to mutation frequency output and report
    Verify that a variant consists entirely of ACGTN bases
    Change R versioning in DS_env_full
    Add a bed buffering step pre-vardict
    Add bedtools to run environment
    Change BLAST database setup and application
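
    The bed block changes in this release (tolerating a terminal comma, and enforcing the number of blockStarts and blockSizes) can be sketched as below. This is a minimal illustration assuming standard BED12 block fields; parse_block_list and check_blocks are hypothetical names and do not reflect the actual BedParser code.

```python
from typing import List, Tuple


def parse_block_list(field: str) -> List[int]:
    """Parse a comma-separated BED block field such as '10,20,'."""
    field = field.strip()
    # BED files often end block lists with a comma; drop exactly one.
    if field.endswith(","):
        field = field[:-1]
    if not field:
        return []
    # int() raises ValueError on empty or non-numeric items.
    return [int(item) for item in field.split(",")]


def check_blocks(block_count: int, block_sizes: str,
                 block_starts: str) -> Tuple[List[int], List[int]]:
    """Return parsed (sizes, starts), or raise if the counts disagree."""
    sizes = parse_block_list(block_sizes)
    starts = parse_block_list(block_starts)
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError(
            f"expected {block_count} blocks, got {len(sizes)} sizes "
            f"and {len(starts)} starts"
        )
    return sizes, starts
```

    Failing early with a clear message is usually easier to debug than a crash later in depth or mutation counting.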

v1.1.6: 
January 12, 2021
Bugfix: 
    Fix non-working non-unique mode for countmuts files

v1.1.5: 
November 20, 2020
Bugfix:
    Fix bug on line 375 with 0-position starts in bed files

v1.1.4:
September 24, 2020
Bugfixes: 
    Explicitly convert report to html for compatibility with nbconvert 6.0.0+

v1.1.3:
Bugfix:
    Fix crash on 'N' ref bases in muts_per_cycle

v1.1.2:
July 29, 2020
Fixes a few bugs:
    Misnamed defaults for maxClonal, maxClonal,
    Misnamed error checker for rgpl
    Fixed symlinking in clipBam step when no clipping is requested (replaced with copying)

v1.1.1:
May 14, 2020
Bugfix:
    SNPs VCF file wasn't being preserved. This fixes that issue.

v1.1.0: 
May 5, 2020
Bugfixes:
    Add testConfig.csv file, which was accidentally omitted in the 1.0.0 release
    Fix bug with depth plots where zero-depth positions could accidentally be labeled as having non-zero depth
    Fix crashes from running samples that produce no DCS data
    Change the default recovery script to avoid symlinks
New Features: 
    Additional output options for the bamToCountmuts program
        Allow summing total and genes by gene or by block
        Allow outputting all, overall + genes, overall + blocks, or just overall
    Add VERSION file, and specify pipeline version in the report output.
    Get mutation counts from the VCF file, instead of directly from the BAM file
    Add ability to set maximum number of cores to use during setup.
Internal changes:
    Simplify bed file column naming in depth plotting script
    Separate mutation analysis steps into different snakemake rules
    General code cleanup
    
v1.0.0:
March 31, 2020
New release of the Duplex Sequencing Bioinformatics Pipeline, based on Snakemake instead of Bash.