Tags: a-ludi/dentist
## [4.0.0] - 2022-09-02

### Added
- include Git commit in logs
- script to generate report on closed and unclosed gaps
- script `mask2bed` that converts Dazzler masks to BED files
- JSON schema for config file
- debugging output to track down issue #31

### Changed
- preserve original scaffold headers in output FASTA
- ensure unique scaffold IDs in `output`
- output FASTA names/coords in AGP
- parallelized alignment filters
- fail-early measure against bug in `libmaus2`

### Fixed
- added missing Python installation to Singularity image
- treat long FASTA lines gracefully
- fixed rule `validate_dentist_config`
- fixed install instructions for Snakemake profile
- fixed bug with newer versions of Snakemake
- include Python files in Git repo
- close open LAS file as soon as possible
- fixed JSON conversion of `AlignmentChain`
- workaround for Phobos v2.099.0 bug
- fixed compiler error
- addressed compiler warning

## [3.0.0] - 2021-12-09

### Added
- Conda packages `dentist` and `dentist-core`
- DENTIST's configuration may be in YAML format
- print summary of all commands with `dentist --commands`
- user may select the maximum alignment error rate
- note on a known bug that prohibits using ` :: ` in FASTA headers
- online API documentation
- included the demo example into the main repo
- included JQ in the container for easy inspections
- minimal integration tests that cover the whole pipeline

### Changed
- substantially extended code documentation
- improved documentation of `read-coverage` and friends
- improved error message if no pile ups have been found
- using a fixed version for containers to avoid caching issues
- renamed workflow parameter `max_threads` → `threads_per_process`
- keep assertions in production code
- allow empty LAS files for masking
- improved pre-push hook to reduce accidental errors

### Removed
- Docker container; the Singularity image is now built directly
- outdated integration tests
- deprecated and unused code
- obsolete testing command `translocate-gaps`

### Fixed
- improved compatibility of pre-compiled binaries by using Conda package
- make alignments with more than 2^^32 local alignments work
- minor compatibility fixes in the container
- broken links in README
- replaced definition list by simple list in README

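
For configuration files written before 3.0.0, the `max_threads` → `threads_per_process` rename can be applied mechanically. The following is a minimal sketch, not part of DENTIST: the helper name `migrate_workflow_config` is hypothetical, and it assumes the workflow configuration has already been loaded into a plain mapping in which the parameter is a top-level key.

```python
def migrate_workflow_config(config):
    """Return a copy of `config` with `max_threads` renamed to `threads_per_process`.

    Assumption: both names are top-level keys of the loaded workflow config.
    """
    migrated = dict(config)
    if "max_threads" in migrated:
        old_value = migrated.pop("max_threads")
        # If the new key is already set explicitly, it takes precedence.
        migrated.setdefault("threads_per_process", old_value)
    return migrated


old_config = {"max_threads": 8}
print(migrate_workflow_config(old_config))  # {'threads_per_process': 8}
```

The input mapping is left untouched, so the old configuration can still be inspected or written back if the migration needs to be undone.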
## [2.0.0] - 2021-06-21

### Added
- list of all commandline options
- example for a greedy DENTIST configuration
- guide on how to release a new version of DENTIST (work in progress)

### Changed
- release v1.0.1 contained breaking changes so this release updates to v2.0.0: the changes to the workflow make it incompatible with old configuration files
- moved Docker image to Ubuntu and reduced size
- improved compatibility of pre-compiled binaries by compiling on Ubuntu 16.04
- sort read IDs in `insertions.db` to make AGP and BED files comparable
- allow `--min-*-coverage` in `dentist mask-repetitive-regions` to be zero
- avoid confusing message about pre-fetching the Singularity image if possible
- updated README

### Removed
- unused argument for `process-pile-ups`
- replaced `Dockerfile.build-release` by regular `Dockerfile`

### Fixed
- fixed `ProtectedOutputException` bug that was listed in the Troubleshooting section of the README
- sort LAS files for daccord without chaining
- buffer overflow in `propagate-mask`
- adjust `read-coverage` in example configuration to actual coverage in the example dataset

## [1.0.2] - 2021-04-26

### Added
- provide pre-built binaries of DENTIST and all dependencies in release tarball
- included unit tests in Docker build
- github.io page

### Changed
- Improved README a lot
- Updated dependencies
- Removed `LAcheck` from the workflow because it is useless (see [issue 14](#14))

### Fixed
- Compiler error and deprecation warnings

## [1.0.1] - 2020-02-22

### Added
- A wonderful logo :-)

### Changed
- Updated README and other docs
- Some jobs in the workflow are grouped to reduce the number of cluster jobs
- Workflow requires a minimum Snakemake version
- Ignoring unused parameter in `process-pile-ups`; will be removed in next major release
- Disentangled workflow configuration for better usability and less build time for Snakemake's DAG

### Removed
- Old documentation parts/details

### Fixed
- Sporadically lost masked regions in mask homogenization
- Handling of cyclic scaffolds
- Overly strict handling of types in DENTIST's config file
- Several minor bugs

## [1.0.0] - 2020-02-04

### Added
- A Docker container! This means you can just `--use-singularity` with Snakemake.
- Workflow rule to just produce all the repeat masks (this is used in the paper to calculate the repeat content of the assemblies)
- Automatic validation of the closed gaps with an alignment of the reads against a preliminary gap-closed assembly:
    - Added command `bed2mask`
    - Optionally write a BED file of closed gaps
    - Added command `validate-regions`
    - Added interface for reading/writing Dazzler track extras which is utilized to communicate the contig and read IDs between `output` and `validate-regions`
- Extensively documented the example workflow config `./snakemake/snakemake.yml`
- Local alignment chaining via command `chain-local-alignments` and internally
- Using chaining to filter/improve pile up alignments
- Added possibility to revert CLI options via `--revert`
- All multi-valued CLI options take their value from a comma-separated list and/or by giving the same option multiple times
- Added `full_validation` flag to workflow to keep the preliminary assembly and validation results
- Added `no_purge_output` flag to workflow to prevent the automatic skipping of invalid gaps; this also will not trigger the validation if not requested explicitly
- Possibility to lazily read local alignments from `.las` file
- Greatly improved performance of reading `.las` files by switching to binary interface
- Possibility to manually skip filling of gaps
- `DBdust` for improved sensitivity in alignments
- Homogenized masks implemented via new command `propagate-mask` which translates a given mask via an alignment from one DB/DAM to another. The masks are propagated from the assembly to the reads and back to gain sensitivity.

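
The multi-valued option behaviour introduced in 1.0.0 (comma-separated lists and/or repeated options) can be illustrated with a short sketch. This is not DENTIST's implementation; it only demonstrates the parsing semantics with Python's `argparse`. The helper name `parse_multi_valued` is hypothetical, and `--mask` is used purely as an example option.

```python
import argparse


def parse_multi_valued(argv, flag):
    """Collect all values of `flag`, whether repeated or comma-separated."""
    parser = argparse.ArgumentParser()
    # `append` gathers every occurrence of the option into a list.
    parser.add_argument(flag, action="append", dest="values")
    namespace = parser.parse_args(argv)
    occurrences = namespace.values or []
    # Each occurrence may itself hold several comma-separated values.
    return [item for occurrence in occurrences
            for item in occurrence.split(",") if item]


print(parse_multi_valued(["--mask", "dust,tan", "--mask", "self"], "--mask"))
# ['dust', 'tan', 'self']
```

Both spellings end up in one flat list in command-line order, so `--mask dust,tan --mask self` and `--mask dust --mask tan --mask self` are equivalent in this sketch.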
## [1.0.0-beta.2] - 2020-07-23

### Added
- Allow use of environment variables in Snakemake workflow config
- Avoid appending to DBs by design
- Improved README:
    - Advice on how to choose parameters
    - Advice on how to run DENTIST with different read types
    - Version information to dependencies
- Log level information to log messages
- More logging on failed gap closing

### Changed
- Simplified usage of `--workdir`: no need to manually create the designated directory
- Improvements to close more gaps:
    - Custom pre-consensus alignment filtering
    - Add support sequence to cropped reads to ensure daligner finds alignments
    - Allow cropping in masked region if necessary
    - Selectively ignore repeat mask to allow post consensus alignments
    - Increased sensitivity in pileup alignments by adding the bridging option of `daligner`
    - Select reference read for consensus by intrinsic QVs → better consensus quality
- Moved flag `--max-insertion-error` from `process` to `output` stage so trying different values becomes much faster
- Automatically deduce trace point spacing in all places
- Faster check if `.las` files are empty → faster CLI options checking
- Naming of temporary files for easier inspection
- Use `DBdust` for post consensus alignment
- Produce `.db` for cropped pileups (temporary files) to make `DAScover` and `DASqv` work
- Removed `-I` option from `daligner` calls (avoid useless alignment)

### Fixed
- Several bugs in Snakemake workflow
- Significantly improved number of closed gaps
- Coordinates in AGP output
- Bug in procedure that identifies a good cropping position
- Error that caused `--proper-alignment-allowance` to have no effect by default

## [1.0.0-beta.1] - 2020-03-17

### Added
- post-consensus alignment and validation with new parameter `--max-insertion-error`
- inserted sequences are highlighted by upper-case letters which can be turned off with `--no-highlight-insertions`
- batch ranges may end with a `$` indicating the end of the pileup DB
- some mechanisms for early error detection
- write duplicate contig IDs to contig alignment cache for easier debugging
- added support for complementary contig alignments in `check-results`
- allow `.db` databases as reference
- improved version reporting
- updated README with additional instructions

### Changed
- integrated Snakemake workflow into a single file and removed "testing" workflow
- cropping and splicing of insertions:
    - existing sequence is completely retained
    - moved from `process-pile-ups` to `output`
    - binary format of insertions DBs (breaking change) to gain more freedom in later steps
    - splice sites are chosen based on the post-consensus alignments
- ambiguities in the alignment of reads are now detected globally
- weakly anchored alignments are discarded early in the filtering pipeline
- the self- and read-alignment-based masks are now computed separately
- coverage values may now be fractional
- improved README by adhering to [Standard Readme][standard-readme]
- better (error) reporting
- temporary files have more informative names
- many minor refactorings and extensions

### Removed
- combined self- and read-alignment-based masking: old behaviour can be copied by using the `--mask` parameter and supplying both masks to all commands

### Fixed
- trying all possible reference reads for consensus in order to find a non-failing reference
- corrected insertion splicing in case of reverse-complement alignment of the consensus
- bug that caused `check-results` to discard all alignments in certain loci
- added missing logic for cropped contigs in `getGapState` in `check-results`

[standard-readme]: https://github.com/RichardLitt/standard-readme