Tags: a-ludi/dentist

## [4.0.0] - 2022-09-02

### Added

- include Git commit in logs
- script to generate report on closed and unclosed gaps
- script `mask2bed` that converts Dazzler masks to BED files
- JSON schema for config file
- debugging output to track down issue #31


### Changed

- preserve original scaffold headers in output FASTA
- ensure unique scaffold IDs in `output`
- output FASTA names/coords in AGP
- parallelized alignment filters
- fail-early measure against a bug in `libmaus2`

### Fixed

- added missing Python installation to Singularity image
- treat long FASTA lines gracefully
- fixed rule `validate_dentist_config`
- fixed install instructions for the Snakemake profile
- fixed bug with newer versions of Snakemake
- include Python files in Git repo
- close open LAS file as soon as possible
- fixed JSON conversion of `AlignmentChain`
- workaround for Phobos v2.099.0 bug
- fixed compiler error
- treat compiler warning

## [3.0.0] - 2021-12-09

### Added

- Conda packages `dentist` and `dentist-core`
- DENTIST's configuration may be in YAML format
- print summary of all commands with `dentist --commands`
- user may select the maximum alignment error rate
- note on a known bug that prohibits using ` :: ` in FASTA headers
- online API documentation
- included the demo example into the main repo
- included JQ in the container for easy inspections
- minimal integration tests that cover the whole pipeline

### Changed

- substantially extended code documentation
- improved documentation of `read-coverage` and friends
- improved error message if no pile ups have been found
- using a fixed version for Containers to avoid caching issues
- renamed workflow parameter `max_threads` → `threads_per_process`
- keep assertions in production code
- allow empty LAS files for masking
- improved pre-push hook to reduce accidental errors

### Removed

- Docker container; the Singularity image is now built directly
- outdated integration tests
- deprecated and unused code
- obsolete testing command `translocate-gaps`

### Fixed

- improved compatibility of pre-compiled binaries by using Conda package
- make alignments with more than 2^32 local alignments work
- minor compatibility fixes in the container
- broken links in README
- replaced definition list by simple list in README

## [2.0.0] - 2021-06-21

### Added
- list of all command-line options
- example for a greedy DENTIST configuration
- guide on how to release a new version of DENTIST (work in progress)

### Changed
- release v1.0.1 contained breaking changes so this release updates to v2.0.0:
  the changes to the workflow make it incompatible with old configuration files
- moved Docker image to Ubuntu and reduced size
- improved compatibility of pre-compiled binaries by compiling on Ubuntu 16.04
- sort read IDs in `insertions.db` to make AGP and BED files comparable
- allow `--min-*-coverage` in `dentist mask-repetitive-regions` to be zero
- avoid confusing message about pre-fetching the Singularity image if possible
- updated README

### Removed
- unused argument for `process-pile-ups`
- replaced `Dockerfile.build-release` by regular `Dockerfile`

### Fixed
- fixed `ProtectedOutputException` bug that was listed in the Troubleshooting
  section of the README
- sort LAS files for daccord without chaining
- buffer overflow in `propagate-mask`
- adjust `read-coverage` in example configuration to actual coverage in the
  example dataset

## [1.0.2] - 2021-04-26

### Added
- provide pre-built binaries of DENTIST and all dependencies in release tarball
- included unit tests in Docker build
- github.io page

### Changed
- Improved README a lot
- Updated dependencies
- Removed `LAcheck` from the workflow because it is useless
  (see [issue 14](#14))

### Fixed
- Compiler error and deprecation warnings

## [1.0.1] - 2021-02-22

### Added
- A wonderful logo :-)

### Changed
- Updated README and other docs
- Some jobs in the workflow are grouped to reduce the number of cluster jobs
- Workflow requires a minimum Snakemake version
- Ignoring unused parameter in `process-pile-ups`; will be removed in next
  major release
- Disentangled workflow configuration for better usability and less build time
  for Snakemake's DAG

### Removed
- Old documentation parts/details

### Fixed
- Sporadically lost masked regions in mask homogenization
- Handling of cyclic scaffolds
- Overly strict handling of types in DENTIST's config file
- Several minor bugs

## [1.0.0] - 2021-02-04

### Added

- A Docker container! This means you can just `--use-singularity` with
  Snakemake.
- Workflow rule to just produce all the repeat masks (this is used in the
  paper to calculate the repeat content of the assemblies)
- Automatic validation of the closed gaps with an alignment of the reads
  against a preliminary gap-closed assembly:
    - Added command `bed2mask`
    - Optionally write a BED file of closed gaps
    - Added command `validate-regions`
    - Added interface for reading/writing Dazzler track extras which is
      utilized to communicate the contig and read IDs between `output` and
      `validate-regions`
- Extensively documented the example workflow config `./snakemake/snakemake.yml`
- Local alignment chaining via command `chain-local-alignments` and internally
- Using chaining to filter/improve pile up alignments
- Added possibility to revert CLI options via `--revert`
- All multi-valued CLI options take their value from a comma-separated list
  and/or by giving the same option multiple times
- Added `full_validation` flag to workflow to keep the preliminary assembly
  and validation results
- Added `no_purge_output` flag to workflow to prevent the automatic skipping
  of invalid gaps; this also will not trigger the validation if not requested
  explicitly
- Possibility to lazily read local alignments from `.las` file
- Greatly improved performance of reading `.las` files by switching to binary
  interface
- Possibility to manually skip filling of gaps
- `DBdust` for improved sensitivity in alignments
- Homogenized masks implemented via new command `propagate-mask` which
  translates a given mask via an alignment from one DB/DAM to another. The
  masks are propagated from the assembly to the reads and back to gain
  sensitivity.
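
The entry above on multi-valued CLI options (comma-separated list and/or repeated option) can be illustrated with a small sketch. This is not DENTIST's actual option parser, only a hypothetical Python illustration of the described behaviour; the function names are made up, and `--mask` is used merely as an example of an option that may take several values.

```python
import argparse


def split_commas(value):
    """Split one occurrence of an option into its comma-separated parts."""
    return [part for part in value.split(",") if part]


def parse_multi_valued(argv):
    """Collect all values of a multi-valued option (hypothetical helper).

    Values may be given as a comma-separated list, by repeating the
    option, or by mixing both styles.
    """
    parser = argparse.ArgumentParser()
    # `append` gathers one list per occurrence of the option;
    # `split_commas` turns each occurrence into a list of values.
    parser.add_argument("--mask", action="append", type=split_commas, default=[])
    args = parser.parse_args(argv)
    # Flatten the per-occurrence lists while keeping the given order.
    return [name for group in args.mask for name in group]


if __name__ == "__main__":
    print(parse_multi_valued(["--mask", "dust,self", "--mask", "reads"]))
```

Both styles end up in one flat list, so downstream code does not need to know how the user supplied the values.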

rm-damapper

Replaced `damapper` by `daligner` + `dentist chain`

## [1.0.0-beta.3] - 2020-07-23

### Added
- Always skip file locking with environment variable `SKIP_FILE_LOCKING=1`

## [1.0.0-beta.2] - 2020-07-23

### Added
- Allow use of environment variables in Snakemake workflow config
- Avoid appending to DBs by design
- Improved README:
    - Advice on how to choose parameters
    - Advice on how to run DENTIST with different read types
    - Version information to dependencies
- Log level information to log messages
- More logging on failed gap closing
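
As a rough illustration of using environment variables in a workflow config, the sketch below expands `$VAR` and `${VAR}` references in every string of an already loaded configuration. It is an assumption-laden example, not the workflow's actual code: the changelog does not state which syntax is supported, and the helper name, the config keys, and the variable `DENTIST_DEMO_DIR` are hypothetical.

```python
import os


def expand_env(value):
    """Recursively expand environment variables in a loaded config.

    Strings are passed through os.path.expandvars; dicts and lists are
    traversed; every other value is returned unchanged.
    """
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    return value


if __name__ == "__main__":
    os.environ["DENTIST_DEMO_DIR"] = "/data/demo"  # hypothetical variable
    config = {"inputs": {"reference": "$DENTIST_DEMO_DIR/reference.fasta"}}
    print(expand_env(config))
```

Note that `os.path.expandvars` leaves references to undefined variables untouched, so a misspelled variable name shows up verbatim in the resulting path.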

### Changed
- Simplified usage of `--workdir`: no need to manually create
  the designated directory
- Improvements to close more gaps:
    - Custom pre-consensus alignment filtering
    - Add support sequence to cropped reads to ensure daligner finds alignments
    - Allow cropping in masked region if necessary
    - Selectively ignore repeat mask to allow post consensus alignments
    - Increased sensitivity in pileup alignments by adding the bridging option
      of `daligner`
- Select reference read for consensus by intrinsic QVs → better
  consensus quality
- Moved flag `--max-insertion-error` from `process` to `output` stage so
  trying different values becomes much faster
- Automatically deduce trace point spacing in all places
- Faster check if `.las` files are empty → faster CLI options checking
- Naming of temporary files for easier inspection
- Use `DBdust` for post consensus alignment
- Produce `.db` for cropped pileups (temporary files) to make `DAScover`
  and `DASqv` work
- Removed `-I` option from `daligner` calls (avoid useless alignment)

### Fixed
- Several bugs in Snakemake workflow
- Significantly improved number of closed gaps
- Coordinates in AGP output
- Bug in procedure that identifies a good cropping position
- Error that caused `--proper-alignment-allowance` to have no effect by default

## [1.0.0-beta.1] - 2020-03-17

### Added
- post-consensus alignment and validation with new parameter
  `--max-insertion-error`
- inserted sequences are highlighted by upper-case letters which can be
  turned off with `--no-highlight-insertions`
- batch ranges may end with a `$` indicating the end of the pileup DB
- some mechanisms for early error detection
- write duplicate contig IDs to contig alignment cache for easier debugging
- added support for complementary contig alignments in `check-results`
- allow `.db` databases as reference
- improved version reporting
- updated README with additional instructions
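
The batch-range entry above (a trailing `$` meaning "end of the pileup DB") can be sketched as follows. The `FROM..TO` separator, the half-open interpretation of the range, and the function name are assumptions made for this example only; the changelog specifies just the meaning of `$`.

```python
def parse_batch_range(spec, db_size):
    """Parse a batch range such as "10..50" or "10..$" (assumed syntax).

    A trailing "$" is replaced by `db_size`, the number of pile ups in
    the DB. Returns the pair (start, end).
    """
    start_text, separator, end_text = spec.partition("..")
    if not separator:
        raise ValueError(f"expected FROM..TO, got {spec!r}")
    start = int(start_text)
    # "$" stands for the end of the pileup DB.
    end = db_size if end_text == "$" else int(end_text)
    if not 0 <= start <= end <= db_size:
        raise ValueError(f"batch range {spec!r} is out of bounds")
    return start, end


if __name__ == "__main__":
    print(parse_batch_range("10..$", db_size=250))
```

Resolving `$` at parse time lets the last batch be requested without first looking up the size of the DB.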

### Changed
- integrated Snakemake workflow into a single file and removed "testing"
  workflow
- cropping and splicing of insertions:
    - existing sequence is completely retained
    - moved from `process-pile-ups` to `output`
    - binary format of insertions DBs (breaking change) to gain more freedom
      in later steps
    - splice sites are chosen based on the post-consensus alignments
- ambiguities in the alignment of reads are now detected globally
- weakly anchored alignments are discarded early in the filtering pipeline
- the self- and read-alignment-based masks are now computed separately
- coverage values may now be fractional
- improved README by adhering to [Standard Readme][standard-readme]
- better (error) reporting
- temporary files have more informative names
- many minor refactorings and extensions

### Removed
- combined self- and read-alignment-based masking: old behaviour can be
  reproduced by using the `--mask` parameter and supplying both masks to all
  commands

### Fixed
- trying all possible reference reads for consensus in order to find a
  non-failing reference
- corrected insertion splicing in case of reverse-complement alignment of the
  consensus
- bug that caused `check-results` to discard all alignments in certain loci
- added missing logic for cropped contigs in `getGapState` in `check-results`

[standard-readme]: https://github.com/RichardLitt/standard-readme