Evaluate other assemblers #665

@donkirkby

We've had reasonably good results with IVA, but it has a few problems:

  1. Some samples are very slow to assemble, even taking multiple days. Samples with random primers seem to be slow.
  2. We find contigs with a lot of repeats, possibly caused by primer dimers. That may or may not be a problem with the assembler.
  3. The IVA project is not really maintained anymore, so it may be wiser to find another tool.
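One rough way to flag the repeat-heavy contigs from item 2 is to measure how much of a contig is covered by k-mers that occur more than once. The sketch below is only an illustration: the function name `repeated_kmer_fraction` and the default `k=20` are assumptions, not part of IVA or of this project's pipeline, and a high score does not by itself show that primer dimers are the cause.

```python
from collections import Counter


def repeated_kmer_fraction(seq: str, k: int = 20) -> float:
    """Return the fraction of k-mer positions whose k-mer occurs more than once.

    Hypothetical helper for screening contigs; k=20 is an arbitrary default.
    """
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        # Sequence is shorter than k, so there is nothing to count.
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(total))
    # Sum the occurrences of every k-mer that appears at least twice.
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / total
```

A contig made entirely of a short tandem repeat scores 1.0, while a contig with no repeated k-mers scores 0.0, so sorting contigs by this value gives a quick list of candidates to inspect by hand.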

A brief search for alternatives turned up SPAdes, as mentioned in an article by Sutton, as well as tadpole, mentioned in a discussion forum.

An interesting sample for experimenting on is D62201-HCV_S3 from the 26 Feb 2016.M04401 run. It looks like a mixed HCV infection with two or possibly three strains to assemble.

Edit: To Do List

  • Investigate where IVA is struggling.
  • Try SPAdes.
  • Try ABySS.
  • Use ntJoin to scaffold the contigs to a reference genome.
  • Create a pessimistic mode for IVA for a quick attempt at assembly.
  • Improve IVA's pessimistic mode by making it use the filtered reads in each step.
  • Update to latest IVA fixes.
  • Try Velvet.
  • Optimise Velvet's input parameters / automate the determination of the parameters for each sample.
  • Try Haploflow.
  • Use contig merging and scaffolding to improve Haploflow's assembly results.
  • Improve RP results for Haploflow.
  • Compare IVA and Haploflow quantitatively.
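For the quantitative comparison in the last item, a simple starting point is to compute the same summary statistics from each assembler's contig FASTA output. The sketch below assumes plain FASTA input and uses contig count, total length, longest contig, and N50. These metrics and the helper names (`read_fasta_lengths`, `n50`, `summarize`) are illustrative assumptions; the issue does not specify which measures the comparison should use, and none of them account for correctness against a reference.

```python
def read_fasta_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # Length of the record being read, or None before the first header.
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50: the contig length at which the running total,
    taken from longest to shortest, reaches half of the assembly size."""
    if not lengths:
        return 0
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


def summarize(lengths):
    """Collect basic assembly statistics into a dictionary."""
    return {
        "contigs": len(lengths),
        "total": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }
```

Passing an open file handle for each assembler's contigs to `read_fasta_lengths` and then to `summarize` produces dictionaries that can be compared side by side for the same sample.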
