Improved algorithm for site-mode divergence matrix #2779

Closed
@jeromekelleher

Description

The divergence matrix code merged in #2736 gives a reasonably efficient way to compute the pairwise branch-mode divergence among a set of samples, but it performs very poorly in site mode. This is probably because we're not using the fast MRCA calculations in the inner loop.

The simplest way to do this may be to keep a count of the total number of mutations that are ancestral to each node as we go along the genome, and then we should be able to use the same framework as the branch stat calculation.
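
As a rough illustration of the idea (not the tskit implementation), here is a minimal single-tree sketch in plain Python. It assumes one rooted tree given as a `parent` array with `-1` for the root, and a hypothetical `mutations_above` array giving the number of mutations on the branch above each node. All function names are made up for this example, and the MRCA lookup is deliberately naive rather than the fast calculation mentioned above. It also counts mutations along the path between two samples, which only matches the number of differing sites if no site has more than one mutation on that path.

```python
def ancestral_mutation_counts(parent, mutations_above):
    """Return counts[u]: mutations on the path from u to the root,
    including the branch directly above u (hypothetical helper)."""
    counts = [None] * len(parent)
    for start in range(len(parent)):
        # Walk up until we hit the root or a node that is already resolved.
        path = []
        u = start
        while u != -1 and counts[u] is None:
            path.append(u)
            u = parent[u]
        total = 0 if u == -1 else counts[u]
        # Fill in the counts on the way back down.
        for v in reversed(path):
            total += mutations_above[v]
            counts[v] = total
    return counts


def naive_mrca(parent, u, v):
    """Most recent common ancestor by walking to the root.
    Assumes u and v are in the same rooted tree."""
    ancestors = set()
    while u != -1:
        ancestors.add(u)
        u = parent[u]
    while v not in ancestors:
        v = parent[v]
    return v


def site_divergence_matrix(parent, mutations_above, samples):
    """Pairwise mutation counts between samples in one tree, mirroring the
    branch-mode formula with ancestral mutation counts in place of times."""
    counts = ancestral_mutation_counts(parent, mutations_above)
    k = len(samples)
    D = [[0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a + 1, k):
            m = naive_mrca(parent, samples[a], samples[b])
            d = counts[samples[a]] + counts[samples[b]] - 2 * counts[m]
            D[a][b] = D[b][a] = d
    return D


# Toy tree: samples 0, 1, 2; node 3 is the parent of 0 and 1; node 4 is the root.
parent = [3, 3, 4, 4, -1]
mutations_above = [1, 0, 2, 1, 0]
D = site_divergence_matrix(parent, mutations_above, samples=[0, 1, 2])
```

In a full version the per-node counts would need to be updated incrementally as edges are removed and inserted between trees, rather than recomputed from scratch for each tree as this sketch does.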

Keeping track of windows will complicate this, though.

Doing this will need an incremental algorithm that can start and stop at a given point efficiently, so we should probably get some C code in for doing this in a generic way first, as discussed in #2778.

Metadata

Labels

Performance: This issue addresses performance, either runtime or memory

Projects

Status

Done
