Description
Moving to the next and previous tree in a sequence is currently implemented in tsk_tree_advance, and seeking to a position along the sequence (from the initial state) is done in tsk_tree_seek_from_null.
We can replace this logic with calls to tsk_tree_position_t, following the patterns used in #2786.
We should do some basic performance testing to check that we haven't introduced any regressions by doing this, for example by timing tree traversal and seeking on:
- A large msprime simulated tree sequence
- A large chromosome from the unified genealogy trees
- A SARS-CoV-2 tree sequence
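
A minimal sketch of the kind of timing harness that could be used for these checks. It only uses the standard library; `best_time` and `report` are hypothetical helper names, and the workload in the demo is a stand-in rather than a real tree sequence operation.

```python
import time


def best_time(fn, repeats=5):
    """Return the minimum wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def report(workloads, repeats=5):
    """Time each named zero-argument workload and return {name: seconds}."""
    return {name: best_time(fn, repeats) for name, fn in workloads.items()}


if __name__ == "__main__":
    # Stand-in workload (an assumption for illustration): replace with
    # callables that exercise the tree sequence being benchmarked.
    demo = {"stand_in_loop": lambda: sum(range(100_000))}
    for name, seconds in report(demo).items():
        print(f"{name}: {seconds:.6f}s")
```

In practice the workloads would be things like a full pass over `ts.trees()`, or repeated `tree.seek(x)` calls on a freshly created `tskit.Tree(ts)`, run once on the current build and once on the branch so the two sets of numbers can be compared for each of the three datasets.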
Note that we didn't implement seek_backwards in #2786, as I thought this would be a good exercise for someone learning how all this stuff works. The first part of closing this issue would be a commit that implements seek_backwards in Python, and adds some more testing to ensure it's correct.
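
To give a rough idea of the logic involved, here is a toy sketch of seeking backwards from the null state, checked against stepping backwards one tree at a time. It is not the tskit implementation: edges are plain `(left, right, parent, child)` tuples, the function names are made up for illustration, and it only tracks which edges are in the target tree rather than the edge ranges that a real tree position would record.

```python
# Toy model only: edges are (left, right, parent, child) tuples, not tskit tables.
EDGES = [
    (0, 7, 3, 0), (0, 7, 3, 1),
    (7, 10, 4, 1), (7, 10, 4, 2),
    (0, 3, 5, 2), (0, 3, 5, 3),
    (3, 7, 6, 2), (3, 7, 6, 3),
    (7, 10, 6, 0), (7, 10, 6, 4),
]
SEQUENCE_LENGTH = 10


def breakpoints(edges, sequence_length):
    """Sorted coordinates at which the set of edges changes."""
    points = {0, sequence_length}
    for left, right, _, _ in edges:
        points.update((left, right))
    return sorted(points)


def seek_backward_from_null(edges, sequence_length, index):
    """Jump from the null state straight to tree `index`, scanning from the right."""
    tree_left = breakpoints(edges, sequence_length)[index]
    # Visit edges in order of decreasing right coordinate.
    by_right_desc = sorted(range(len(edges)), key=lambda j: -edges[j][1])
    active = set()
    for j in by_right_desc:
        left, right, _, _ = edges[j]
        if right <= tree_left:
            break  # every remaining edge ends at or before the target tree
        if left <= tree_left:
            active.add(j)  # edge overlaps the target tree's interval
    return active


def step_backward_from_null(edges, sequence_length, index):
    """Reference version: move back one tree at a time until reaching `index`."""
    bps = breakpoints(edges, sequence_length)
    num_edges = len(edges)
    by_right_desc = sorted(range(num_edges), key=lambda j: -edges[j][1])
    by_left_desc = sorted(range(num_edges), key=lambda j: -edges[j][0])
    active = set()
    i = o = 0
    for t in range(len(bps) - 2, index - 1, -1):
        right = bps[t + 1]
        # Drop edges that only begin at this tree's right boundary.
        while o < num_edges and edges[by_left_desc[o]][0] == right:
            active.discard(by_left_desc[o])
            o += 1
        # Add edges that end at this tree's right boundary.
        while i < num_edges and edges[by_right_desc[i]][1] == right:
            active.add(by_right_desc[i])
            i += 1
    return active
```

The testing for the real seek_backwards could follow the same shape: for every tree index, the state reached by a direct seek should match the state reached by repeatedly moving to the previous tree.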
Note that this doesn't address the issues in #2792, but that can be done as a follow-up, since we're not going to make performance any worse by doing it this way.
We would continue to implement seek by iteration for the non-null case, because we're less sure about the performance implications of this, and we'd need some additional testing to verify that it's correct.
@duncanMR - are you up for this?