-
Notifications
You must be signed in to change notification settings - Fork 76
Efficient seeking to trees. #2661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2661 +/- ##
==========================================
+ Coverage 89.82% 89.83% +0.01%
==========================================
Files 29 30 +1
Lines 28514 28612 +98
Branches 5561 5587 +26
==========================================
+ Hits 25613 25704 +91
- Misses 1653 1654 +1
- Partials 1248 1254 +6
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 2 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
test_trees segfaults when updating samples lists. I'm having trouble tracking down what I've broken. |
Note to self: I may be not considering the direction of iteration during setup. |
Ahhh, now I get it! In big O terms, this PR seeks to a random tree in O(num_edges + num_samples), since we iterate over all edges in order to pull out those that intersect with the tree's interval, and then build that tree. We don't incur a But, using the standard code I think the complexity is The O(num_edges) cost is the problem, and is what we discussed in #684. Without something that's O(log(num_edges) + num_samples) for seeking to an arbitrary tree, I'm not all that enthusiastic about the extra complexity, as I think it'll be quite hard to figure out how to tune the current approach. What sort of performance difference are you observing in your Rust experiments @molpopgen? |
I see a general 5x-ish improvement in tree access times unless the trees being sought are those at the edges of the tree sequence. Edited: 10x is the max gain. It will be closer to 5x on average. |
@jeromekelleher -- this is a benchmark result that I uploaded to the tskit Slack way back when. |
@jeromekelleher -- here is another one from Slack, comparing to the current seek method for a larger tree sequence. Circles are the current tskit, purple diamonds are my early experiments. This shows that the benefit does vary depending on what the tree index is, but the new method is better except at the edges. Also, the recent Python prototype is similar in being about an order of magnitude faster for an arbitrary seek (again, except for the trees at the ends). |
@jeromekelleher -- I'll save on uploading more images, but I seem to be iterating towards something w/my rust prototype that is always faster than the current implementation, at least for the first half of the trees ;) |
Perhaps you know this, but I got curious, and the problem we want to solve here is the "interval stabbing problem"; here is an approach that is |
Thanks for the plots @molpopgen, these are really helpful. I've no doubt we could do better at the "starting from NULL seek to the kth tree" problem (which is what we're timing here I think, since that's interesting from the parallel computation viewpoint). We probably don't need to store the additional three |
Something like this: def seek(self, position):
"""
Seek to the tree at the position.
"""
sequence_length = self.sequence_length
M = self.edges_left.shape[0]
in_order = self.edge_insertion_order
out_order = self.edge_removal_order
edges_left = self.edges_left
edges_right = self.edges_right
edges_parent = self.edges_parent
edges_child = self.edges_child
j = 0
k = 0
left = 0
right = -1
while j < M or left <= sequence_length:
while k < M and edges_right[out_order[k]] == left:
k += 1
while j < M and edges_left[in_order[j]] == left:
if edges_left[in_order[j]] <= position < edges_right[in_order[j]]:
p = edges_parent[in_order[j]]
c = edges_child[in_order[j]]
self.insert_edge(p, c)
j += 1
right = sequence_length
if j < M:
right = min(right, edges_left[in_order[j]])
if k < M:
right = min(right, edges_right[out_order[k]])
if left <= position < right:
# We're in the target tree
break
left = right
# In the C code we'd now make sure to set the various counters so that we can
# advance in either direction based on j and k A quick numba version I just knocked up on some arbitrary simulations was 2X faster than the |
Here are some raw timings. I went nuts and rewrote all this in rust, but I was smarter this time and just made The two methods can be made very close. The indexing approach has what looks like a 2x advantage over what @jeromekelleher posted. Some details:
|
I have managed to get rid of the positive slope in my last graphic for the indexed method. Working on it for the "jk" approach but I may be out of smart pills/time for the day. |
Okay, I have a working prototype and will do a do-over on this PR. I managed to speed up @jeromekelleher's method by quite a bit -- you can do the same thing w/less logic. The scale of this graph makes it hard to see, but the |
78c05a8
to
9243381
Compare
@jeromekelleher -- I pushed a completely new version. It'd be helpful for you to have a look. I haven't dug into figure out why the failing tests fail. |
Note to self: the final test is failing b/c something is broken with respect to advancing trees. I can reproduce this in my other prototype. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, but I don't think it's correct at the moment. Seeking from either end won't make any difference I think, so I'd just go in one direction to make things simpler.
You need to reason about the edge_removal index also, that's currently missing here.
@jeromekelleher -- I'm quite puzzled now. If you look at the last commits, I am comparing what I think the tree is to what tskit generates. For some reason, sample tracking seems to not be working when seeking to a tree using the new method. |
I think that this is due to the semantics of setting up tracked samples -- the flag is not processed during tree initialization. |
c7af485
to
abcc4e8
Compare
I think this one is close to ready. |
76a95ff
to
bc50233
Compare
Fixing that segfault may be fairly straightforward, it just wasn't obvious to me when I looked at it last night how it could happen - the code in question looked watertight to me. |
19d2b90
to
c2a4b47
Compare
@jeromekelleher -- the valgrind issue and segfaults are due to undefined behavior introduced here (line 4636 in case it isn't clear from the link). |
Doh, was that my fault? Sorry! |
So we're back to a situation where all the tests are passing, etc.. Shall I squash this one? |
Yes please! |
a0dbc4e
to
6763118
Compare
AFAICT, this looks good to go? |
Yep, looks good, just needs a few admin things like updating changelogs, linking to issues and some docs. Leave it with me. I just added a commit to use the c seek_index implementation. |
"jupyter-book not found" 😆 |
8fd1913
to
acbeddd
Compare
Opened #2740 to track updating the docs. This is ready to go now once we get CI working again. |
acbeddd
to
d62b592
Compare
@Mergifyio rebase |
* Add seeking to a tree by index. Closes tskit-dev#2659
✅ Branch has been successfully rebased |
d62b592
to
865caf5
Compare
Adds tree indexing to tsk_treeseq_t.
The indexes allow efficient advancing to arbitrary trees.
Fixes #2659