It's useful to have a summary of the average arity of nodes in the trees. Here's a function that computes it:
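import numpy as np

def nodes_mean_arity(ts):
    weighted_arity = np.zeros(ts.num_nodes)
    span_in_tree = np.zeros(ts.num_nodes)
    for tree in ts.trees():
        # Drop the last entry, which belongs to the virtual root.
        num_children = tree.num_children_array[:-1]
        weighted_arity += num_children * tree.span
        # Internal nodes only accumulate span where they have children.
        span_in_tree[num_children > 0] += tree.span
    # Samples are averaged over the whole sequence.
    span_in_tree[ts.samples()] = ts.sequence_length
    # Non-sample nodes that never have children give 0 / 0 = nan.
    weighted_arity /= span_in_tree
    return weighted_arity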
Could we add this as a method on TreeSequence? (The name is chosen to follow the existing arrays such as nodes_time, etc.)
We can imagine having different options for weighting, at some point.
To do it really efficiently, we'd need to implement it in C with an edge-diffs approach, but this implementation is probably fast enough for most purposes.
Although, I wonder if there is a way of expressing the operation as a node stat?