Skip to content

How to use RTree as a set? #82

@akriegman

Description

@akriegman

I am trying to do the following operations on an RTree so that I can use it as a set:

  • Insert a point if not already present
  • merge (union) two RTrees without duplicates

For the first operation this can be done with existing methods on RTree:

if !tree.contains(&p) { tree.insert(p); }

However, while I don't know the internals of RTree I'm guessing this approach traverses the tree twice when it's only necessary to traverse it once. Am I right that we could do this operation more efficiently if we worked with the internals of RTree, for example adding an insert_unique method? The above approach is probably good enough for my use case anyways.

The second operation is a little harder. Here is my code for merging two trees if I didn't care about duplicates:

let tree = RTree::bulk_load(
    left.iter()
    .chain(right.iter())
    .copied()
    .collect()
);

From there I guess I could iterate over the tree checking whether each element equals it's second nearest neighbor to remove duplicates, but this feels hacky. I'd imagine that if we had some sort of insert_unique method, then we could rewrite bulk_load using insert_unique in place of insert to get a method that makes a tree without duplicates.

Another alternative is I can just accept duplicates in my set. Then in order to pop / remove an element I would have to remove all instances like this:

while let Some(_) = tree.remove(&p) {}

which wouldn't be too costly if there's only two or three of each element at most.

So I think my options for using RTree as a set are:

  • Continue hacking together set operations using the exposed API on RTree, probably getting a half or a third the performance of what's possible.
  • Work with the internals of RTree to do these operations more efficiently. I could either structure this as a wrapper struct around RTree in my own crate, or contribute this to rstar. I'd imagine I'd have to be the one to contribute this to rstar, since I think this use case is not a very common one.

So what are people's thoughts? Is anyone else interested in having an RTreeSet in rstar? How much worse will my performance be if I continue using the hacky lazy approach?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions