
More performance optimizations #136

@alexdowad

Description

  • Investigate the effect of making SortedSet rebalance less frequently (or even moving to red-black trees rather than AVL trees)
  • Even when there is no need to rebalance, SortedSet sometimes copies more nodes than is necessary. Fix this.
  • SortedSet#delete_at traverses the tree twice. Add a method to AVLNode so it only needs to traverse the tree once. (See the single-pass delete sketch after this list.)
  • When a SortedSet is created with a 1-param block (to derive sort keys), should the sort keys for each node be cached? In some cases, the computation to derive the sort key might be expensive. (See the sort-key caching sketch after this list.)
  • In Set#union, check whether other is a Hamster::Set which is larger than self. If so, swap self and other, since it will be cheaper to build up the result using the larger trie. (See the union sketch after this list.)
  • In Set#intersection, if other is significantly smaller than self but is not a Hamster::Set, create an empty Trie, iterate through other, and add to the new Trie all the elements which are also in self. (See the intersection sketch after this list.)
  • In Set#subset? and Set#proper_subset?, if other is not a ::Set or Hamster::Set (i.e. its #include? may be slow) and other is large, it may be faster to first build a temporary ::Set out of other and use that for testing inclusion. (Don't bother using a Hamster::Set for this purpose; it will be slower.) Another option would be to iterate through other, count the number of elements which are in self, and check whether that count equals self.size. (See the subset/disjoint sketch after this list.)
  • Likewise in Set#disjoint?: if self.size < other.size but other is not a ::Set or Hamster::Set, testing other.include?(item) may be slow (and may lead to O(n^2) run time). In this case, it may be better to build a temporary ::Set for testing inclusion, or just use the 2nd branch so the inclusion test is done on self rather than other. (See the subset/disjoint sketch after this list.)
  • In Trie, add methods for bulk insertion and deletion (which copy fewer nodes than repeated addition/deletion of single elements). Use these for operations like Set#union and so on.
  • Investigate the effect on Vector of using an implementation strategy more like the Persistent::Vector library: all Vector tries would always be composed of full, 32-item nodes, and each Vector would carry a "tail" array of 0-31 additional items at the end. Vector#push would then only have to modify the trie once for every 32 pushes; the rest of the time, it would just copy the "tail" array. Likewise for Vector#pop. If this proves worthwhile, consider having both "head" and "tail" arrays for odd numbers of items at the front and back, which could also make Vector#shift and #unshift fast. Retrieving an item from a Vector by index would then require checking whether it falls within the "head", the "tail", or the vector trie itself. (A reduced sketch of the "tail" idea appears after this list.)
  • If giving Vectors a "head" and "tail" proves to be a worthwhile and successful strategy, compare its performance as a stack/queue with Deque. If Vector can be made almost as efficient as Deque for that purpose, drop Deque and allow Vector to replace it.
  • Otherwise, if keeping Deque is necessary, try to mitigate the nasty worst-case scenario that occurs when a large Deque is built up and items are then alternately popped from the front and back. (It will keep reversing the list, and then reversing it back again.) If, say, we are reversing @rear and moving the items to @front, I would suggest: copy the first 1/3 of the items and keep them on @rear, then reverse the remaining 2/3 and put them on @front. (See the deque sketch after this list.)
  • There is another, less severe worst-case scenario when retrieving the #first or #last item from a Deque with all the items in a long @front or @rear. Perhaps each Deque could keep the last item of @front and @rear in instance variables, so #first and #last can always be returned quickly. (The deque sketch after this list touches on this too.)
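
A rough illustration of the single-pass delete: the sketch below uses a hypothetical size-annotated node class (SketchNode, its EMPTY sentinel, and the naive merge are all inventions for this example, and AVL rebalancing is left out entirely). The only point is that one recursive traversal can hand back both the new subtree and the removed item, instead of calling #at and then #delete separately.

```ruby
# Hypothetical size-annotated node, *not* the real AVLNode: one recursive pass
# returns both the new subtree and the item removed at `index`.
class SketchNode
  EMPTY = Object.new
  def EMPTY.size; 0; end
  def EMPTY.empty?; true; end
  def EMPTY.delete_at(_index); [self, nil]; end

  attr_reader :item, :left, :right, :size

  def initialize(item, left = EMPTY, right = EMPTY)
    @item, @left, @right = item, left, right
    @size = left.size + right.size + 1
  end

  def empty?; false; end

  # Returns [new_subtree, removed_item] in a single traversal.
  def delete_at(index)
    if index < left.size
      new_left, removed = left.delete_at(index)
      [SketchNode.new(item, new_left, right), removed]
    elsif index == left.size
      [merge(left, right), item]
    else
      new_right, removed = right.delete_at(index - left.size - 1)
      [SketchNode.new(item, left, new_right), removed]
    end
  end

  private

  # Naive merge of the two subtrees left behind when this node is removed.
  def merge(l, r)
    return r if l.empty?
    return l if r.empty?
    new_r, promoted = r.delete_at(0) # promote the right subtree's leftmost item
    SketchNode.new(promoted, l, new_r)
  end
end

root = SketchNode.new(2, SketchNode.new(1), SketchNode.new(3))
rest, removed = root.delete_at(1) # removed == 2, rest holds 1 and 3
```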
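
On the sort-key caching question, here is a minimal sketch of the idea, assuming a hypothetical entry struct and a plain sorted Array standing in for the real AVL tree; the point is just that the 1-param block runs once per item at insertion time, not once per comparison.

```ruby
# Hypothetical wrapper (not SortedSet's real node layout): the derived sort key
# is computed exactly once, when the entry is created, and reused afterwards.
CachedEntry = Struct.new(:item, :sort_key)

class KeyCachingSortedList
  def initialize(&key_block)
    @key_block = key_block
    @entries = [] # kept ordered by the cached sort_key
  end

  def add(item)
    entry = CachedEntry.new(item, @key_block.call(item)) # key derived once
    index = @entries.bsearch_index { |e| e.sort_key >= entry.sort_key } || @entries.size
    @entries.insert(index, entry)
    self
  end

  def to_a
    @entries.map(&:item)
  end
end

# The (possibly expensive) key derivation runs once per added item,
# not once per comparison during insertion.
words = KeyCachingSortedList.new { |w| w.downcase.reverse }
words.add("Banana").add("apple").add("Cherry")
words.to_a # => ["Banana", "apple", "Cherry"] (ordered by reversed lowercase form)
```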
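
For the Set#union size check, a sketch under the assumption that folding the smaller collection into the larger set is acceptable; union_sketch is a free-standing helper invented for illustration, not Hamster's actual implementation, which would work on the underlying Trie directly.

```ruby
require "hamster/set"

# Start from whichever Hamster::Set is larger and add the smaller collection's
# items into it, so fewer trie nodes get copied.
def union_sketch(this, other)
  if other.is_a?(Hamster::Set) && other.size > this.size
    this, other = other, this # build on top of the larger trie
  end
  other.reduce(this) { |result, item| result.add(item) }
end

small = Hamster::Set[1, 2]
big   = Hamster::Set[1, 2, 3, 4, 5, 6]
union_sketch(small, big) # internally builds on `big`, adding only 1 and 2
```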
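
For the Set#intersection case, a similar free-standing sketch; a real implementation would fill a fresh Trie directly rather than going through Hamster::Set#add, so treat the helper below as illustrative only.

```ruby
require "hamster/set"

# When `other` is much smaller than self and not already a Hamster::Set,
# iterate over `other` and keep only the items also present in self.
def intersection_sketch(this, other)
  if !other.is_a?(Hamster::Set) && other.size < this.size
    other.reduce(Hamster::Set[]) do |result, item|
      this.include?(item) ? result.add(item) : result
    end
  else
    this & other # otherwise fall back to the existing implementation
  end
end

big_set = Hamster::Set[*(1..100)]
intersection_sketch(big_set, [3, 5, 99, 12_345]) # walks only the small Array
```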
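
For the subset?/proper_subset?/disjoint? items, the sketch below shows both strategies: the counting variant (which assumes other contains no duplicate elements) and a temporary ::Set built purely to make repeated #include? calls cheap. The method names are hypothetical.

```ruby
require "set"
require "hamster/set"

# Counting approach for subset?: one pass over `other`, no temporary collection.
# Assumes `other` has no duplicates.
def subset_sketch?(this, other)
  other.count { |item| this.include?(item) } == this.size
end

# Temporary-::Set approach for disjoint?: when `other` is a plain Enumerable
# whose #include? may be slow, copy it into a ::Set once so each membership
# test during the scan is cheap.
def disjoint_sketch?(this, other)
  unless other.is_a?(::Set) || other.is_a?(Hamster::Set)
    other = ::Set.new(other)
  end
  this.none? { |item| other.include?(item) }
end

this = Hamster::Set[1, 2, 3]
subset_sketch?(this, [3, 2, 1, 0])    # => true
disjoint_sketch?(this, (10..20).to_a) # => true
```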
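
For the Vector "tail" idea, a heavily reduced sketch: an Array of full 32-item chunks stands in for the real trie, and all names here (TailedVector, BLOCK, @chunks) are invented for the example. The point is that #push touches the "trie" only once every 32 pushes; the rest of the time it copies only the small tail.

```ruby
class TailedVector
  BLOCK = 32

  def initialize(chunks = [], tail = [])
    @chunks = chunks # stand-in for the trie: each entry is a full 32-item array
    @tail   = tail   # 0..31 most recently pushed items
  end

  def size
    @chunks.size * BLOCK + @tail.size
  end

  def push(item)
    new_tail = @tail + [item]
    if new_tail.size == BLOCK
      # Only now does the "trie" get modified: the full tail becomes a new chunk.
      TailedVector.new(@chunks + [new_tail], [])
    else
      TailedVector.new(@chunks, new_tail)
    end
  end

  def get(index)
    chunk, offset = index.divmod(BLOCK)
    chunk < @chunks.size ? @chunks[chunk][offset] : @tail[offset]
  end
end

v = (0...40).reduce(TailedVector.new) { |vec, i| vec.push(i) }
v.get(5)  # => 5, from the first full chunk
v.get(39) # => 39, from the tail
```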
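
Finally, for the two Deque items, a sketch of the 1/3-2/3 split using plain Arrays in place of the cons lists Hamster's Deque actually uses; SplitDeque and its method bodies are illustrative assumptions, not the real implementation.

```ruby
# When the front side runs dry, only the front-most 2/3 of @rear is reversed
# across; the remaining 1/3 stays put, so alternating front/back removals can
# no longer force a full reversal every time.
class SplitDeque
  def initialize(front = [], rear = [])
    @front = front # index 0 = front-most item
    @rear  = rear  # index 0 = back-most item
  end

  def push(item) # add at the back
    SplitDeque.new(@front, [item] + @rear)
  end

  def shift # remove from the front
    if @front.empty?
      return self if @rear.empty?
      keep  = @rear.size / 3
      moved = @rear[keep..-1].reverse # front-most item ends up at moved[0]
      SplitDeque.new(moved.drop(1), @rear[0, keep])
    else
      SplitDeque.new(@front.drop(1), @rear)
    end
  end

  # With cons lists, the `.last` calls below would be O(n) when one side holds
  # everything; caching the two end items in instance variables (as the last
  # bullet suggests) would keep #first/#last O(1) in every case.
  def first
    @front.first || @rear.last
  end

  def last
    @rear.first || @front.last
  end
end

dq = (1..9).reduce(SplitDeque.new) { |d, i| d.push(i) }
dq.first       # => 1
dq.shift.first # => 2; the three back-most items never changed sides
```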
