-
Notifications
You must be signed in to change notification settings - Fork 92
Open
Description
- Investigate effect of making
SortedSetbalance less frequently (or even moving to RB trees rather than AVL trees) - Even when there is no need to rebalance,
SortedSetsometimes copies more nodes than is necessary. Fix this. -
SortedSet#delete_attraverses the tree twice. Add a method toAVLNodeso it just needs to traverse the tree once. - When a
SortedSetis created with a 1-param block (to derive sort keys), should the sort keys for each node be cached? (In some cases, the computation to derive the sort key might be expensive.) - In
Set#union, check ifotheris aHamster::Setwhich is larger thanself. If so, flip aroundselfandother. (It will be cheaper to build up the result using the larger trie.) - In
Set#intersection, ifotheris significantly smaller thanself, but is not aHamster::Set, then create an emptyTrie, iterate throughother, and add all the elements which are also inselfto the newTrie. - In
Set#subset?andSet#proper_subset?, ifotheris not a::SetorHamster::Set(i.e.#include?may be slow), and ifotheris large, it may be faster to build a temporary::Setout ofotherfirst, and use that for testing inclusion. (Don't bother using aHamster::Setfor this purpose, it will be slower.) Another option would be to iterate throughother, count the number of elements which are inself, and see if the number is equal toself.size. - Likewise in
Set#disjoint?: ifself.size < other.sizebutotheris not a::SetorHamster::Set, testingother.include?(item)may be slow (and may lead to O(n^2) run time). In this case, it may be better to build a temporary::Setfor testing inclusion, or just use the 2nd branch so the inclusion test is done onself, notother. - In
Trie, add methods for bulk insertion and deletion (which copy less nodes than repeated adding/deletion of single elements). Use these for operations likeSet#unionand so on. - Investigate the effect on
Vectorof using an implementation strategy more like thePersistent::Vectorlibrary. That is, allVectortries would always be composed of full, 32-item nodes. Then, eachVectorwould have a "tail" array of 0-31 additional items at the end.Vector#pushwould only have to modify the trie once for every 32 pushes; the rest of the time, it would just be copying the "tail" array. Likewise forVector#pop. If this proves worthwhile, consider having both "head" and "tail" arrays for odd numbers of items at the front and back. This could also makeVector#shiftand#unshiftfast. Retrieving an item from aVectorby index would require checking whether it falls within the "head", the "tail", or the vector trie itself. - If giving
Vectors a "head" and "tail" proves to be a worthwhile and successful strategy, compare its performance as a stack/queue withDeque. IfVectorcan be made almost as efficient asDequefor that purpose, dropDequeand allowVectorto replace it. - Otherwise, if keeping
Dequeis necessary, try to mitigate the nasty worst-case scenario which happens when a largeDequeis built up, and then items are alternately popped from the front and back. (It will keep reversing the list, and reversing it back again.) If, say, we are reversing@rearand moving the items to@front, I would suggest: copy the first 1/3 of the items and keep them on@rear. Reverse the remaining 2/3 and put them on@front. - There is another, less deadly worst-case scenario when retrieving the
#firstor#lastitem from aDequewith all the items in a long@frontor@rear. Perhaps eachDequecan keep the last item in@frontand@rearin instance variables, so#firstand#lastcan always be returned quickly.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels