-
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use same size for initial capacity of Builder in transformations #3
base: main
Are you sure you want to change the base?
Use same size for initial capacity of Builder in transformations #3
Conversation
Hi @Nillerr , thanks so much for your suggestion! I thought a lot about whether to use some sort of initial capacity estimate for the builders when the resulting size isn't known in advance. There are some gotchas that would make the performance worse while also increasing memory consumption:
So overall, I think that most developers would be disappointed if memory consumption increased after swapping read-only lists with Immutable Arrays. My goal is that Immutable Arrays would have the same or lower memory consumption while also improving performance. Having said all that, I think we can approach this from a different angle and perhaps achieve the seemingly impossible result of getting the best of both worlds. Stepping back, the real reason we're trying to avoid re-sizing operations is because each resize has to copy all the current elements into the new larger array. I'm tinkering with builder enhancements that give us the best of both worlds by starting with low capacity and also avoiding any copying when resizing. The idea is for the builder to store a linked list of increasing arrays so that any new capacity would just create another link and leave the current elements as is. Copying from the links into the final array shouldn't be worse as this allows us to copy only a single time instead of each time it's resized. This should also minimize large memory requests as the builder will have the elements split into chunks instead of one much larger array. I haven't submitted any code yet as I'm tinkering with throw-away benchmarks to measure different ideas. Just a heads up that my main focus right now is to improve documentation, add a bunch of missing features that are expected to be commonly used, and after that take a deeper look at improving performance further a few releases down the line. |
The filter method with initial size set allows to remove the builder with its capacity checks completely. That gave around 25% performance gain on my system for the test scenario you use for reference arrays. That's significant. May it is an option to define a size value for what is considered as "small" array and then use different implementations for "small" and "big" arrays. Thats the code i benchmarked:
|
I also tested another 'branchless' version that could be done when there is no builder. The results speak for itself. Thats the code with branchless loop and without builder:
(pls let me know if this is the wrong place for my comments) |
Regarding: Thats some kind of an Unrolled linked list, i talked about here. |
2670d5f
to
6e58cae
Compare
97b830e
to
5689779
Compare
Hi @daniel-rusu!
Great idea! While most Kotlin applications certainly don't need to worry about squeezing out the efficiency of using something like
ImmutableArray
, it is undoubtedly a valuable tool to add to the Kotlin ecosystem. Your implementation is great!This PR changes the use of
Builder()
toBuilder(size)
based on the idea that some of these operations would typically result in arrays of around the same size as the original array, while other operations will never result in an array exceeding the size of the original array, and as such having an initial capacity of the same size means we can avoid repeatedly resizing the array of the builder. E.g. for something likefilter
andmapNotNull
we know that the result is at most an array of the same size as the original array, so by having an initial capacity capable of holding all the elements, we will see at most 2 array allocations.Looking forward to your feedback!