Description
For each chunk, vectors are allocated. Example: https://github.com/jamessewell/pgingester/blob/main/src/main.rs#L608
These vectors could be allocated outside the chunk loop and re-used in each iteration.
Clearing them at the end of each iteration would remove their items but keep their capacity (the underlying allocation).
The current implementation deallocates completely at the end of each iteration and reallocates at the start of the next iteration.
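A minimal sketch of the reuse pattern described above, not the actual pgingester code. The function name `process_chunks`, the `(i64, f64)` row type, and the `ids`/`values` column buffers are hypothetical and chosen only for illustration:

```rust
/// Hypothetical stand-in for the chunk loop. Returns the number of rows
/// processed and the final capacity of one of the reused buffers.
fn process_chunks(rows: &[(i64, f64)], batch_size: usize) -> (usize, usize) {
    // `slice::chunks` panics on a chunk size of 0, so reject it up front.
    assert!(batch_size > 0, "batch_size must be non-zero");

    // Allocate the column buffers once, outside the chunk loop.
    let mut ids: Vec<i64> = Vec::with_capacity(batch_size);
    let mut values: Vec<f64> = Vec::with_capacity(batch_size);
    let mut sent = 0;

    for chunk in rows.chunks(batch_size) {
        // Fill the buffers; a chunk never holds more than `batch_size` rows,
        // so these pushes stay within the capacity reserved above.
        for &(id, value) in chunk {
            ids.push(id);
            values.push(value);
        }

        // Assumption: the real code would bind `ids` and `values` as query
        // parameters here. This sketch only counts the rows.
        sent += ids.len();

        // `clear` drops the items but keeps the allocation for the next chunk.
        ids.clear();
        values.clear();
    }

    (sent, ids.capacity())
}
```

`Vec::clear` sets the length to zero without releasing memory, so after the first iteration the loop should not need to call the allocator for these buffers again.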
To avoid over-allocating, the capacity could be set to the smaller of the batch size and the total count, so that only space for total-count items is allocated when the total count is less than the batch size.
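The capacity choice can be sketched as follows. The helper name `buffer_capacity` and its parameters are hypothetical and not taken from the repository:

```rust
/// Hypothetical helper: capacity to reserve for a reusable batch buffer.
/// Never reserves more than the total number of items to be ingested.
fn buffer_capacity(batch_size: usize, total_count: usize) -> usize {
    batch_size.min(total_count)
}
```

With a batch size of 1000 and only 10 rows to ingest, `Vec::with_capacity(buffer_capacity(1000, 10))` requests room for 10 items rather than 1000.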
The size of the improvement will depend on the allocator used. Since the default allocator is used (rather than, e.g., mimalloc or jemalloc), I expect the reduced allocations to be noticeable, especially for unnest strategies with small batch sizes.
It might be incorrect to assume that the database alone is responsible for the slightly surprising results observed at small batch sizes.