ggml-alloc: optimize free block shifting with memmove
#17640
+2
−3
Replaced a manual `for` loop responsible for shifting elements in the `chunk->free_blocks` array with a single call to `memmove`. This change leverages an optimized standard library function for block memory operations, which can be significantly more efficient than a manual loop. `memmove` is designed to handle overlapping memory regions correctly, which this shift requires because the source and destination ranges overlap. It is often implemented using highly optimized assembly instructions (such as SIMD) or compiler intrinsics, which can improve performance during free block insertion in the memory allocator.

References:
Performance of `memmove` vs. manual loops:
https://stackoverflow.com/questions/11090176/is-it-faster-to-loop-and-copy-or-use-memmove
(A discussion highlighting why `memmove` is generally faster due to compiler and library optimizations.)
Inside `memcpy` and `memmove` Implementations:
https://nullprogram.com/blog/2023/08/17/
(Explores how `memcpy` and `memmove` are often implemented at a low level to achieve high performance.)
C Programming Optimizations - General Techniques:
https://www.geeksforgeeks.org/c-programming-optimizations/
(Discusses various C optimization techniques, including the use of efficient standard library functions.)
Memory Allocation Principles and Optimizations:
https://www.cs.cornell.edu/courses/cs3410/2019sp/schedule/L24-MemoryAlloc.pdf
(Lecture slides providing context on memory allocation strategies and where optimizations like this fit in.)
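
For reviewers, the change described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual `ggml-alloc` source: the `struct chunk` layout, the `MAX_FREE_BLOCKS` value, and the `insert_block` name are assumptions made for the example. Only the `free_blocks` array and the loop-to-`memmove` replacement come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FREE_BLOCKS 8 /* hypothetical capacity, chosen for this sketch */

/* Hypothetical layout: a free block is an (offset, size) pair. */
struct free_block {
    size_t offset;
    size_t size;
};

/* Hypothetical chunk holding free blocks sorted by offset. */
struct chunk {
    struct free_block free_blocks[MAX_FREE_BLOCKS];
    int n_free_blocks;
};

/* Insert a free block, keeping free_blocks sorted by offset. */
static void insert_block(struct chunk * chunk, size_t offset, size_t size) {
    assert(chunk->n_free_blocks < MAX_FREE_BLOCKS);

    /* Find the first entry whose offset is not smaller than the new one. */
    int insert_pos = 0;
    while (insert_pos < chunk->n_free_blocks &&
           chunk->free_blocks[insert_pos].offset < offset) {
        insert_pos++;
    }

    /* Manual shift being replaced:
     *   for (int i = chunk->n_free_blocks; i > insert_pos; i--) {
     *       chunk->free_blocks[i] = chunk->free_blocks[i - 1];
     *   }
     * The ranges overlap, so memmove (not memcpy) is required. */
    memmove(&chunk->free_blocks[insert_pos + 1],
            &chunk->free_blocks[insert_pos],
            (size_t)(chunk->n_free_blocks - insert_pos) * sizeof(struct free_block));

    chunk->free_blocks[insert_pos].offset = offset;
    chunk->free_blocks[insert_pos].size   = size;
    chunk->n_free_blocks++;
}
```

Inserting blocks with offsets 300, 100, and 200 in that order should leave the array sorted as 100, 200, 300. When the new block belongs at the end, the byte count passed to `memmove` is zero and nothing is moved.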
Co-Authored-By: Gemini 2.5 Pro (References and description commit changes)