Move page cache via mbind to prevent cross-NUMA access #13731
+52
−0
Page cache pages are retained in the memory of the node where llama-bench last ran. On multi-node systems, a subsequent llama-bench run bound to a different node therefore incurs a cross-NUMA memory access penalty. This commit introduces a best-effort mbind call that moves the pages to the node where llama-bench is executing, improving NUMA locality. Additionally, the necessary NUMA headers are included and the build is updated to link against the NUMA library.
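The idea can be sketched in a few lines of C. This is a minimal illustration, not the code in this PR: the helper names `current_numa_node` and `move_mapping_to_node` are hypothetical, and the sketch assumes raw Linux syscalls (`SYS_getcpu`, `SYS_mbind`) so that it builds without extra libraries, whereas the PR itself includes the NUMA headers and links against the NUMA library. The fixed 1024-bit node mask is also an assumption made for brevity.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h> /* MPOL_BIND, MPOL_MF_MOVE */

/* Assumption for this sketch: a fixed-size node mask of 16 words. */
#define SKETCH_MASK_WORDS 16

/* Hypothetical helper: NUMA node the calling thread is running on, or -1. */
static int current_numa_node(void) {
    unsigned cpu = 0, node = 0;
    if (syscall(SYS_getcpu, &cpu, &node, NULL) != 0) {
        return -1;
    }
    return (int) node;
}

/*
 * Hypothetical helper: ask the kernel to bind [addr, addr + len) to `node`
 * and to migrate pages that are already resident elsewhere.
 * addr must be page-aligned (as returned by mmap).
 * Returns 0 on success, -1 on failure with errno set. This is best effort:
 * callers are expected to carry on if it fails.
 */
static int move_mapping_to_node(void *addr, size_t len, int node) {
    const unsigned long bits = 8 * sizeof(unsigned long);
    const unsigned long maxnode = SKETCH_MASK_WORDS * bits;
    unsigned long mask[SKETCH_MASK_WORDS] = {0};

    /* The kernel ignores the topmost bit of the mask, hence maxnode - 1. */
    if (addr == NULL || len == 0 || node < 0 ||
        (unsigned long) node >= maxnode - 1) {
        errno = EINVAL;
        return -1;
    }
    mask[(unsigned long) node / bits] |= 1UL << ((unsigned long) node % bits);

    if (syscall(SYS_mbind, addr, (unsigned long) len, MPOL_BIND,
                mask, maxnode, (unsigned long) MPOL_MF_MOVE) != 0) {
        return -1; /* e.g. EPERM in a sandbox, ENOSYS without NUMA support */
    }
    return 0;
}
```

A loader would call `move_mapping_to_node(addr, size, current_numa_node())` right after mapping the model file with mmap and ignore a non-zero return, since the mapping remains usable (only slower) when the pages cannot be moved.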
Experiments (times in seconds, as reported by /usr/bin/time -p):
+ /usr/bin/time -p numactl -N 1 -m 1 $llama-bench -m $models/llama-2-7b-chat.Q8_0.gguf -ngl 0 --prio 0 -b 1 -t 24
build: 35782ae (5014)
real 687.60
user 15653.73
sys 42.67
+ /usr/bin/time -p numactl -N 0 -m 0 $llama-bench -m $models/llama-2-7b-chat.Q8_0.gguf -ngl 0 --prio 0 -b 1 -t 24
build: 35782ae (5014)
real 805.99
user 18187.26
sys 48.93
+ /usr/bin/time -p numactl -N 0 -m 0 $patched-llama-bench -m $models/llama-2-7b-chat.Q8_0.gguf -ngl 0 --prio 0 -b 1 -t 24
build: 35782ae (5014)
real 696.12
user 15735.41
sys 44.08
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>