Description
LocalAI version:
v3.0.0
Environment, CPU architecture, OS, and Version:
Linux 6.5.0-41-generic #41~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC
x86_64 architecture, AMD CPU (16 threads), 64 GB RAM, 50 GB RAM free
Manual FHS-compliant install (not Docker), running as user `localai`
Describe the bug
Setting `mlock: true` in the model configuration does not result in any memory being locked in RAM. No `mlock()` or `mlockall()` syscall is ever invoked. The model remains swappable even though `mlock` is explicitly requested.
To Reproduce
Model config (`llama3-8b-instruct:Q6_K.yaml`):
name: "llama3-8b-instruct:Q6_K"
context_size: 32768
mmap: true
mlock: true
parameters:
model: Meta-Llama-3-8B-Instruct.Q6_K.gguf
n_threads: 16
Systemd service:

```ini
[Service]
User=localai
...
LimitMEMLOCK=infinity
```
The `localai` user has an unlimited memlock limit:

```bash
sudo -u localai bash -c 'ulimit -l'
# → unlimited
```
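For completeness, here is a minimal `getrlimit(2)` sketch (my own helper, not part of LocalAI) that prints the memlock limit as seen from inside a process, in case the systemd/service environment differs from the interactive shell:

```c
/* memlock_limit.c — hypothetical helper, not part of LocalAI.
 * Prints RLIMIT_MEMLOCK as seen by the calling process.
 * Build: gcc -o memlock_limit memlock_limit.c
 * Run:   sudo -u localai ./memlock_limit
 */
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK soft limit: unlimited\n");
    else
        printf("RLIMIT_MEMLOCK soft limit: %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
    return 0;
}
```

It also reports "unlimited" when run as the `localai` user, so the limit is not the problem.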
Started manually with:
```bash
sudo -u localai strace -f -e mlock,mlockall /usr/local/bin/local-ai --models-path /mnt/prg/localai/models --threads 16 --debug
```
Expected behavior
Expected to see an `mlock()` or `mlockall()` syscall when the model is loaded, and a non-zero value under `VmLck:` in `/proc/$(pidof local-ai)/status`.
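For reference, a minimal sketch of the behavior I expected (plain Linux, nothing LocalAI-specific, assumes standard glibc): after a successful `mlockall()`, `VmLck` in `/proc/self/status` becomes non-zero.

```c
/* expected_mlock.c — illustrative only, not LocalAI code.
 * After a successful mlockall(), VmLck in /proc/self/status is non-zero.
 * Build: gcc -o expected_mlock expected_mlock.c
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }
    /* Print the VmLck line from /proc/self/status. */
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    while (f && fgets(line, sizeof line, f)) {
        if (strncmp(line, "VmLck:", 6) == 0)
            fputs(line, stdout);   /* e.g. "VmLck:    1234 kB" */
    }
    if (f) fclose(f);
    return 0;
}
```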
Logs
Debug output confirms `mlock: true` was parsed:

```
LLMConfig:{... MMap:0xc056efd4a9 MMlock:0xc056efd6d1 ...}
```
But `VmLck` remains:

```
VmLck: 0 kB
```
And strace shows no `mlock()` or `mlockall()` calls were made.
Additional context
- Running `mlock()` manually as `localai` works (verified with a small C test; see the sketch after this list)
- Happens regardless of `mmap: true` or `mmap: false`
- Behavior confirmed for the quantized llama3-8b Q6_K model
- Possibly a regression or an unimplemented backend feature?
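The manual check was along these lines (an illustrative sketch, not the exact program; buffer size and names are arbitrary):

```c
/* mlock_test.c — illustrative sketch of the manual mlock() check.
 * Allocates a buffer, locks it, and reports success.
 * Build: gcc -o mlock_test mlock_test.c
 * Run:   sudo -u localai ./mlock_test
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 64 * 1024 * 1024;          /* 64 MiB test buffer */
    void *buf = malloc(len);
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    if (mlock(buf, len) != 0) {             /* would fail if RLIMIT_MEMLOCK were too low */
        perror("mlock");
        return 1;
    }
    printf("mlock of %zu bytes succeeded\n", len);
    munlock(buf, len);
    free(buf);
    return 0;
}
```

This succeeds as the `localai` user, so locking memory is permitted in this environment; the flag just never seems to reach the backend.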