Description
LocalAI version:
localai/localai:master-aio-cpu (Docker image, pulled on: 2026-02-26)
Environment, CPU architecture, OS, and Version:
CPU info:
model name : AMD Ryzen 5 7530U with Radeon Graphics
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
CPU: AVX found OK
CPU: AVX2 found OK
CPU: no AVX512 found
- Hardware: x86_64 CPU (8 cores / 16GB RAM), no GPU
- Operating System: Windows 11
- Docker version 29.2.1, build a5c7197
Describe the bug
When attempting to enable the reranking feature, calling the /v1/rerank API returns an RPC error: rpc error: code = Unimplemented desc = This server does not support reranking. Start it with --reranking and without --embedding.
However:
- The latest localai/localai:latest image is used, which has removed the --reranking startup flag; adding this flag results in an "unknown flag --reranking" error.
- The reranking model has been configured with known_usecases: [rerank] as per the official documentation, and the model file and configuration file have correct paths and naming.
- Automatic model download was disabled at startup, but the error still occurs and the reranking API remains unavailable.
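For reference, the failing call can be sketched in Python as follows. This is a minimal sketch, not the exact client used in this report: the base URL `http://localhost:8080`, the helper names `build_rerank_request` and `send`, and the Jina-style payload fields (`model`, `query`, `documents`, `top_n`) are assumptions for illustration.

```python
import json
import urllib.error
import urllib.request

# Assumption: LocalAI is reachable on localhost:8080 (not stated in this report).
BASE_URL = "http://localhost:8080"


def build_rerank_request(model, query, documents, top_n=None, base_url=BASE_URL):
    """Build (without sending) a POST request for the /v1/rerank endpoint.

    The payload shape is an assumed Jina-style rerank body.
    """
    payload = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    return urllib.request.Request(
        url=f"{base_url}/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return (status, body); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # An error response (such as the Unimplemented RPC error) arrives here.
        return err.code, err.read().decode("utf-8")


# Example (requires a running server, so it is left commented out):
# status, body = send(build_rerank_request(
#     "bge-reranker-v2-m3-Q4_K_M.gguf",
#     "What is a reranker?",
#     ["A reranker scores documents against a query.", "Unrelated text."],
#     top_n=1,
# ))
# print(status, body)
```

With the configuration described in this report, a call like the commented example would be expected to return the Unimplemented error text in `body` rather than a ranked list.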
To Reproduce
Steps to reproduce the behavior:
- Prepare the reranking model file: place bge-reranker-v2-m3-Q4_K_M.gguf in the ./models directory.
- Create the model configuration file ./models/bge-reranker-v2-m3-Q4_K_M.gguf.yaml with the following content:
backend: llama-cpp
description: Imported from file:///models/bge-reranker-v2-m3-Q4_K_M.gguf
function:
  grammar:
    disable: true
known_usecases:
- rerank
name: bge-reranker-v2-m3-Q4_K_M.gguf
options:
- use_jinja:true
parameters:
  model: bge-reranker-v2-m3-Q4_K_M.gguf
template:
  use_tokenizer_template: true
embeddings: false