Closed
Description
What happened?
Following up on the discussion in #8420: the documentation for the --embeddings flag in server.cpp is confusing, if not incorrect.
The original usage of the --embedding flag may have changed over time, but the documentation does not reflect that.
I therefore propose changing the documentation to the following:
--embedding(s) restrict to only support embedding use case (default: %s). Use only with dedicated embedding models.
Note that llama-server already refuses requests to non-embedding endpoints when the above flag is turned on:
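As a rough illustration of the behavior the proposed wording describes, here is a toy model in Python. It is only a sketch, not llama-server's actual code: the function name is hypothetical, the endpoint paths are assumptions chosen for illustration, and it models only the case where the flag is turned on.

```python
# Toy model of an embedding-only server; NOT llama-server's real implementation.
# Assumption: these are the paths treated as embedding endpoints.
EMBEDDING_ENDPOINTS = {"/embedding", "/v1/embeddings"}


def is_served_in_embedding_mode(path: str) -> bool:
    """Return True if `path` would be served when the embedding flag is on.

    Hypothetical helper: in this model, an embedding-only server
    answers embedding endpoints and refuses everything else.
    """
    return path in EMBEDDING_ENDPOINTS


if __name__ == "__main__":
    # "/completion" is used here as an example of a non-embedding endpoint.
    for path in ("/embedding", "/completion"):
        verdict = "served" if is_served_in_embedding_mode(path) else "refused"
        print(f"{path}: {verdict}")
```

Under these assumptions, the flag acts as a restriction rather than as a switch that adds embedding support, which is why the proposed help text says "restrict".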
Name and Version
./llama-cli --version
version: 3488 (75af08c)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.5.0
What operating system are you seeing the problem on?
Linux
Relevant log output
No response