server : better security control for public deployments #9776

Merged 6 commits on Oct 8, 2024

ngxson
Collaborator

@ngxson ngxson commented Oct 7, 2024

Server REST API breaking changes

  • The /slots endpoint is now disabled by default; start the server with --slots to enable it
  • If an API key is set, all endpoints (including /slots and /props) require a correct API key to access.
    Note: Only /health and /models are always publicly accessible
  • Setting "system_prompt" through the /completions endpoint has been removed. It is now done through POST /props (see documentation)

Please note that GET /props is always enabled to avoid breaking the web UI.
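
The access rules above can be sketched as a small decision function. This is only an illustration of the policy as described in this PR text, not the server's actual implementation: the function name `check_access`, its parameters, the returned strings, and the order of the checks (API key first, then the disabled-endpoint check) are all assumptions.

```python
import hmac
from typing import Optional

# Endpoints the PR description says are always publicly accessible.
PUBLIC_PATHS = frozenset({"/health", "/models"})


def check_access(
    path: str,
    configured_key: Optional[str] = None,  # hypothetical: key the server was started with
    provided_key: Optional[str] = None,    # hypothetical: key sent by the client
    slots_enabled: bool = False,           # mirrors starting the server with --slots
) -> str:
    """Return "allowed", "unauthorized", or "disabled" for a request path."""
    if path in PUBLIC_PATHS:
        return "allowed"
    # If an API key is set, every non-public endpoint needs the correct key.
    if configured_key is not None:
        supplied = (provided_key or "").encode()
        # Constant-time comparison to avoid leaking key contents via timing.
        if not hmac.compare_digest(supplied, configured_key.encode()):
            return "unauthorized"
    # /slots is off unless explicitly enabled (assumed to be checked after the key).
    if path == "/slots" and not slots_enabled:
        return "disabled"
    return "allowed"
```

For example, with a key configured, `check_access("/props", configured_key="k")` is rejected as unauthorized, while `check_access("/health", configured_key="k")` is still allowed.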

Why?

Initially, many of the server's functionalities were exposed by default to make development and testing easier. The server was meant to be used locally or in a private network (such as Docker).

However, as more and more people want to use the llama.cpp server in a production environment, leaving everything exposed by default is quite risky, especially when most users only know and use the OAI-compat endpoints.

The solution in this PR is to disable "advanced" functionalities by default and let users choose to enable them when they want. For example, /slots is now disabled by default; users must explicitly add the --slots argument to enable it.
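
As a rough sketch of what this opt-in looks like for a deployment script, the helper below builds a server command line where /slots stays off unless requested. The helper name `build_server_args` is hypothetical, and the `llama-server` binary name and the `-m` and `--api-key` options are assumptions not stated in this PR description; only `--slots` comes from the text above.

```python
from typing import List, Optional


def build_server_args(
    model_path: str,
    api_key: Optional[str] = None,
    enable_slots: bool = False,
) -> List[str]:
    """Build an argument list for launching the server (hypothetical helper)."""
    # Assumed binary name and model flag; adjust to the local build.
    args = ["llama-server", "-m", model_path]
    if api_key is not None:
        # Assumed flag for setting the API key.
        args += ["--api-key", api_key]
    if enable_slots:
        # From the PR text: /slots is only served when --slots is passed.
        args.append("--slots")
    return args
```

Keeping `enable_slots` false by default mirrors the server's new behavior: a public deployment only exposes /slots when the operator asks for it explicitly.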


@github-actions github-actions bot added the python python script changes label Oct 8, 2024
@ngxson ngxson merged commit 458367a into ggerganov:master Oct 8, 2024
54 checks passed
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* server : more explicit endpoint access settings

* protect /props endpoint

* fix tests

* update server docs

* fix typo

* fix tests
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
Labels
examples python python script changes server