Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: allow filtering llama server response fields #10940

Merged
merged 7 commits into from
Dec 24, 2024

Conversation

nvrxq
Copy link
Contributor

@nvrxq nvrxq commented Dec 21, 2024

This adds a new feature. Allow Filtering LLama Server Response Fields from #10819.
Usage:
Curl:

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128, "response_fields": ["content", "generation_settings/n_predict"]}'

Response:

{"content":" 1. Choose a domain name...","generation_settings/n_predict":128}
Retrievals from multiple nested lists are also supported, example:
Given the original data: {"key1" : {"key2": 3}}
With response_fields = ["key1/key2"]
Then, the final response will be: {"key1/key2": 3}.

If response does not contain a path, we will return empty json for path.

Copy link
Collaborator

@ngxson ngxson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also include a guide in the /completion endpoint section of server's documentation.

Adding test case is optional, but is highly recommended. See an example in examples/server/tests/unit/test_completion.py

examples/server/utils.hpp Outdated Show resolved Hide resolved
examples/server/utils.hpp Outdated Show resolved Hide resolved
@github-actions github-actions bot added the python python script changes label Dec 22, 2024
examples/server/README.md Outdated Show resolved Hide resolved
@ngxson ngxson merged commit 09fe2e7 into ggerganov:master Dec 24, 2024
50 checks passed
@isaac-mcfadyen
Copy link
Contributor

isaac-mcfadyen commented Dec 24, 2024

When filtering nested response fields, is this new option un-nesting them?

In the example provided:

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128, "response_fields": ["content", "generation_settings/n_predict"]}'
{"content":" 1. Choose a domain name...","generation_settings/n_predict":128}

It seems like it's changing the previous format of {"generation_settings": {"n_predict": 128}} (nested) to {"generation_settings/n_predict": 128} (un-nested, with the key having the slash instead).

Is this intended or am I misunderstanding somewhere?

If it's intended it should probably be mentioned in the docs for the server, since it's a bit confusing behavior for a "filter" to actually change the response format IMO.

@nvrxq
Copy link
Contributor Author

nvrxq commented Dec 27, 2024

I've thought about it, but again haven't come to any sort of consensus.
If the user in the conditional code will call response_fields, then in the script where we return {“generation_settings”: {“n_predict”: 128}} he would first have to do a split on response_fields (“/”), and then fetch the values.

@ngxson
Copy link
Collaborator

ngxson commented Dec 27, 2024

I don't have any preference either.

In a logical sense, if we only need pick some fields inside generation_settings, then you need to modify program's code anyway. So it's trivial to change your code to read generation_settings/n_predict instead of generation_settings["n_predict"]

Also I'm not sure if many users gonna use this functionality, so probably not urgent to change anything right now. /completion is non-OAI-compat, not everything is perfectly optimized. Let's wait a bit more and see if it's worth changing.

@isaac-mcfadyen
Copy link
Contributor

isaac-mcfadyen commented Dec 27, 2024

Let's wait a bit more and see if it's worth changing

Alright, sounds good. I might create a PR just to add a quick docs update in case anyone runs into this when using this field if that works.

EDIT: Opened #10995

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples python python script changes server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants