Server: fix server hangs on empty prompt #5733


Merged · 1 commit · Feb 26, 2024

Conversation

@ngxson (Collaborator) commented Feb 26, 2024

This is a proposal to fix #5724 and #5246

Running a slot with no tokens to evaluate is buggy, since some parts of the code implicitly expect n_tokens to be greater than 0.

This PR fixes the hang when using:

  • /v1/embeddings with "input": ""
  • /embedding with "content": ""
  • /completion with "prompt": ""
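As a rough illustration of the failure mode (a hypothetical Python sketch, not the actual llama.cpp C++ server code), the bug class is code that implicitly assumes a non-empty token list; the fix adds an explicit early return for the empty case:

```python
def process_slot(tokens):
    """Hypothetical sketch of a slot-processing step.

    Several code paths implicitly assumed n_tokens > 0; with an empty
    prompt the slot never produced a result, hanging the server.
    """
    n_tokens = len(tokens)
    if n_tokens == 0:
        # Guard in the spirit of this fix: short-circuit with an
        # empty result instead of falling through into code that
        # indexes the last token.
        return []
    # ... evaluate the batch; output comes from the last position ...
    return [tokens[n_tokens - 1]]
```

With the guard, an empty prompt yields an empty result immediately instead of leaving the slot stuck.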

@ngxson ngxson requested review from ggerganov and phymbert February 26, 2024 14:38
@phymbert (Collaborator) left a comment

Can we hardcode a default embeddings answer, as OpenAI does? Do you think it deserves a small test scenario?

@ngxson (Collaborator, Author) commented Feb 26, 2024

I don't think we should hard-code the result, since it's a vector rather than a fixed text value (so we may see different floating-point inaccuracies on different hardware).

But what we can do for the test is:

  • Test whether embedding works with empty input { "input": "" }, only to check that it outputs a vector at all. We don't care what's inside the vector.
  • Get embeddings for a one-space input { "input": " " } and a two-space input { "input": "  " }, then compare the euclidean distance of the two vectors. The check can be hard-coded as vector1 != vector2 and distance(vector1, vector2) < THRESHOLD, where THRESHOLD is a hard-coded constant. THRESHOLD can be a bit larger than it needs to be, to compensate for floating-point inaccuracy across hardware.
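The distance check suggested above could be sketched like this (the function names and the THRESHOLD value are hypothetical, not taken from the actual test suite):

```python
import math

def euclidean_distance(v1, v2):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

# Hypothetical threshold: deliberately larger than strictly needed,
# to absorb floating-point differences across hardware.
THRESHOLD = 0.5

def embeddings_close(v1, v2, threshold=THRESHOLD):
    """True when the vectors differ but lie within the threshold."""
    return v1 != v2 and euclidean_distance(v1, v2) < threshold
```

The test would fetch the two embeddings from the server, then assert embeddings_close(vector1, vector2).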

@ngxson ngxson merged commit b11a93d into ggml-org:master Feb 26, 2024
@ibehnam (Contributor) commented Feb 26, 2024

@ngxson @phymbert @ggerganov

Can we also fix the issue where an incorrect grammar crashes the server? Last I checked, there was a boolean check at the top of the server code that validated the input args. I think we could move the grammar validation outside of that and return None or an error if an incorrect grammar is passed.
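One possible shape for that suggestion (a hypothetical sketch; handle_request and parse_grammar are illustrative names, not the server's actual API) is to validate the grammar up front and convert a parse failure into an error response:

```python
def handle_request(params, parse_grammar):
    """Hypothetical handler sketch: validate the grammar before doing
    any work, returning an error payload instead of letting a
    malformed grammar crash the server."""
    grammar = params.get("grammar")
    if grammar is not None:
        try:
            parse_grammar(grammar)  # raises ValueError on bad grammar
        except ValueError as err:
            return {"error": f"invalid grammar: {err}"}
    return {"ok": True}
```

The key point is that grammar parsing happens inside a guarded path, so a bad grammar degrades to an error response rather than a crash.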

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
Successfully merging this pull request may close these issues.

Server gets stuck after invalid request
3 participants