Tools still giving EoF errors on generated JSON #2310
Comments
Hi -- just to provide an update on this: we are seeing reasonably frequent occurrences of this error, and sometimes when the error in the provided stack trace occurs, the server seems to crash briefly, causing other traffic to time out. Thanks again for taking a look!
Also, we have just started seeing this related error. It seems sporadic, even with model temperature near 0: sometimes rerunning gets rid of it, sometimes not. We assume this is because our temperature is 0.001 instead of 0, as 0 causes an input validation error from pydantic.
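A minimal sketch of the workaround described above (the helper name is ours, not part of TGI): since a temperature of exactly 0 fails input validation, clamp it to a small positive value to approximate greedy decoding.

```python
def safe_temperature(requested: float, minimum: float = 0.001) -> float:
    """Clamp temperature to a small positive value, since a value of
    exactly 0 is rejected by the server's input validation."""
    return max(requested, minimum)

print(safe_temperature(0.0))  # → 0.001
print(safe_temperature(0.7))  # → 0.7
```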
Hi @ArjunBhalla98, apologies for the delay in response! I just took a deeper look at the issue and attempted to reproduce the tool failures locally. I believe the issue is mainly that the LLM is generating "valid" text that is not a complete JSON blob. I believe this is happening for three reasons: 1. it's expected that the LLM will fail to generate JSON in rare cases; 2. the prompt supplied may be causing the generation to output text in a format that cannot be parsed; 3. one of the tools is not a valid JSON Schema.

I've tried both requests locally, and removing the specific formatting instructions from the prompt greatly improved the response and the ability to parse tools. Additionally, when debugging the intermediate text, it appeared that the value was often escaped (causing the parsing issue). The second example above also uses an invalid data type:
{
"type": "function",
"function": {
"name": "calculate_standard_deviation",
"description": "Calculates the standard deviation of a list of numbers.",
"parameters": {
"type": "dict",
"properties": {
"numbers": {
"type": "array",
"items": {"type": "number"},
"description": "The list of numbers.",
}
},
"required": ["numbers"],
},
},
}

In order to make debugging easier and improve transparency, I've just opened a PR that will return the generated text with the error message when it cannot be parsed into valid JSON (#2353). This should help provide insight into why a specific request has errored. There is also another PR in the works that should improve visibility into how tools are formatted before they are processed by the model (#2333); this provides insight into how prompts are formatted with tools, which can help with prompt engineering.

Would you kindly try changing your prompt to avoid specific formatting instructions, and ensure that your tools are valid JSON Schemas? Those two PRs should be merged soon, and they should help with debugging in the future.
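This is not TGI's actual validation, but a minimal stdlib-only sketch of the third point: walking a tool's `parameters` schema and flagging `type` values that are not legal JSON Schema types (here `"dict"`, which should be `"object"`).

```python
# Valid primitive type names in JSON Schema; "dict" is not one of them.
VALID_TYPES = {"object", "array", "string", "number", "integer", "boolean", "null"}

def find_invalid_types(schema, path="$"):
    """Recursively collect every "type" value that is not a JSON Schema type."""
    bad = []
    if isinstance(schema, dict):
        t = schema.get("type")
        if isinstance(t, str) and t not in VALID_TYPES:
            bad.append((path, t))
        for key, value in schema.items():
            bad.extend(find_invalid_types(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for i, item in enumerate(schema):
            bad.extend(find_invalid_types(item, f"{path}[{i}]"))
    return bad

tool = {
    "type": "function",
    "function": {
        "name": "calculate_standard_deviation",
        "parameters": {
            "type": "dict",  # invalid: should be "object"
            "properties": {
                "numbers": {"type": "array", "items": {"type": "number"}}
            },
            "required": ["numbers"],
        },
    },
}

# Only the parameters block is a JSON Schema (the outer "type": "function"
# is the tool-call wrapper, not a schema).
print(find_invalid_types(tool["function"]["parameters"]))  # → [('$', 'dict')]
```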
Hi @drbh, no problem at all -- we really appreciate you looking into this! Thanks for the thorough response. To your points:

1/2. Makes sense! We had guessed that this might have been the issue, especially the escape-character bit, as I quadruple-checked that our input payload was valid JSON.

Very interesting find on the prompt, I will try that ASAP! Most of our code-gen prompts have this. Given the nature of the main error, just propagating that intermediate text through with the error that it was illegal JSON would be incredibly helpful. Thanks again!
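The escape-character failure mode mentioned above can be illustrated with a small stdlib sketch (the payload is hypothetical): when the model emits its arguments as a double-encoded JSON string, a single `json.loads` "succeeds" but yields a `str` rather than an object, so a second decode pass is needed.

```python
import json

# Hypothetical double-encoded output: a JSON *string* whose contents are
# themselves escaped JSON, rather than a raw JSON object.
raw = '"{\\"numbers\\": [1.0, 2.0, 3.0]}"'

first_pass = json.loads(raw)
print(type(first_pass).__name__)  # → str  (parsing succeeded, but gave text)

second_pass = json.loads(first_pass)
print(second_pass)  # → {'numbers': [1.0, 2.0, 3.0]}
```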
Hi @ArjunBhalla98, I believe these issues are fully resolved by the recent improvements and bug fixes to grammars and tool calling (#2463, #2454, #2391, etc.). TGI will also return the text that it fails to parse as of PR #2353. Parsing failures are much less likely to happen now, but that should help with debugging if they do. Going to close this issue since these bugs should all be resolved. Thank you!
System Info
Privately hosted instance of TGI
Version: 2.2.0
Deployed as a standalone kserve predictor
Model: Mixtral-8x7b-instruct, also llama3-1-70b-instruct (the same prompts do not fail on both models, but the error types are the same; the errors below are from Mixtral).
GPU: A100
Information
Tasks
Reproduction
Stack trace:
This is the same stack trace as in #2240. This was fairly consistently reproducible, though the stack trace does not always appear in our server logs.
Expected behavior
A valid response -- e.g.,
It is a little cryptic as to why the other responses are failing. We would love to be able to see the output of the model regardless, if possible, as this would provide a better experience for downstream users (it would also be nice for the server not to crash every so often when this occurs). We did some experimentation around `outlines.fsm`, but were unable to reproduce this error at all locally, so we're still not sure exactly what is causing the issue.

Thanks again for helping so quickly with the last issue, we really appreciate it! It definitely solved some of our issues, including the `tool_choice="auto"` one.
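In the meantime, the raw server response can at least be surfaced client-side instead of being swallowed. A hedged sketch -- the error-payload shape (`{"error": ..., "error_type": ...}`) is an assumption about what the server returns, and the helper name is ours:

```python
import json

def describe_failure(status_code: int, body: str) -> str:
    """Turn a failed generation response into a log-friendly message,
    keeping the raw body so any offending model output is not lost."""
    try:
        payload = json.loads(body)
        detail = payload.get("error", body)
    except json.JSONDecodeError:
        detail = body  # body itself was not JSON; keep it verbatim
    return f"generation failed (HTTP {status_code}): {detail}"

msg = describe_failure(
    422, '{"error": "Input validation error", "error_type": "validation"}'
)
print(msg)  # → generation failed (HTTP 422): Input validation error
```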