
patch-error-on-invalid-grammar #2282

Merged
merged 3 commits into from
Jul 29, 2024

Conversation

@ErikKaum (Member) commented Jul 23, 2024

What does this PR do?

Implementation

Wrap `schema = build_regex_from_schema(schema)` in a try/except and return `None` from `_cached_compile_fsm` if grammar compilation fails.

There might be a better place to do the error handling. This solution, in a way, produces two errors, as shown here:

2024-07-23T09:06:27.017813Z  INFO text_generation_router::server: router/src/server.rs:1651: Setting max batch total tokens to 16000
2024-07-23T09:06:27.285053Z  INFO text_generation_router::server: router/src/server.rs:1889: Connected
2024-07-23T09:06:35.377377Z ERROR text_generation_launcher: Error compiling FSM: Could not translate the instance {'abc': 'def'} to a
    regular expression. Make sure it is valid to the JSON Schema specification. If
    it is, please open an issue on the Outlines repository
2024-07-23T09:06:35.489362Z ERROR text_generation_launcher: Method Prefill encountered an error.
Traceback (most recent call last):
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/cli.py", line 118, in serve
    server.serve(
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/server.py", line 297, in serve
    asyncio.run(
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
    return await self.intercept(
> File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/interceptor.py", line 21, in intercept
    return await response
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 120, in _unary_interceptor
    raise error
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 111, in _unary_interceptor
    return await behavior(request_or_iterator, context)
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/server.py", line 149, in Prefill
    generations, next_batch, timings = self.model.generate_token(batch)
  File "/Users/erikkaum/miniconda3/envs/text-generation-inference/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/models/causal_lm.py", line 857, in generate_token
    batch.next_token_choosers[i] = batch.next_token_choosers[i].advance_grammar(
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/utils/tokens.py", line 103, in advance_grammar
    self.fsm_grammar_state = self.grammar_processor.advance(
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/utils/logits_process.py", line 508, in advance
    return GrammarLogitProcessor._advance(
  File "/Users/erikkaum/Documents/text-generation-inference/server/text_generation_server/utils/logits_process.py", line 516, in _advance
    return fsm.next_state(fsm_grammar_state, next_token_id)
AttributeError: 'NoneType' object has no attribute 'next_state'
2024-07-23T09:06:35.489499Z ERROR batch{batch_size=1}:prefill:prefill{id=0 size=1}:prefill{id=0 size=1}: text_generation_client: router/client/src/lib.rs:46: Server error: 'NoneType' object has no attribute 'next_state'
2024-07-23T09:06:35.489926Z ERROR generate{parameters=GenerateParameters { best_of: None, temperature: None, repetition_penalty: None, frequency_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: false, max_new_tokens: Some(100), return_full_text: None, stop: [], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: Some(Json(Object {"abc": String("def")})), adapter_id: None }}:generate:generate_stream:schedule:infer:send_error: text_generation_router::infer::v3::scheduler: router/src/infer/v3/scheduler.rs:493: Request failed during generation: Server error: 'NoneType' object has no attribute 'next_state'

But the error from outlines is clear, the server doesn't crash, and so the job gets done.

This is what the result looks like on the client side:

{"error":"Request failed during generation: Server error: 'NoneType' object has no attribute 'next_state'","error_type":"generation"}

@ErikKaum ErikKaum requested a review from drbh July 23, 2024 09:08
@Narsil (Collaborator) commented Jul 23, 2024

Can you provide a triggering error?

Ideally this is handled on the Rust HTTP server.
The python backend is not allowed to fail because it would fail the whole batch.

Silently ignoring the grammar is the worst solution (though we might have to resort to it, depending on the complexity of making a real check).

@drbh for visibility.

@ErikKaum (Member, Author) commented
Okay, that makes sense 👍 I'll work out a solution where we move the error to the rust server.

@ErikKaum (Member, Author) commented
In summary:

  • the idea was to do some initial validation in the Rust layer so that roughly 80% of incoming JSON grammars would be guaranteed to pass the compilation to regex
  • unfortunately, I haven't found a good way to do this without a) implementing the logic itself in Rust or b) risking the rejection of grammars that actually would have compiled
  • outlines itself doesn't support the full JSON Schema spec: https://github.com/outlines-dev/outlines/blob/d78041e81f355ad9f83174cc8ac879443df95e76/outlines/fsm/json_schema.py#L148-L156, which also makes it gnarly to ensure we keep adhering to their supported subset if they make updates
  • I played with the idea of compiling the Python code to a shared library (through Cython) or a standalone executable. It's doable, but I wouldn't suggest we go for it now. Here is a PoC implementation: https://github.com/ErikKaum/guidance-poc

Suggestion: if the JSON-to-regex compilation fails, allow anything in the grammar (silent failure) and print an error log to stdout.
It's not at all the desired solution, but I think the other options had too many gotchas. And since we already do JSON validation, at least no invalid JSON is sent to the Python layer.
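A minimal sketch of that fallback, assuming a simplified processor (the class name, the injected `compile_fsm` callable, and the method signatures here are illustrative, not TGI's actual `GrammarLogitProcessor`): on compile failure the FSM stays `None` and `advance` becomes a no-op, so generation proceeds unconstrained instead of raising.

```python
class FallbackGrammarProcessor:
    """Illustrative sketch: silently ignore the grammar if FSM compilation fails."""

    def __init__(self, compile_fsm, schema: str):
        try:
            self.fsm = compile_fsm(schema)
        except Exception as e:
            # Silent failure: log to stdout and fall back to unconstrained generation
            print(f"Error compiling FSM, grammar will be ignored: {e}")
            self.fsm = None

    def advance(self, state: int, next_token_id: int) -> int:
        if self.fsm is None:
            # No grammar: any token is accepted and the state never changes,
            # so nothing here can crash the batch.
            return state
        return self.fsm.next_state(state, next_token_id)
```

The key difference from the current patch is that the `None` check lives next to the `next_state` call, so the `AttributeError` from the traceback above cannot occur.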

To reproduce the behavior, run TGI and send this request:

curl localhost:3000/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
        "inputs": "My name is Olivier and I",
        "parameters": {
            "grammar": {"type": "json", "value": {"abc":"def"}}
            }
        }'

@drbh drbh self-assigned this Jul 25, 2024
@ErikKaum (Member, Author) commented Jul 26, 2024

Updating here with some of the offline discussion I've had:

  • PR fix: reject grammars without properties #2309 catches as much as possible in the router/Rust layer
  • this avoids crashing the Python server, even though silent errors are undesirable
  • together, these two PRs form a short-term fix
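The router-side guard in PR #2309 lives in Rust; purely to illustrate the idea (the function name and the exact checks are hypothetical, sketched here in Python), it amounts to rejecting JSON grammars that don't look like an object schema with `properties` before the request ever reaches the Python server:

```python
def validate_json_grammar(value) -> None:
    """Hypothetical router-style pre-check: reject grammars without 'properties'."""
    if not isinstance(value, dict) or "properties" not in value:
        raise ValueError(
            "grammar must be a JSON Schema object with a 'properties' field"
        )


# The failing grammar from this PR is rejected up front, so the router can
# return a client error instead of the batch crashing later...
try:
    validate_json_grammar({"abc": "def"})
except ValueError as e:
    print(f"rejected: {e}")

# ...while a plausible object schema passes through to compilation.
validate_json_grammar({"type": "object", "properties": {"name": {"type": "string"}}})
```

As noted in the bullets above, this kind of check is necessarily approximate: it cannot guarantee that a schema which passes will also compile in outlines, only filter out the obvious failures cheaply.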

@drbh (Collaborator) commented Jul 29, 2024

Approving and merging, although this path should not fire in any case. It is added as a defensive measure, as noted above.

@drbh drbh merged commit 3d7f4f4 into main Jul 29, 2024
9 checks passed
@drbh drbh deleted the patch-invalid-grammar-error branch July 29, 2024 14:09
yuanwu2017 pushed a commit to yuanwu2017/tgi-gaudi that referenced this pull request Sep 26, 2024
* quick fix

* allow silent failure

* explicit todo that this is only short term
Successfully merging this pull request may close these issues.

Server crashes when provided with invalid JSON schema