Tools still giving EoF errors on generated JSON #2310

Closed
@ArjunBhalla98

System Info

Privately hosted instance of TGI
Version: 2.2.0

Deployed as a standalone kserve predictor
Model: Mixtral-8x7b-instruct, also llama3-1-70b-instruct (a given prompt does not necessarily fail on both models, but the error types are the same; the errors below are from Mixtral).
GPU: A100

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

{"model": "mixtral-8x7b-instruct-completion", "messages": [{"role": "user", "content": "Could you track down the latitude and longitude for this IP address I'm concerned about? It's 172.16.254.1. I've been monitoring the network and this one's been popping up with some strange activity. Note that the provided function is in Python. provide a JSON"}], "temperature": 0.1, "max_tokens": 250, "stream": false, "echo": false, "tools": [{"type": "function", "function": {"name": "get_coordinate_by_ip_address", "description": "Finds the latitude and longitude of an IP address.", "parameters": {"type": "object", "properties": {"ip_address": {"type": "string", "description": "The IP address to find the location of."}}, "required": ["ip_address"]}}}], "tool_choice": "auto"}

b'{"error":"Tool error: invalid escape at line 3 column 15","error_type":"tool_error"}'
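
The wording "invalid escape at line 3 column 15" reads like a JSON parser rejecting a backslash sequence. A minimal illustration of that class of failure follows, using Python's standard `json` module. That the server hits this while parsing the generated tool call is an assumption, and the `generated` string is hypothetical, not the model's actual output.

```python
import json

# Hypothetical generated text (an assumption, not captured from the server):
# "\." is not one of the escape sequences that JSON permits inside a string.
generated = r'{"ip_address": "172\.16\.254\.1"}'

try:
    json.loads(generated)
    parse_error = None
except json.JSONDecodeError as exc:
    # The parser reports the position of the offending backslash.
    parse_error = exc

print(parse_error)

# The same payload without the stray backslashes parses cleanly.
clean = json.loads('{"ip_address": "172.16.254.1"}')
print(clean["ip_address"])
```

If the grammar guiding generation allows a backslash followed by an arbitrary character, output of this shape could be produced and then fail to parse, but this has not been verified against the server.
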
{'model': 'mixtral-8x7b-instruct', 'messages': [{'role': 'user', 'content': "To better understand the volatility and risk associated with this particular stock, I need to calculate the standard deviation of its daily closing prices over the past 10 trading days. Here are the figures I've gathered: 1000, 2000, 3000, 4000, 5000, 7000, 9000, 15000, 20000, and 30000. Can you provide me with the standard deviation for these closing prices?\n Note that the provided function is in Python. provide a JSON"}], 'tools': [{'type': 'function', 'function': {'name': 'calculate_standard_deviation', 'description': 'Calculates the standard deviation of a list of numbers.', 'parameters': {'type': 'dict', 'properties': {'numbers': {'type': 'array', 'items': {'type': 'float'}, 'description': 'The list of numbers.'}}, 'required': ['numbers']}}}], 'tool_choice': 'auto', 'temperature': 0.7, 'top_p': 0.99, 'max_tokens': 1200}
RESPONSE
b'{"error":"Request failed during generation: Server error: CANCELLED","error_type":"generation"}'
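
One detail in the second request: its parameter schema uses 'dict' and 'float', which are Python type names rather than JSON Schema types (the first request uses 'object'). Whether this contributes to the failure is an assumption, and the first request fails even with valid type names. The sketch below, with a hypothetical helper `find_invalid_types`, flags such names before a request is sent.

```python
# Type names defined by JSON Schema. Python names such as "dict" or
# "float" are not among them.
JSON_SCHEMA_TYPES = {"object", "array", "string", "number", "integer", "boolean", "null"}


def find_invalid_types(schema, path="parameters"):
    """Return (path, type_name) pairs for "type" values that are not JSON Schema types.

    Hypothetical helper for illustration; it only walks "properties" and "items".
    """
    problems = []
    if not isinstance(schema, dict):
        return problems
    declared = schema.get("type")
    names = declared if isinstance(declared, list) else [declared]
    for name in names:
        if name is None:
            continue
        if not isinstance(name, str) or name not in JSON_SCHEMA_TYPES:
            problems.append((path, name))
    properties = schema.get("properties", {})
    if isinstance(properties, dict):
        for key, sub in properties.items():
            problems.extend(find_invalid_types(sub, f"{path}.properties.{key}"))
    if "items" in schema:
        problems.extend(find_invalid_types(schema["items"], f"{path}.items"))
    return problems


# Parameter schema copied from the second request above.
std_dev_params = {
    "type": "dict",
    "properties": {
        "numbers": {
            "type": "array",
            "items": {"type": "float"},
            "description": "The list of numbers.",
        }
    },
    "required": ["numbers"],
}

# Parameter schema copied from the first request above.
ip_params = {
    "type": "object",
    "properties": {
        "ip_address": {
            "type": "string",
            "description": "The IP address to find the location of.",
        }
    },
    "required": ["ip_address"],
}

std_dev_issues = find_invalid_types(std_dev_params)
ip_issues = find_invalid_types(ip_params)
print(std_dev_issues)
print(ip_issues)
```
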

Stack trace:

ERROR text_generation_launcher: Method Prefill encountered an error.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/interegular/patterns.py", line 486, in parse
    return super(_ParsePattern, self).parse()
  File "/opt/conda/lib/python3.10/site-packages/interegular/utils/simple_parser.py", line 63, in parse
    raise NoMatch(self.data, max(self._expected), self._expected[max(self._expected)])
interegular.utils.simple_parser.NoMatch: Can not match at index 858. Got '))?[\\', expected any of ['*', '+', '?', '{', '*', '+', '?', '{', '(', '[', '\\', '.', '$', '^', "<Any 1 except ('.', '?', '\\\\', '(', ')', '|', '*', '[', '^', '$', '+')>", '|'].
Context(data[-10:+10]): '*"[\\n ]*\\}))?[\\n ]*\\'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 106, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/opt/conda/lib/python3.10/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
    return await self.intercept(
> File "/opt/conda/lib/python3.10/site-packages/text_generation_server/interceptor.py", line 21, in intercept
    return await response
  File "/opt/conda/lib/python3.10/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 120, in _unary_interceptor
    raise error
  File "/opt/conda/lib/python3.10/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 111, in _unary_interceptor
    return await behavior(request_or_iterator, context)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 145, in Prefill
    batch = self.model.batch_type.from_pb(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_causal_lm.py", line 442, in from_pb
    return cls.from_tokenized(pb, tokenizer, batch_tokenized_inputs, dtype, device)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_causal_lm.py", line 322, in from_tokenized
    next_token_chooser = HeterogeneousNextTokenChooser.from_pb(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/tokens.py", line 486, in from_pb
    return HeterogeneousNextTokenChooser(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/tokens.py", line 284, in __init__
    HeterogeneousGrammarLogitProcessor(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/logits_process.py", line 570, in __init__
    fsm = GrammarLogitProcessor._cached_compile_fsm(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/logits_process.py", line 527, in _cached_compile_fsm
    fsm = RegexFSM(schema, tokenizer)
  File "/opt/conda/lib/python3.10/site-packages/outlines/fsm/fsm.py", line 121, in __init__
    self.states_to_token_maps, self.empty_token_ids = create_states_mapping(
  File "/opt/conda/lib/python3.10/site-packages/outlines/caching.py", line 74, in wrapper
    result = cached_function(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/outlines/fsm/fsm.py", line 102, in create_states_mapping
    regex_pattern = interegular.parse_pattern(regex_string)
  File "/opt/conda/lib/python3.10/site-packages/interegular/patterns.py", line 730, in parse_pattern
    out = p.parse()
  File "/opt/conda/lib/python3.10/site-packages/interegular/utils/simple_parser.py", line 38, in w
    return m(self, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/interegular/patterns.py", line 488, in parse
    raise InvalidSyntax
interegular.patterns.InvalidSyntax

This is the same stack trace as in #2240. The failure reproduces fairly consistently, though this stack trace does not always appear in our server logs.

Expected behavior

A valid response, as in the following request/response pair:

{"model": "mixtral-8x7b-instruct-completion", "messages": [{"role": "user", "content": "I'm working on a report about a basketball player's average performance throughout the season. The data I have includes the points they scored in each game: 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160. To complete my analysis, I need to calculate the mean score per game. Can you help me with that? Please return your answer as a JSON"}], "temperature": 0.1, "max_tokens": 250, "stream": false, "echo": false, "tools": [{"type": "function", "function": {"name": "calculate_mean", "description": "Calculates the mean of a list of numbers.", "parameters": {"type": "object", "properties": {"numbers": {"type": "array", "items": {"type": "number"}, "description": "The list of numbers."}}, "required": ["numbers"]}}}], "tool_choice": "auto"}

b'{"object":"chat.completion","id":"","created":1721938095,"model":"/mnt/models","system_fingerprint":"2.2.0-sha-db7e043","choices":[{"index":0,"message":{"role":"assistant","tool_calls":[{"id":"0","type":"function","function":{"description":null,"name":"mean","arguments":{"numbers":[15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160]}}}]},"logprobs":null,"finish_reason":"eos_token"}],"usage":{"prompt_tokens":202,"completion_tokens":157,"total_tokens":359}}'
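
The payload shapes shown in this issue (an error object with "error" and "error_type", or a chat completion with "tool_calls") can be told apart on the client side. The following is a minimal sketch, assuming only those shapes; `classify_response` is a hypothetical helper, and the success payload is abbreviated from the one above.

```python
import json


def classify_response(raw: bytes) -> dict:
    """Sort a response body into an error, a tool call, or plain text."""
    body = json.loads(raw)
    if "error" in body:
        return {
            "kind": "error",
            "error_type": body.get("error_type"),
            "message": body["error"],
        }
    message = body["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    if calls:
        function = calls[0]["function"]
        return {
            "kind": "tool_call",
            "name": function["name"],
            "arguments": function["arguments"],
        }
    return {"kind": "text", "content": message.get("content")}


# Error payload copied from the reproduction above.
tool_error = classify_response(
    b'{"error":"Tool error: invalid escape at line 3 column 15","error_type":"tool_error"}'
)

# Abbreviated copy of the valid response above (argument list shortened).
success = classify_response(
    b'{"object":"chat.completion","choices":[{"index":0,"message":{"role":"assistant",'
    b'"tool_calls":[{"id":"0","type":"function","function":{"description":null,'
    b'"name":"mean","arguments":{"numbers":[15,20,25]}}}]},"finish_reason":"eos_token"}]}'
)

print(tool_error)
print(success)
```
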

It is unclear why the other requests fail. If possible, we would love to see the model's output even when the tool call fails, as this would provide a better experience for downstream users. It would also be nice if the server did not crash every so often when this occurs. We did some experimentation and found that:

  • Changing the temperature or the model changed which payloads produced errors, and how many (sometimes fewer, sometimes more). Although the stack trace says "Method Prefill encountered an error.", this may suggest that the problem lies in the generated text.
  • We dug through the TGI codebase and found the call to outlines.fsm, but were unable to reproduce this error locally, so we are still not sure what exactly is causing the issue.
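
Until the raw output is exposed, one client-side workaround is to retry a failed tool request without the tool fields and inspect the plain completion. The sketch below assumes a hypothetical `send` callable that posts a payload and returns the parsed JSON body; neither it nor `chat_with_raw_fallback` is part of TGI. The retry is a separate, unconstrained generation, so it is not the same text that failed to parse.

```python
import copy


def chat_with_raw_fallback(payload, send):
    """Send a tool request; on a tool_error, retry once without tools.

    `send` is a hypothetical transport callable: dict payload in, dict body out.
    """
    first = send(payload)
    if first.get("error_type") != "tool_error":
        return {"tool_response": first, "raw_response": None}
    # Copy the payload minus the tool fields so the caller's dict is untouched.
    retry = {
        key: copy.deepcopy(value)
        for key, value in payload.items()
        if key not in ("tools", "tool_choice")
    }
    return {"tool_response": first, "raw_response": send(retry)}


def fake_send(payload):
    """Stand-in for the real server, for demonstration only."""
    if "tools" in payload:
        return {
            "error": "Tool error: invalid escape at line 3 column 15",
            "error_type": "tool_error",
        }
    return {"choices": [{"message": {"role": "assistant", "content": "(model text)"}}]}


request = {
    "model": "mixtral-8x7b-instruct-completion",
    "messages": [{"role": "user", "content": "provide a JSON"}],
    "tools": [{"type": "function", "function": {"name": "get_coordinate_by_ip_address"}}],
    "tool_choice": "auto",
}

result = chat_with_raw_fallback(request, fake_send)
print(result["raw_response"]["choices"][0]["message"]["content"])
```
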

Thanks again for helping so quickly with the last issue, we really appreciate it! It definitely solved some of our issues, as well as the 'tool_choice="auto"' one.
