
[Model] tool calling support for ibm-granite/granite-20b-functioncalling #8339

Merged 35 commits into vllm-project:main from wseaton:granite-fc on Oct 29, 2024

Changes from 18 commits

Commits
58e468d
initial commit
wseaton Sep 10, 2024
d4cc66b
remove original part of template
wseaton Sep 10, 2024
742704f
clean up debug logging
wseaton Sep 10, 2024
410ff88
update docs; raise not implemented
wseaton Sep 10, 2024
a5e9a1f
fix lints
wseaton Sep 10, 2024
3d28b6d
sort imports
wseaton Sep 10, 2024
74c8cc7
yapf fixes
wseaton Sep 10, 2024
23a4ca3
another format change
wseaton Sep 10, 2024
1659236
update example prompt to be conversational instead of single turn
wseaton Sep 10, 2024
b1e09a8
update docs for template; link paper
wseaton Sep 10, 2024
e82b2a6
Merge remote-tracking branch 'upstream/main' into granite-fc
wseaton Sep 27, 2024
6b0eebb
add granite to test config
wseaton Sep 27, 2024
346d554
fixup json
wseaton Sep 27, 2024
24e49b8
Add stream support for Granite 20b Tool Use
maxdebayser Sep 27, 2024
86dead8
fix docs
maxdebayser Sep 27, 2024
113fbb6
more robust whitespace handling
maxdebayser Sep 28, 2024
acecb6d
remove reference to defunct granite parser
wseaton Oct 2, 2024
86e8466
remove old template
wseaton Oct 2, 2024
43c8078
Update tests/tool_use/utils.py to remove dupe
wseaton Oct 7, 2024
6bf4a41
Merge remote-tracking branch 'upstream/main' into granite-fc
wseaton Oct 7, 2024
2e969c7
fix double import
wseaton Oct 7, 2024
e18219c
add completion request arg to abstract method
wseaton Oct 7, 2024
9a0321b
formatting fixes
wseaton Oct 7, 2024
078ab85
import sorts
wseaton Oct 7, 2024
0a031bf
appease yapf
wseaton Oct 7, 2024
c6a6b56
Apply suggestions from code review
wseaton Oct 16, 2024
defed52
remove redundant indents; add type hints to utils
wseaton Oct 17, 2024
5b78cea
Merge branch 'granite-fc' of github.com:wseaton/vllm into granite-fc
wseaton Oct 17, 2024
2d3b8fe
formatting churn
wseaton Oct 17, 2024
fe13b72
Merge branch 'main' into granite-fc
wseaton Oct 21, 2024
84e93bf
change to old style type aliasing
wseaton Oct 21, 2024
ae55760
Merge branch 'granite-fc' of github.com:wseaton/vllm into granite-fc
wseaton Oct 21, 2024
1277f0b
Doc reformat, add back missing line
wseaton Oct 25, 2024
738d003
Temporarily disable the granite20b-fc test task
wseaton Oct 25, 2024
a6e1bf9
Merge branch 'vllm-project:main' into granite-fc
wseaton Oct 28, 2024
15 changes: 12 additions & 3 deletions docs/source/serving/openai_compatible_server.md
@@ -157,7 +157,7 @@ vLLM will use guided decoding to ensure the response matches the tool parameter…
To enable this feature, you should set the following flags:
* `--enable-auto-tool-choice` -- **mandatory** for auto tool choice; tells vLLM that you want to enable the model to generate its own tool calls
when it deems appropriate.
-* `--tool-call-parser` -- select the tool parser to use - currently either `hermes`, `mistral` or `llama3_json`. Additional tool parsers
+* `--tool-call-parser` -- select the tool parser to use - currently either `hermes`, `mistral`, `llama3_json` or `granite-20b-fc`. Additional tool parsers
will continue to be added in the future.
* `--chat-template` -- **optional** for auto tool choice; the path to the chat template which handles `tool`-role messages and `assistant`-role messages
that contain previously generated tool calls. Hermes, Mistral and Llama models have tool-compatible chat templates in their
@@ -167,7 +167,9 @@ from HuggingFace; and you can find an example of this in a `tokenizer_config.json`

If your favorite tool-calling model is not supported, please feel free to contribute a parser & tool use chat template!
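To put the flags together, here is a minimal launch sketch. It assumes the `vllm serve` console entry point is available and uses the Granite model and template documented further down; any supported parser/template pair works the same way:

```python
# Minimal sketch: start the OpenAI-compatible server with auto tool choice.
# The flags are the ones documented above; the model and chat template are
# illustrative choices, not requirements.
import subprocess

subprocess.run([
    "vllm", "serve", "ibm-granite/granite-20b-functioncalling",
    "--enable-auto-tool-choice",
    "--tool-call-parser", "granite-20b-fc",
    "--chat-template", "examples/tool_chat_template_granite_20b_fc.jinja",
])
```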

-#### Hermes Models
+### Supported Models
+
+#### Hermes
All Nous Research Hermes-series models newer than Hermes 2 Pro should be supported.
* `NousResearch/Hermes-2-Pro-*`
* `NousResearch/Hermes-2-Theta-*`
@@ -179,7 +181,7 @@ step in their creation_.

Flags: `--tool-call-parser hermes`

-#### Mistral Models
+#### Mistral
Supported models:
* `mistralai/Mistral-7B-Instruct-v0.3` (confirmed)
* Additional mistral function-calling models are compatible as well.
@@ -198,6 +200,7 @@ when tools are provided, that results in much better reliability when working wi…

Recommended flags: `--tool-call-parser mistral --chat-template examples/tool_chat_template_mistral_parallel.jinja`


#### Llama Models
Supported models:
* `meta-llama/Meta-Llama-3.1-8B-Instruct`
@@ -218,4 +221,10 @@ it works better with vLLM.

Recommended flags: `--tool-call-parser llama3_json --chat-template examples/tool_chat_template_llama3_json.jinja`

#### IBM Granite

Supported models:
* `ibm-granite/granite-20b-functioncalling`

Flags: `--tool-call-parser granite-20b-fc`
`examples/tool_chat_template_granite_20b_fc.jinja`: a modified version of the original Huggingface chat template, which is not vLLM-compatible as published. It blends function-description elements from the Hermes template and follows the same system prompt as the "Response Generation" mode from [the paper](https://arxiv.org/abs/2407.00121). Parallel function calls are supported.
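A hedged end-to-end sketch of calling a server started with the flags above follows; the server URL, API key, and weather tool are illustrative assumptions, not part of this PR. The template serializes each assistant tool call as `<function_call> {"name": ..., "arguments": ...}`, and the `granite-20b-fc` parser converts those back into OpenAI-style `tool_calls`:

```python
# Minimal client sketch (assumptions: server at localhost:8000 started with
# --enable-auto-tool-choice --tool-call-parser granite-20b-fc and the granite
# chat template; the weather tool is purely illustrative).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="ibm-granite/granite-20b-functioncalling",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
    tool_choice="auto",
)

# With auto tool choice, any generated call surfaces here as tool_calls.
print(resp.choices[0].message.tool_calls)
```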
130 changes: 130 additions & 0 deletions examples/tool_chat_template_granite_20b_fc.jinja
@@ -0,0 +1,130 @@
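{#- json_to_python_type: map a JSON Schema type spec to a Python type annotation string for the function signatures shown to the model. -#}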
{%- macro json_to_python_type(json_spec) %}
{%- set basic_type_map = {
"string": "str",
"number": "float",
"integer": "int",
"boolean": "bool"
} %}

{%- if basic_type_map[json_spec.type] is defined %}
{{- basic_type_map[json_spec.type] }}
{%- elif json_spec.type == "array" %}
{{- "list[" + json_to_python_type(json_spec|items) + "]" }}
{%- elif json_spec.type == "object" %}
{%- if json_spec.additionalProperties is defined %}
{{- "dict[str, " + json_to_python_type(json_spec.additionalProperties) + ']' }}
{%- else %}
{{- "dict" }}
{%- endif %}
{%- elif json_spec.type is iterable %}
{{- "Union[" }}
{%- for t in json_spec.type %}
{{- json_to_python_type({"type": t}) }}
{%- if not loop.last %}
{{- "," }}
{%- endif %}
{%- endfor %}
{{- "]" }}
{%- else %}
{{- "Any" }}
{%- endif %}
{%- endmacro %}

{%- if not full_function_description is defined %}
{%- set full_function_description = false %}
{%- endif %}
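{#- full_description renders a Python-style signature and docstring for a tool; simple_description emits only the plain description text. -#}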

{%- macro full_description(tool) %}
{{- tool.name + '(' }}
{%- if tool.parameters is defined %}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{{- param_name + ": " + json_to_python_type(param_fields) }}
{%- if not loop.last %}
{{- ", " }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- ")" }}
{%- if tool.return is defined %}
{{- " -> " + json_to_python_type(tool.return) }}
{%- endif %}
{{- " - " + tool.description + "\n\n" }}
{%- if tool.parameters is defined %}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{%- if loop.first %}
{{- " Args:\n" }}
{%- endif %}
{{- " " + param_name + "(" + json_to_python_type(param_fields) + "): " + param_fields.description|trim }}
{%- endfor %}
{%- endif %}
{%- if tool.return is defined and tool.return.description is defined %}
{{- "\n Returns:\n " + tool.return.description }}
{%- endif %}
{{- '"' }}
{%- endmacro %}

{%- macro simple_description(tool) %}
{{- tool.description + '"' }}
{%- endmacro %}

{%- macro function_description(tool) %}
{%- if full_function_description %}
{{- full_description(tool) }}
{%- else %}
{{- simple_description(tool) }}
{%- endif %}
{%- endmacro %}
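{#- Use the conversation's system prompt if one is given; otherwise fall back to the default "Response Generation" prompt from the Granite paper. -#}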

{%- if messages[0]["role"] == "system" %}
{%- set sys_prompt = messages[0]["content"] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{% set sys_prompt = 'You are a helpful assistant with access to the following function calls. Your task is to understand the given conversation with function calls and responses and generate natural language response as the ASSISTANT to continue the conversation. You may use the following function calls to understand how to respond to the user query.' %}
{%- endif %}

{{ 'SYSTEM: ' + sys_prompt }}
{% if tools is iterable and tools | length > 0 %}
<|function_call_library|>
{%- for tool in tools %}
{%- if tool.function is defined %}
{%- set tool = tool.function %}
{%- endif %}
{{- '{"name": "' + tool.name + '", ' }}
{{- '"description": "' + function_description(tool) }}
{{- ', "parameters": ' }}
{%- if not tool.parameters is defined or tool.parameters.properties | length == 0 %}
{{- "{}" }}
{%- else %}
{{- tool.parameters|tojson }}
{%- endif %}
{{- "}" }}
{%- if not loop.last %}
{{- "\n" }}
{%- endif %}
{%- endfor %}
If none of the functions are relevant or the given question lacks the parameters required by the function, please output \"<function_call> {\"name\": \"no_function\", \"arguments\": {}}\".
{%- endif %}
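{#- Replay the conversation in Granite's SYSTEM/USER/ASSISTANT format; assistant tool calls are serialized as <function_call> JSON objects. -#}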



{% for message in loop_messages %}
{% if message['role'] == 'user' %}
{{- '\nUSER: ' + message['content'] }}
{% elif message['role'] == 'assistant' and message.tool_calls is defined %}
{{- '\nASSISTANT:' }}
{% for tc in message.tool_calls %}
{{- '<function_call> ' + {'name': tc.function.name, 'arguments': tc.function.arguments}|tojson }}
{% endfor %}
{{- '<|endoftext|>' }}
{% elif message['role'] == 'assistant' %}
{{- '\nASSISTANT: ' + message['content'] + ' <|endoftext|>' }}
{% elif message['role'] == 'tool' %}
{{- '<function_response> ' + message['content'] }}
{%- else %}
{{- raise_exception("Unexpected combination of role and message content") }}
{% endif %}
{% if loop.last and add_generation_prompt %}
{{- '\nASSISTANT: ' }}
{% endif %}
{% endfor %}
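To inspect what this template produces without starting a server, it can be rendered with plain Jinja. A minimal sketch, assuming the file path above and Jinja2 >= 3.1 (for the `|items` filter the template uses); `raise_exception` must be supplied because vanilla Jinja does not define it:

```python
# Sketch: render the Granite chat template standalone.
import jinja2


def raise_exception(message):
    # HF-style chat templates expect this helper to exist.
    raise jinja2.exceptions.TemplateError(message)


env = jinja2.Environment(loader=jinja2.FileSystemLoader("examples"))
env.globals["raise_exception"] = raise_exception

template = env.get_template("tool_chat_template_granite_20b_fc.jinja")
prompt = template.render(
    messages=[{"role": "user", "content": "What is the weather in Boston?"}],
    tools=[],  # pass OpenAI-style tool dicts here to see the function library
    add_generation_prompt=True,
)
print(prompt)  # SYSTEM: ... USER: ... ASSISTANT:
```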
16 changes: 16 additions & 0 deletions tests/tool_use/utils.py
@@ -73,6 +73,14 @@ def ensure_system_prompt(messages: List[Dict[str, Any]],
        "supports_parallel":
        False,
    },
    "granite": {
        "model":
        "ibm-granite/granite-20b-functioncalling",
        "arguments": [
            "--tool-call-parser", "granite", "--chat-template",
            str(VLLM_PATH / "examples/tool_chat_template_granite.jinja")
        ],
    },
"mistral": {
"model":
"mistralai/Mistral-7B-Instruct-v0.3",
@@ -87,6 +95,14 @@ def ensure_system_prompt(messages: List[Dict[str, Any]],
        "call the tool. Otherwise, answer the user's query directly "
        "without calling a tool. DO NOT CALL A TOOL THAT IS IRRELEVANT "
        "to the user's question - just respond to it normally."
    },
    "granite20b": {
        "model":
        "ibm-granite/granite-20b-functioncalling",
        "arguments": [
            "--tool-call-parser", "granite-20b-fc", "--chat-template",
            str(VLLM_PATH / "examples/tool_chat_template_granite_20b_fc.jinja")
        ],
    }
}

2 changes: 1 addition & 1 deletion vllm/entrypoints/openai/cli_args.py
@@ -193,7 +193,7 @@ def make_arg_parser(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
    parser.add_argument(
        "--tool-call-parser",
        type=str,
-        choices=["mistral", "hermes", "llama3_json"],
+        choices=["mistral", "hermes", "llama3_json", "granite-20b-fc"],
        default=None,
        help=
        "Select the tool call parser depending on the model that you're using."
5 changes: 4 additions & 1 deletion vllm/entrypoints/openai/serving_chat.py
@@ -29,7 +29,8 @@
                                                    OpenAIServing,
                                                    PromptAdapterPath,
                                                    TextTokensPrompt)
-from vllm.entrypoints.openai.tool_parsers import (Hermes2ProToolParser,
+from vllm.entrypoints.openai.tool_parsers import (Granite20bFCToolParser,
+                                                  Hermes2ProToolParser,
                                                   Llama3JsonToolParser,
                                                   MistralToolParser,
                                                   ToolParser)
@@ -88,6 +89,8 @@ def __init__(self,
            self.tool_parser = Hermes2ProToolParser
        elif tool_parser == "llama3_json":
            self.tool_parser = Llama3JsonToolParser
        elif tool_parser == "granite-20b-fc":
            self.tool_parser = Granite20bFCToolParser
        else:
            raise TypeError("Error: --enable-auto-tool-choice requires "
                            "--tool-call-parser")
3 changes: 2 additions & 1 deletion vllm/entrypoints/openai/tool_parsers/__init__.py
@@ -1,9 +1,10 @@
from .abstract_tool_parser import ToolParser
from .granite_20b_fc_tool_parser import Granite20bFCToolParser
from .hermes_tool_parser import Hermes2ProToolParser
from .llama_tool_parser import Llama3JsonToolParser
from .mistral_tool_parser import MistralToolParser

__all__ = [
"ToolParser", "Hermes2ProToolParser", "MistralToolParser",
"Llama3JsonToolParser"
"Granite20bFCToolParser", "Llama3JsonToolParser"
]