
[Model] tool calling support for ibm-granite/granite-20b-functioncalling #8339

Merged
merged 35 commits into vllm-project:main from wseaton:granite-fc
Oct 29, 2024

Conversation

wseaton
Contributor

@wseaton wseaton commented Sep 10, 2024

Add tool calling parser support for ibm-granite/granite-20b-functioncalling.

Also adds an example chat template based on the Granite function-calling paper.
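
For reference, here is a sketch of how the new parser can be exercised once merged: start the server with auto tool choice enabled and send a tools request through any OpenAI-compatible client. This is illustrative rather than part of the PR; the chat template path, port, and the get_weather tool are assumptions.

# Launch (flag names per vLLM at the time of this PR; the template path is an assumption):
#   vllm serve ibm-granite/granite-20b-functioncalling \
#     --enable-auto-tool-choice \
#     --tool-call-parser granite-20b-fc \
#     --chat-template examples/tool_chat_template_granite_20b_fc.jinja

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical tool definition, used only for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="ibm-granite/granite-20b-functioncalling",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)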




{% set sys_prompt = 'You are a helpful assistant with access to the following function calls. Your task is to produce a sequence of function calls necessary to generate response to the user utterance. Use the following function calls as required.' %}
Contributor

According to the paper, the system prompt used for the end-to-end scenario where the model has to explain the tool output is

You are a helpful assistant with access to the following function calls. Your task is to understand the given conversation with function calls and responses and generate natural language response as the ASSISTANT to continue the conversation. You may use the following function calls to understand how to respond to the user query.

In my experiments with the model it worked well, so perhaps it would be a nice default in this example file.

Contributor Author

@wseaton wseaton commented Sep 10, 2024

Good callout! In my evals, I have mostly been doing zero-shot, single-turn calls.

Let me update the example template with the more general conversational one since that is probably what most people want.

@njhill
Member

njhill commented Sep 23, 2024

@wseaton one of the tests is failing, and it looks most likely related: https://buildkite.com/vllm/fastcheck/builds/3985#0191dd9c-724a-4fbe-99d3-a1e1c94a2106

@wseaton
Contributor Author

wseaton commented Sep 27, 2024

@maxdebayser are you still interested in contributing the streaming JSON parser for granite support? I have rebased off of main and might have some bandwidth to work on it; just let me know :)

@maxdebayser
Contributor

Hi @wseaton, yes, I've been working on this. There is only one unit test that isn't passing, but I know what the cause is.

@maxdebayser
Contributor

@wseaton, now all tests are passing. I've pushed a squashed merge of everything to this draft PR so that you can take a look and perhaps merge with yours: #8915
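
Since the streaming parser is the main addition here, a minimal sketch of the idea (not the PR's actual implementation, which streams incremental argument deltas): buffer the streamed text, look for the function-call marker token, and emit each JSON object once it parses completely. The <function_call> marker is my assumption about the model's output format.

import json

BOT_TOKEN = "<function_call>"  # assumed marker preceding each call

class StreamingFCExtractor:
    """Buffers streamed model output and yields complete tool calls."""

    def __init__(self):
        self.buffer = ""
        self.decoder = json.JSONDecoder()

    def feed(self, chunk: str) -> list:
        self.buffer += chunk
        calls = []
        while True:
            start = self.buffer.find(BOT_TOKEN)
            if start == -1:
                break
            rest = self.buffer[start + len(BOT_TOKEN):].lstrip()
            try:
                obj, consumed = self.decoder.raw_decode(rest)
            except json.JSONDecodeError:
                break  # JSON still incomplete; wait for more chunks
            calls.append(obj)
            self.buffer = rest[consumed:]
        return calls

extractor = StreamingFCExtractor()
chunks = ['<function_call> {"name": "get_wea',
          'ther", "arguments": {"city": "Boston"}}']
for chunk in chunks:
    for call in extractor.feed(chunk):
        print(call["name"], call["arguments"])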

maxdebayser and others added 5 commits October 2, 2024 12:14
This commit builds on previous work by Will Eaton
and adds support for streaming. It also adds the
model to the tool use unit tests.

In this commit the tool parser is renamed from
simply granite to granite-20b-fc to differentiate
from other granite models.

Another minor change is that in the chat template,
the function description based on the function
signature is now optional.

Co-authored-by: Will Eaton <me@wseaton.com>

Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
@wseaton
Contributor Author

wseaton commented Oct 2, 2024

@maxdebayser I've cherry-picked the commits from your PR, please review!

Contributor

@maxdebayser maxdebayser left a comment

Thanks a lot @wseaton. There is an extra entry in the unit test model dict, but other than that it looks good to me.

cc @njhill

tests/tool_use/utils.py (review thread; outdated, resolved)
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
@wseaton
Contributor Author

wseaton commented Oct 7, 2024

@maxdebayser working on fixing merge conflicts with upstream now

@wseaton
Contributor Author

wseaton commented Oct 25, 2024

@njhill Happy to rebuild my fork of vLLM and re-verify the PR manually on an A100; I've disabled the test for now.

@maxdebayser
Contributor

@wseaton I verified it a couple of hours ago; the tool_use tests run successfully on an A100. I have a quantized version of the model that passes the tests, and I'm testing whether it can fit in the CI environment.

@njhill
Member

njhill commented Oct 25, 2024

> The test could also be run on a larger GPU, although that's probably a bit wasteful

@maxdebayser yes that's what I thought too, since none of the other model sub-tests require it

@njhill
Member

njhill commented Oct 25, 2024

Thanks @wseaton! Could you try merging in the latest main branch again? That should hopefully help with the failing CI tests.

@K-Mistele
Contributor

@wseaton just a minor point of clarification: this PR is for Granite 20B functioncalling. Is this the same function calling format as the Granite 3.0 models? I saw that these were released recently.

I'm not as familiar with IBM/Granite as with some other families of models. If they are compatible, then that's great, and we should indicate it in the docs, as well as most likely name the tool parser granite. If not (and this is what I suspect is the case, based on a brief reading of their docs & GitHub code for agentic stuff), then it might be worth adding a call-out in the docs that this is NOT compatible with Granite 3.0.

@maxdebayser
Contributor

@K-Mistele, they are not compatible. We have another PR for those models, but we're waiting for this one to be merged first because there are some code dependencies.
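
To make the incompatibility concrete, here is my understanding of the two output conventions; both strings are assumptions based on the respective model docs, not taken from this PR:

# granite-20b-functioncalling: a marker token followed by one JSON object per call (assumed)
fc_20b_output = '<function_call> {"name": "get_weather", "arguments": {"city": "Boston"}}'

# Granite 3.0: a different token followed by a JSON list of calls (assumed)
granite_3_output = '<|tool_call|>[{"name": "get_weather", "arguments": {"city": "Boston"}}]'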

@maxdebayser
Contributor

maxdebayser commented Oct 28, 2024

@njhill, @wseaton my tests with a quantized version of the model are now passing. With quantization, the model size was reduced to ~20GB (mbayser/granite-20b-functioncalling-FP8-KV), but I still had to add a bunch of flags to reduce memory usage:
"--max_num_seqs", "1", "--max-model-len", "1024", "--enforce-eager", "--cpu-offload-gb", "20"

@mergify mergify bot added the documentation and frontend labels Oct 28, 2024
@wseaton
Contributor Author

wseaton commented Oct 29, 2024

Because of the nature of this branch and how long it's been running (with many merge commits from main), I don't feel comfortable doing a rebase to fix the DCO issue. Can it be bypassed to get this merged?

@K-Mistele
Contributor

> @K-Mistele, they are not compatible. We have another PR for those models, but we're waiting for this one to be merged first because there are some code dependencies.

sounds good! Let me know if you'd like for me to take a look at it whenever it's ready :)

Member

@njhill njhill left a comment

Thanks @wseaton @maxdebayser for all of the work on this and thanks @K-Mistele for also reviewing.

@njhill
Member

njhill commented Oct 29, 2024

@wseaton I'll merge this to unblock the other granite PR; we could consider re-enabling the test with @maxdebayser's quantized model as a follow-on update...

@njhill njhill merged commit 882a1ad into vllm-project:main Oct 29, 2024
56 checks passed
@wseaton wseaton deleted the granite-fc branch October 29, 2024 23:18
Several downstream forks (rasmith, NickLucche, lk-chen, ROCm, JC1DA, sumitd2, KuntaiDu, mfournioux, neuralmagic, sleepwalker2017) subsequently cherry-picked this commit between Oct 30 and Dec 13, 2024.

@russellb russellb mentioned this pull request Nov 7, 2024
Labels
documentation, frontend, ready

4 participants