fix: GPT Streams with tool calls #214
Conversation
Oh, good one, looks like that happened with #140. So I think your basic idea is right. If you're not happy with this solution, you could try to loop through the iterator until ... haven't tried it though.
not sure if there is a way to move that to
Yeah, thanks for your input. I am currently digging into some Python code that is posted in the OpenAI discussion rounds. There was an example suggesting that the response can contain text like "Hey, I am going to search through some functions ... wait a moment" before the function calls start, but I do not get such output with the same prompts currently. Besides the documentation definitively looking wrong here, streaming with functions seems in general a bit more complicated, if the things people ping-pong in the community are right. But it seems not uncommon to have a combined streaming function that falls back to a function call when one appears in the response, instead of checking for them separately at the beginning. So integrating it into
force-pushed from b475dd3 to 6de8d9e (Compare)
Hey @chr-hertel ... could you have a look at the changes and see whether they fit the idea? Maybe you also have the possibility to execute the associated examples that are not utilizing the OpenAI API; those are running as expected for me. I have rewritten the new example to execute the function call not at the beginning of the stream but in the middle of it, as I found that to be a real use case. This is why I have added a tool-chain-specific stream response that catches the tool calls from the response converter: the initial stream ends with the appearance of a function call and is replaced with the function call response stream afterwards. So the default stream response is replaced with a toolbox-specific one to be able to catch a function call in the middle of the stream. I put it into the toolbox namespace because the chain processor that replaces it is also there. It looks pretty stable currently, and better than the previous solution that only looked at the initial chunk of the stream. What do you think? Would this be a way to move on?
Ok, one problem I have to think about is that the stream repeats itself after the function call ... sometimes a bit differently, which is why I did not detect it at first. It seems not to remember the information already given before the function call 🤔
oh wow, wasn't aware of that mixed response of first text and then tool call - pretty nice - increases the complexity quite brutally, but looks like you covered it 👍 the only thing from a functional point of view that I'm catching is that
Thanks! I think this is the same issue I have with the double answering of my example question about unicorns. I am still trying to solve the issue that the tool call response to the LLM answers the contained question again, even though it was already answered. Sounds like the same thing as trying to answer the weather question again and again, even though it is answered. So the model does not detect that it has already answered this before the tool call appeared. It could have something to do with the copy of the initial message bag into the tool call response; maybe it needs the already generated text as context for the answer. But if it generally also works with other GPT providers and the changed code is not too far away from your thoughts, I am already happy and will move on solving this 👍🏼
force-pushed from 7ab1766 to 3088c47 (Compare)
Hm! Ok, I have fixed the weather loop; this was my error when building the closure for the tool calls. Easy one.

Now, to fix the repeating messages after a tool call: the call to the tool is not aware of the already given answer, as mentioned before. After some research this turns out to be "normal", as the model tries to answer the user question with the tool call. With non-streamed answers this is not a problem, because the full answer is built at once after the tool call. In a streamed context, though, it is possible, as in my example, that the LLM has already produced content before it finds out that it has to call a tool. So we pay for the felt "performance" boost of answering early with the model not knowing in detail what comes next. This is what I have taken from some OpenAI community discussions. There I also found that it is recommended to send the assistant message, received before the tool call, along with the tool answer request. It works for now, but I think in the long run, with more complicated prompts, there could be a problem, and I did not want to implement a BC break now. Why a BC break? Each tool call only knows the streamed content since the last stream started, as the messages from the initial call are cloned in

I would suggest, in the long run, to have the tool calls in there, at least for streams. You already mentioned the raised complexity. Maybe in the long run, instead of a

WDYT, @chr-hertel? From my side ... I would say: done here, as a first step, with a lot of learnings about streams? And sorry for the tapestry of text 🙈
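For illustration, the recommendation from those community discussions amounts to a message sequence roughly like the following (hypothetical names and values; the point is that the assistant message keeps the text that was already streamed before the tool call, together with the tool call itself):

```json
[
  { "role": "user", "content": "What is the weather in Berlin?" },
  {
    "role": "assistant",
    "content": "Let me look that up for you ...",
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": { "name": "get_weather", "arguments": "{\"city\": \"Berlin\"}" }
      }
    ]
  },
  { "role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 21}" }
]
```

With the partial assistant content included, the model has the already-delivered text as context and is less likely to answer the same question a second time.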
yea, from time to time I struggle a bit with the approach I chose here, to bring all capabilities into one almighty chain - but on the other hand it's called
chr-hertel left a comment:
looks correct to me and functional tests are fine - I also feel the urge to decouple/refactor/split, but let's move that out of the scope of this PR
Thanks @DZunke!
Please do not merge! The changed code is just a wild guess ... read more below.
I am currently experimenting a bit with the streamed responses from OpenAI GPT. At first it worked like a charm, but then I wanted to utilize my set of functions: every request that should utilize those functions is not working anymore. I remembered there was PR #126 that should have fixed this already, but it does not seem to work.
So I built the example `examples/stream-tools-gpt-openai.php` and found out it is not only not working for me, but also not working for the library. Do you have any recollection of this topic, @chr-hertel? When I request a streamed response without a function call, the first chunk looks like this:
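For reference, a typical first chunk of a plain text stream has roughly this shape (abbreviated and illustrative; the exact fields vary by model and API version):

```json
{
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": { "role": "assistant", "content": "" },
      "finish_reason": null
    }
  ]
}
```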
When I do a request with a streamed response that should include a function call, the OpenAI API delivers the following first chunk:
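In that case the first chunk carries no `tool_calls` key at all; the `delta` only holds a `null` content, and the tool-call fragments arrive in later chunks. Roughly (abbreviated, illustrative shape only):

```json
{
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": { "role": "assistant", "content": null },
      "finish_reason": null
    }
  ]
}
```

followed by later chunks such as:

```json
{
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": {
        "tool_calls": [
          {
            "index": 0,
            "id": "call_abc123",
            "type": "function",
            "function": { "name": "get_weather", "arguments": "" }
          }
        ]
      },
      "finish_reason": null
    }
  ]
}
```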
Within the file `src/Bridge/OpenAI/GPT/ResponseConverter.php`, the check for the function calls currently depends on having a `tool_calls` key within the first chunk's `delta`, but it is not there. I had a wild guess and found that it works fine with a check for the `null` content node in the first chunk, as the following chunks do contain all the information about tool calling. But this feels pretty unstable and wrong.

Google is currently not really helpful, as a lot of people seem to have problems with the streaming of function calls at OpenAI. So, while searching for more information, I thought I'd throw it into the room: are there people out here having the same struggle, or who have already found solutions?